Markov Models Of Dual-Redundant Systems

 

Many practical systems require numerous sub-functions to be performed in order to support the overall function, and they are designed to be dual-redundant on each sub-function. Failure of the first component of any particular sub-function is annunciated, and repaired at a suitable rate to preclude complete failure. In addition, there is a possibility of a single failure from the "full-up" condition leading to overall functional failure. This type of system with n sub-functions is often represented by a Markov model as shown below.

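[Figure not reproduced: a full-up state linked to each of the n degraded states by a failure transition and a repair transition, with failure transitions from the full-up state and from every degraded state into a single total-failure state, which returns to the full-up state.]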

This model neglects the possibility that multiple sub-functions may be degraded at any given time. In other words, if one component of a certain sub-function fails, it is assumed that the next transition is either a repair back to the full-up state or else a failure of the other component of that same sub-function, resulting in total system failure. Strictly speaking, it’s possible that a component of some other sub-function might fail before either of those two transitions occurs, and this would place the system in a state with two of the sub-functions partially failed. However, this would affect the overall system failure rate only if both components of that second sub-function failed within the repair interval of the first sub-function, and both of those failures would need to occur prior to the failure of the second component of the first sub-function. Thus (for example) the transition rate from state 1 to state n+1 should actually be augmented by the rate of failing both components of sub-function 2, so that it consists not just of $\lambda_{1,n+1}$ but also of a term of order $(\lambda_{2,n+1})^2 T$, where $T$ is the repair interval. Similar terms would be needed to account for the other sub-functions. However, in many practical situations the repair intervals are small enough that these second-order terms are negligible. This is why the simplified model shown above is often a useful representation of dual-redundant system reliability.
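
To get a feel for how small the neglected term is, the following sketch compares a first-order rate with a second-order term of the form (rate)^2 times the repair interval. The numerical values and variable names are hypothetical, chosen only for illustration; they do not come from the text.

```python
# Hypothetical per-hour rates and repair interval (assumptions for illustration).
lam_1_fail = 1.0e-4   # first-order rate from state 1 to state n+1
lam_2_fail = 1.0e-4   # failure rate associated with sub-function 2
T = 100.0             # repair interval in hours

# Second-order contribution: both components of sub-function 2 failing
# within the repair interval of sub-function 1.
second_order = lam_2_fail ** 2 * T

# Size of the neglected term relative to the first-order rate.
relative = second_order / lam_1_fail

print(f"second-order term:           {second_order:.3e} per hour")
print(f"relative to first-order rate: {relative:.3e}")
```

With these assumed values the neglected term is about one percent of the first-order rate, and it shrinks in proportion to the repair interval.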

 

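If we designate the “full up” state with the number 0, the n degraded states with the numbers 1 through n, and the final failure state with the number n+1, then the equations of the system can be written as

$$
\begin{aligned}
\frac{dP_0}{dt} &= -\Big(\sum_{j=1}^{n+1}\lambda_{0,j}\Big)P_0 + \sum_{j=1}^{n+1}\lambda_{j,0}P_j \\
\frac{dP_j}{dt} &= \lambda_{0,j}P_0 - \big(\lambda_{j,0}+\lambda_{j,n+1}\big)P_j, \qquad j = 1,\dots,n \\
\frac{dP_{n+1}}{dt} &= \sum_{j=0}^{n}\lambda_{j,n+1}P_j - \lambda_{n+1,0}P_{n+1}
\end{aligned}
$$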

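As a rough numerical check on a model of this kind, the state equations can be integrated forward in time until the probabilities settle. The sketch below uses a plain forward-Euler step; the function name, argument names, step size, and rate values are all hypothetical, and the transition structure assumed is the one described above (first failure 0 to j, repair j to 0, second failure j to n+1, direct failure 0 to n+1, and return n+1 to 0).

```python
def simulate(lam_first, lam_repair, lam_second, lam_direct, lam_restore,
             dt=0.1, steps=20000):
    """Forward-Euler integration of the state probabilities.

    lam_first[j-1]  : rate from state 0 to degraded state j
    lam_repair[j-1] : rate from degraded state j back to state 0
    lam_second[j-1] : rate from degraded state j to failed state n+1
    lam_direct      : rate from state 0 directly to state n+1
    lam_restore     : rate from state n+1 back to state 0
    """
    n = len(lam_first)
    p = [1.0] + [0.0] * (n + 1)            # start in the full-up state
    for _ in range(steps):
        p0, pf = p[0], p[n + 1]
        d = [0.0] * (n + 2)
        d[0] = -(sum(lam_first) + lam_direct) * p0 + lam_restore * pf
        d[n + 1] = lam_direct * p0 - lam_restore * pf
        for j in range(1, n + 1):
            d[j] = lam_first[j - 1] * p0 \
                   - (lam_repair[j - 1] + lam_second[j - 1]) * p[j]
            d[0] += lam_repair[j - 1] * p[j]       # repairs feed state 0
            d[n + 1] += lam_second[j - 1] * p[j]   # second failures feed n+1
        p = [x + dt * dx for x, dx in zip(p, d)]
    return p


# Hypothetical rates (per hour) for a system with two sub-functions.
p = simulate(lam_first=[2e-3, 4e-3], lam_repair=[0.1, 0.05],
             lam_second=[1e-3, 2e-3], lam_direct=1e-5, lam_restore=0.02)
print("state probabilities:", p)
print("P1/P0 =", p[1] / p[0])
```

After enough steps the ratio of each degraded-state probability to the full-up probability should stop changing, which is the steady-state condition used in what follows.
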
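For the steady-state condition we have $dP_j/dt = 0$ for all j, so we can solve the central equations to give the values of $P_1$ through $P_n$ as a function of $P_0$

$$
P_j = \frac{\lambda_{0,j}}{\lambda_{j,0}+\lambda_{j,n+1}}\,P_0, \qquad j = 1,\dots,n
$$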

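where $\lambda_{i,j}$ signifies the rate of transition from state i to state j. Since the sum of all the state probabilities from $P_0$ to $P_{n+1}$ is 1, we have

$$
P_0 + \sum_{j=1}^{n}P_j + P_{n+1} \;=\; P_0\Big[1 + \sum_{j=1}^{n}\frac{\lambda_{0,j}}{\lambda_{j,0}+\lambda_{j,n+1}}\Big] + P_{n+1} \;=\; 1
$$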

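Also, the steady-state flow rate into State n+1 is

$$
\sum_{j=0}^{n}P_j\,\lambda_{j,n+1} \;=\; P_0\Big[\lambda_{0,n+1} + \sum_{j=1}^{n}\frac{\lambda_{0,j}\,\lambda_{j,n+1}}{\lambda_{j,0}+\lambda_{j,n+1}}\Big]
$$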

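The exact failure rate for entering state n+1 is the flow rate into that state divided by the probability $1 - P_{n+1}$ of not already being in it, and is therefore

$$
\lambda \;=\; \frac{\sum_{j=0}^{n}P_j\,\lambda_{j,n+1}}{1-P_{n+1}} \;=\; \frac{\lambda_{0,n+1} + \sum_{j=1}^{n}\dfrac{\lambda_{0,j}\,\lambda_{j,n+1}}{\lambda_{j,0}+\lambda_{j,n+1}}}{1 + \sum_{j=1}^{n}\dfrac{\lambda_{0,j}}{\lambda_{j,0}+\lambda_{j,n+1}}}
$$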

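Naturally this rate is independent of $\lambda_{n+1,0}$, because the rate is, by definition, a measure of the propensity to enter a particular state for entities that are not presently in that state, which is clearly independent of the rate of leaving that state. Interestingly, if we define $\lambda_{j,j}$ as infinite for each j, none of the state equations are affected, because the infinite "self-transition" flow $P_j\lambda_{j,j}$ is both added to and subtracted from the jth equation, but this enables us to write the equation for the failure rate in the more unified form

$$
\lambda \;=\; \frac{\sum_{j=0}^{n}\dfrac{\lambda_{0,j}\,\lambda_{j,n+1}}{\lambda_{j,0}+\lambda_{j,n+1}}}{\sum_{j=0}^{n}\dfrac{\lambda_{0,j}}{\lambda_{j,0}+\lambda_{j,n+1}}}
$$

in which the j = 0 terms are evaluated in the limit $\lambda_{0,0}\to\infty$, giving $\lambda_{0,n+1}$ in the numerator and 1 in the denominator.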

This form emphasizes the fact that the overall rate for entering state n+1 is simply the weighted average of the individual transition rates $\lambda_{j,n+1}$ from each of the states 0, 1, 2, ..., n, with each rate weighted in proportion to the steady-state probability $P_j$ of the respective state.
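
The weighted-average reading, and the independence from the rate of leaving the failed state, can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names, argument names, and rate values are hypothetical, and the steady-state probabilities are obtained by taking each degraded-state probability as the full-up probability times (rate in) / (total rate out), then normalizing.

```python
def failure_rate(lam_first, lam_repair, lam_second, lam_direct):
    """Closed-form rate: weights w_j = P_j / P_0 for the degraded states."""
    w = [f / (r + s) for f, r, s in zip(lam_first, lam_repair, lam_second)]
    numer = lam_direct + sum(wj * s for wj, s in zip(w, lam_second))
    denom = 1.0 + sum(w)
    return numer / denom


def steady_state(lam_first, lam_repair, lam_second, lam_direct, lam_restore):
    """Steady-state probabilities [P_0, P_1, ..., P_n, P_{n+1}]."""
    w = [f / (r + s) for f, r, s in zip(lam_first, lam_repair, lam_second)]
    flow_per_p0 = lam_direct + sum(wj * s for wj, s in zip(w, lam_second))
    # Balance at the failed state: inflow = lam_restore * P_{n+1}.
    p0 = 1.0 / (1.0 + sum(w) + flow_per_p0 / lam_restore)
    return [p0] + [wj * p0 for wj in w] + [flow_per_p0 * p0 / lam_restore]


def weighted_average_rate(p, lam_second, lam_direct):
    """Average of the rates into n+1, weighted by P_0..P_n."""
    rates = [lam_direct] + list(lam_second)
    operating = p[:-1]                      # exclude the failed state
    return sum(pj * r for pj, r in zip(operating, rates)) / sum(operating)


# Hypothetical rates (per hour) for two sub-functions.
first, repair, second, direct = [2e-3, 4e-3], [0.1, 0.05], [1e-3, 2e-3], 1e-5

closed = failure_rate(first, repair, second, direct)
for restore in (0.02, 2.0):                 # two different return rates
    p = steady_state(first, repair, second, direct, restore)
    print(restore, weighted_average_rate(p, second, direct), closed)
```

Both values of the return rate should give the same weighted average, equal to the closed-form rate up to rounding.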

 

Since the time spent in the (n+1)th state is irrelevant to our result (because the reliability of the operational fleet does not depend on the length of time that inoperative systems are absent from the fleet while being repaired), we could simplify the analysis by deleting that state from the model, and point the total failure transitions directly to the full-up state. This is illustrated for a simple system with just two partial failure states in the figure below.

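[Figure not reproduced: simplified model with the full-up state and two partial-failure states, in which the total-failure transitions point directly back to the full-up state.]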

From the equations

 

 

we have the steady-state relations

 

 

Substituting into the conservation equation $P_0 + P_1 + P_2 = 1$ allows us to easily solve for the steady-state probabilities

 

 

The total failure rate can then be computed as

 

 

This immediately generalizes to give the formula for N partial-failure states as shown previously:

 

 

Suppose there are only two distinct repair rates, denoted by $\mu_a = 1/T_a$ and $\mu_b = 1/T_b$, and we wish to express the overall rate as an explicit function of the repair times $T_a$ and $T_b$. Let the indices 1 through n signify the states with repair rate $\mu_a$, and the indices n+1 to N signify the states with repair rate $\mu_b$. We can then re-write the above equation in the form

 

 

Assuming the failure rates are smaller than the repair rates, we can expand the fractions in powers of $\sigma/\mu$, and to the first order we get

 

 

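The quality of such an expansion can be gauged numerically. The sketch below assumes, as a labeled guess at the notation, that each fraction has the form λσ/(μ + σ), where λ is a first-failure rate, σ a second-failure rate, and μ = 1/T a repair rate; it then expands 1/(1 + σ/μ) to first order in σ/μ. The function names and numerical values are hypothetical.

```python
def exact_fraction(lam, sig, T):
    """lam * sig / (mu + sig), with repair rate mu = 1/T."""
    mu = 1.0 / T
    return lam * sig / (mu + sig)


def first_order_fraction(lam, sig, T):
    """Same fraction with 1/(1 + sig*T) replaced by (1 - sig*T)."""
    return lam * sig * T * (1.0 - sig * T)


# Hypothetical per-hour rates; repair times in hours.
lam, sig = 2.0e-4, 1.0e-4
for T in (10.0, 50.0, 200.0):
    exact = exact_fraction(lam, sig, T)
    approx = first_order_fraction(lam, sig, T)
    print(f"T={T:6.1f}  sig*T={sig * T:.1e}  "
          f"exact={exact:.6e}  approx={approx:.6e}  "
          f"rel.err={(exact - approx) / exact:.2e}")
```

Under these assumptions the relative error of the truncated form is (σT)², so it stays small as long as the repair time is short compared with the mean time to the second failure.
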
Collecting terms in $T_a$ and $T_b$, we arrive at

 

 

where the $A_i$ and $B_i$ coefficients are given by the summations

 

 
