Markov Models of Dual-Redundant Systems

Many practical systems require numerous subfunctions to be performed in order to support the overall function, and they are designed to be dual-redundant on each subfunction. Failure of the first component of any particular subfunction is annunciated, and repaired at a suitable rate to preclude complete failure. In addition, there is a possibility of a single failure from the "full-up" condition leading to overall functional failure. This type of system with n subfunctions is often represented by a Markov model as shown below.



This model neglects the possibility that multiple subfunctions may be degraded at any given time. In other words, if one component of a certain subfunction fails, it is assumed that the next transition is either a repair back to the full-up state or else a failure of the other component of that same subfunction, resulting in total system failure. Strictly speaking, it’s possible that a component of some other subfunction might fail before either of those two transitions occurs, and this would place the system in a state with two of the subfunctions partially failed. However, this would affect the overall system failure rate only if both components of that second subfunction failed within the repair interval of the first subfunction, and both of those failures would need to occur prior to the failure of the second component of the first subfunction. Thus (for example) the transition rate l_{1,n+1} from state 1 to state n+1 should actually be augmented by the rate of failing both components of subfunction 2, which at the end of the repair interval T would be just (l_{2,n+1})^{2}T. Similar terms would be needed to account for the other subfunctions, but in many practical situations the repair intervals are small enough so that these second-order terms are negligible, especially since T is generally chosen so that the probability of even one fault occurring during that interval is quite low, and hence the probability of two is extremely low. This is why the simplified model shown above is often a useful representation of dual-redundant system reliability.
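To get a feel for the size of this second-order term, the following sketch compares the augmentation (l_{2,n+1})^{2}T with the rate it would augment. The numerical values (rates of 1e-4 per hour and a 10-hour repair interval) and the variable names are purely illustrative assumptions, not values taken from any particular system.

```python
# Illustrative values only (assumptions, not taken from the text).
rate_second = 1.0e-4   # l_{2,n+1}: per-hour failure rate for subfunction 2
rate_first = 1.0e-4    # l_{1,n+1}: per-hour rate from state 1 to state n+1
T = 10.0               # repair interval, in hours

# First-order probability of a single fault during one repair interval.
p_one_fault = rate_second * T

# Second-order augmentation to l_{1,n+1}, as described in the text.
augmentation = rate_second**2 * T

# Size of the correction relative to the rate it would augment.
relative_increase = augmentation / rate_first

print(f"P(one fault in T)  ~ {p_one_fault:.3e}")
print(f"augmentation       ~ {augmentation:.3e} per hour")
print(f"relative increase  ~ {relative_increase:.3e}")
```

When the two rates are comparable, the relative correction is of the same order as the probability of a single fault within the repair interval, which is why it can be dropped whenever that probability is small.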

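If we designate the “full-up” state with the number 0, the n degraded states with the numbers 1 through n, and the final failure state with the number n+1, then the equations of the system can be written as

    dP_{0}/dt = -P_{0} \sum_{j=1}^{n+1} l_{0,j} + \sum_{j=1}^{n+1} P_{j} l_{j,0}

    dP_{j}/dt = P_{0} l_{0,j} - P_{j} (l_{j,0} + l_{j,n+1}),    j = 1, 2, ..., n

    dP_{n+1}/dt = \sum_{j=0}^{n} P_{j} l_{j,n+1} - P_{n+1} l_{n+1,0}
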
For the steady-state condition we have dP_{j}/dt = 0 for all j, so we can solve the central equations to give the values of P_{1} through P_{n} as a function of P_{0}

    P_{j} = P_{0} l_{0,j} / (l_{j,0} + l_{j,n+1}),    j = 1, 2, ..., n

where l_{i,j} signifies the rate of transition from state i to state j. Since the sum of all the state probabilities from P_{0} to P_{n+1} is 1, we have

    P_{0} [ 1 + \sum_{j=1}^{n} l_{0,j} / (l_{j,0} + l_{j,n+1}) ] + P_{n+1} = 1

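Also, the steady-state flow rate into state n+1 is

    \sum_{j=0}^{n} P_{j} l_{j,n+1} = P_{0} [ l_{0,n+1} + \sum_{j=1}^{n} l_{0,j} l_{j,n+1} / (l_{j,0} + l_{j,n+1}) ]
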
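The exact failure rate for entering state n+1 is therefore

    \frac{ \sum_{j=0}^{n} P_{j} l_{j,n+1} }{ 1 - P_{n+1} } = \frac{ l_{0,n+1} + \sum_{j=1}^{n} l_{0,j} l_{j,n+1} / (l_{j,0} + l_{j,n+1}) }{ 1 + \sum_{j=1}^{n} l_{0,j} / (l_{j,0} + l_{j,n+1}) }
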
Naturally this rate is independent of l_{n+1,0}, because the rate is, by definition, a measure of the propensity to enter a particular state for entities that are not presently in that state, which is clearly independent of the rate of leaving that state. Interestingly, if we define l_{j,j} as infinite for each j, none of the state equations are affected, because the infinite "self-transition" flow P_{j}l_{j,j} is both added to and subtracted from the jth equation. This convention enables us to write the equation for the failure rate in the more unified form

    failure rate = \frac{ \sum_{j=0}^{n} l_{0,j} l_{j,n+1} / (l_{j,0} + l_{j,n+1}) }{ \sum_{j=0}^{n} l_{0,j} / (l_{j,0} + l_{j,n+1}) }

In the limit of infinite l_{0,0}, the j = 0 term of the numerator tends to l_{0,n+1} and the j = 0 term of the denominator tends to 1.

This form emphasizes the fact that the overall rate for entering state n+1 is simply the weighted average of the individual transition rates l_{j,n+1} from each of the states 0, 1, 2, ..., n, with each rate weighted in proportion to the steady-state probability P_{j} of the respective state.
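The weighted-average interpretation, and the independence of the rate from l_{n+1,0}, can be checked numerically. The following is a minimal sketch under stated assumptions: the function names, argument names, and the numerical rates are hypothetical, and the steady-state probabilities are obtained from the balance conditions of the (n+2)-state model described above.

```python
def steady_state(l0f, first_fail, repair, second_fail, restore):
    """Return [P_0, P_1, ..., P_n, P_{n+1}] for the (n+2)-state model.

    l0f         : l_{0,n+1}, direct rate from full-up to failed
    first_fail  : l_{0,j} for j = 1..n
    repair      : l_{j,0} for j = 1..n
    second_fail : l_{j,n+1} for j = 1..n
    restore     : l_{n+1,0}, rate of leaving the failed state
    """
    # Balance of each degraded state: P_0 l_{0,j} = P_j (l_{j,0} + l_{j,n+1}).
    ratios = [a / (m + b) for a, m, b in zip(first_fail, repair, second_fail)]
    # Flow into state n+1 per unit of P_0.
    flow = l0f + sum(r * b for r, b in zip(ratios, second_fail))
    # Balance of the failed state: P_{n+1} l_{n+1,0} = P_0 * flow.
    p0 = 1.0 / (1.0 + sum(ratios) + flow / restore)
    return [p0] + [p0 * r for r in ratios] + [p0 * flow / restore]


def failure_rate(l0f, second_fail, probs):
    """Average of the rates l_{j,n+1}, weighted by P_j over states 0..n."""
    rates_into_failed = [l0f] + list(second_fail)
    operating = probs[:-1]  # exclude the failed state n+1
    weighted = sum(p * r for p, r in zip(operating, rates_into_failed))
    return weighted / sum(operating)


# Hypothetical per-hour rates for a system with n = 2 subfunctions.
first_fail = [2.0e-4, 1.0e-4]
repair = [0.1, 0.02]
second_fail = [1.0e-4, 5.0e-5]
l0f = 1.0e-6

probs_slow = steady_state(l0f, first_fail, repair, second_fail, restore=0.01)
probs_fast = steady_state(l0f, first_fail, repair, second_fail, restore=1.0)
rate_slow = failure_rate(l0f, second_fail, probs_slow)
rate_fast = failure_rate(l0f, second_fail, probs_fast)
print(rate_slow, rate_fast)
```

Changing `restore` changes how much probability sits in the failed state, but the two computed rates agree, since the weights are normalized over the operating states only.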

Since the time spent in the (n+1)th state is irrelevant to our result (because the reliability of the operational fleet does not depend on the length of time that inoperative systems are absent from the fleet while being repaired), we could simplify the analysis by deleting that state from the model, and point the total failure transitions directly to the full-up state. This is illustrated for a simple system with just two partial failure states in the figure below.
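As a rough check of this simplification, the sketch below integrates the reduced model (failed state deleted, total-failure transitions redirected to state 0) forward in time with simple Euler steps until it settles, and compares the resulting failure rate with the closed-form ratio for the full model. The function names, step size, and numerical rates are assumptions chosen for illustration.

```python
def reduced_model_rate(l0f, first_fail, repair, second_fail, dt=1.0, steps=5000):
    """Integrate the reduced chain to steady state; return (rate, probabilities)."""
    n = len(first_fail)
    # In the reduced model, state j returns to state 0 either by repair or
    # by total failure, so its total exit rate is l_{j,0} + l_{j,n+1}.
    exit_rate = [m + b for m, b in zip(repair, second_fail)]
    p = [1.0] + [0.0] * n  # start in the full-up state
    for _ in range(steps):
        to_degraded = [p[0] * a for a in first_fail]
        to_full_up = [p[j + 1] * exit_rate[j] for j in range(n)]
        # The direct failure from state 0 now returns to state 0, so it is
        # a self-loop and leaves P_0 unchanged.
        p[0] += dt * (sum(to_full_up) - sum(to_degraded))
        for j in range(n):
            p[j + 1] += dt * (to_degraded[j] - to_full_up[j])
    # Total failure rate: each failure transition weighted by its state's probability.
    rate = p[0] * l0f + sum(p[j + 1] * second_fail[j] for j in range(n))
    return rate, p


def closed_form_rate(l0f, first_fail, repair, second_fail):
    """Ratio formula for the full model, with P_j / P_0 = l_{0,j} / (l_{j,0} + l_{j,n+1})."""
    ratios = [a / (m + b) for a, m, b in zip(first_fail, repair, second_fail)]
    numerator = l0f + sum(r * b for r, b in zip(ratios, second_fail))
    return numerator / (1.0 + sum(ratios))


# Hypothetical per-hour rates for two partial-failure states.
first_fail = [2.0e-4, 1.0e-4]
repair = [0.1, 0.02]
second_fail = [1.0e-4, 5.0e-5]
l0f = 1.0e-6

rate_reduced, probs = reduced_model_rate(l0f, first_fail, repair, second_fail)
rate_full = closed_form_rate(l0f, first_fail, repair, second_fail)
print(rate_reduced, rate_full)
```

Because the probabilities of the reduced model already sum to 1 over the operating states, no division by 1 - P_{n+1} is needed there.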

From the equations 



we have the steady-state relations



Substituting into the conservation equation P_{0} + P_{1} + P_{2} = 1 allows us to easily solve for the steady-state probabilities



The total failure rate can then be computed as 



This immediately generalizes to give the formula for N partial-failure states as shown previously:



Suppose there are only two distinct repair rates, denoted by m_{a} = 1/T_{a} and m_{b} = 1/T_{b}, and we wish to express the overall rate as an explicit function of the repair times T_{a} and T_{b}. Let the indices 1 through n signify the states with repair rate m_{a}, and the indices n+1 to N signify the states with repair rate m_{b}. We can then rewrite the above equation in the form 



Assuming the failure rates are smaller than the repair rates, we can expand the fractions in powers of s/m, and to the first order we get



Collecting terms in T_{a} and T_{b}, we arrive at 



where the A_{i} and B_{i} coefficients are given by the summations 




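As a numerical illustration of a first-order expansion in the repair times, the sketch below compares the exact ratio formula with a linearization in T_{a} and T_{b}. The linearization used here is derived directly from the ratio formula by keeping only terms of first order in the repair times; it is an assumption of this sketch and may be grouped differently from the A_{i} and B_{i} coefficients above. All names and numerical values are hypothetical.

```python
def exact_rate(l0f, first_fail, second_fail, repair_time):
    """Exact ratio formula, with each repair rate written as 1/T."""
    numerator = l0f
    denominator = 1.0
    for a, b, T in zip(first_fail, second_fail, repair_time):
        occupancy = a * T / (1.0 + b * T)  # P_j / P_0 = a / (1/T + b)
        numerator += occupancy * b
        denominator += occupancy
    return numerator / denominator


def first_order_rate(l0f, first_fail, second_fail, repair_time):
    """Expansion of the exact ratio, keeping terms up to first order in T."""
    return l0f + sum(a * (b - l0f) * T
                     for a, b, T in zip(first_fail, second_fail, repair_time))


# Hypothetical per-hour rates; states 1 and 2 use T_a, state 3 uses T_b.
first_fail = [2.0e-4, 1.0e-4, 3.0e-4]
second_fail = [1.0e-4, 5.0e-5, 2.0e-4]
l0f = 1.0e-6
T_a, T_b = 10.0, 50.0

times = [T_a, T_a, T_b]
exact = exact_rate(l0f, first_fail, second_fail, times)
approx = first_order_rate(l0f, first_fail, second_fail, times)
rel_err = abs(approx - exact) / exact

# Shorter repair times make the failure-to-repair rate ratios smaller,
# so the first-order expansion should track the exact value more closely.
short_times = [T / 10.0 for T in times]
exact_short = exact_rate(l0f, first_fail, second_fail, short_times)
approx_short = first_order_rate(l0f, first_fail, second_fail, short_times)
rel_err_short = abs(approx_short - exact_short) / exact_short

print(rel_err, rel_err_short)
```

The gap between the two values comes from the neglected higher-order terms, so it shrinks as the products of failure rate and repair time become smaller.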