Repairable Systems

When performing various reliability tasks, non-repairable systems or products are treated differently from repairable systems or products.  Some of the tools that are used for one type are not applicable to the other.   Obviously, at some level, repairable systems are composed of non-repairable parts.   Examples of non-repairable systems would be “one-shot” devices like light bulbs or more complex devices like pacemakers.  Examples of repairable systems are computers, automobiles, and airplanes.

What is unique about repairable systems?  Availability becomes a key measure.  In simple terms, availability is the percentage of time that the product or system is able to perform its required functions.  When the required functions cannot be performed because a failure has occurred, the system must be repaired to restore the functionality.  This is where another measure, maintainability, affects system availability.  The faster the system can be repaired, the greater the availability to the customer.  For systems that require high reliability or availability, redundancy can improve the design.  Repairable systems benefit significantly more from redundancy than non-repairable systems, because a failed unit can be repaired while the redundant unit keeps the system running.
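As a rough illustration of how repair time drives availability, the sketch below uses the common steady-state approximation A = MTBF / (MTBF + MTTR). It assumes constant average failure and repair rates, and the parallel case assumes independent units that are repaired independently. The function names and all numbers are hypothetical, chosen only for illustration.

```python
def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Long-run fraction of time the system is up: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def parallel_availability(unit_availability: float, n_units: int) -> float:
    """Availability of n independent, independently repaired units in parallel.

    The system is up if at least one unit is up.
    """
    return 1.0 - (1.0 - unit_availability) ** n_units


# Hypothetical numbers: one failure per 1,000 h on average.
slow_repair = steady_state_availability(1000.0, 50.0)  # 50 h mean repair time
fast_repair = steady_state_availability(1000.0, 5.0)   # 5 h mean repair time
redundant = parallel_availability(slow_repair, 2)      # two units, 50 h repairs

print(f"50 h repairs:               {slow_repair:.4f}")
print(f"5 h repairs:                {fast_repair:.4f}")
print(f"two in parallel, 50 h each: {redundant:.4f}")
```

With the same failure behavior, either shortening the repair time or adding a repairable redundant unit raises availability, which is the point made in the paragraph above.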

Common metrics for each system type are shown in the table below.

| METRIC | NON-REPAIRABLE | REPAIRABLE |
| --- | --- | --- |
| Time to Failure | MTTF; Time to First Failure; Hazard Rate | MTBF; Time to First Failure; ROCOF/Failure Rate |
| Probability | Reliability | Availability (Reliability) |
| Maintainability | N/A | Maintainability; Downtime |
| Warranty | Product replacement within warranty period | Part/product replacement within warranty period |

The table below compares some additional areas of non-repairable systems and repairable systems.

| NON-REPAIRABLE | REPAIRABLE |
| --- | --- |
| Discarded (recycled?) upon failure | Restored to operating conditions without replacing entire system |
| Lifetime is a random variable described by a single time to failure | Lifetime is age of system or total hours of operation |
| Group of systems: lifetimes assumed independent and identically distributed (from same population) | Random variables of interest are times between failures and number of failures at a particular age |
| Failure rate is the hazard rate of a lifetime distribution, a property of time to failure | Failure rate is the rate of occurrence of failures (ROCOF), a property of a sequence of failure times |

Reliability modeling is usually more complex for repairable systems.  Often, methods like Markov models (chains) are required to adequately model repairable systems, as opposed to the simple series block diagram methods used for non-repairable systems.
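To give a feel for what such a Markov model looks like, here is a minimal sketch for two identical units in parallel sharing a single repair crew. It assumes exponentially distributed failure and repair times. The state is the number of failed units, so the chain is a birth-death chain and its steady state follows from simple balance equations. The rates and the function name are hypothetical.

```python
def birth_death_steady_state(up_rates, down_rates):
    """Steady-state probabilities of a birth-death Markov chain.

    up_rates[i]   : transition rate from state i to state i + 1
    down_rates[i] : transition rate from state i + 1 back to state i
    """
    weights = [1.0]
    for up, down in zip(up_rates, down_rates):
        # Balance between neighbors: pi[i] * up == pi[i + 1] * down
        weights.append(weights[-1] * up / down)
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical rates (per hour), for illustration only.
failure_rate = 1.0 / 1000.0  # each unit fails once per 1,000 h on average
repair_rate = 1.0 / 50.0     # single repair crew, 50 h mean repair time

# State = number of failed units (0, 1 or 2).
pi = birth_death_steady_state(
    up_rates=[2 * failure_rate, failure_rate],  # 0->1 (either unit fails), 1->2
    down_rates=[repair_rate, repair_rate],      # 1->0, 2->1 (one repair at a time)
)

availability = pi[0] + pi[1]  # system is down only when both units have failed
print("state probabilities:", [round(p, 5) for p in pi])
print(f"steady-state availability: {availability:.5f}")
```

A series block diagram cannot express the shared repair crew or the return from a failed state to an operating one, which is why a state-based model is used here.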

In the area of monitoring or analysis, the following table compares methods for both types of systems.

| METHOD | NON-REPAIRABLE | REPAIRABLE |
| --- | --- | --- |
| Weibull | Useful method (single failure modes only) | Not used at system level |
| Reliability Growth (Duane, AMSAA) | Usually not used | Used during development testing |
| Mean Cumulative Function (MCF) | Usually not used | Useful method (non-parametric) |
| Event Series (Point Processes) | HPP (for random, constant average rate events) | NHPP (parametric method), complex |

It is important to understand the type of system being designed and to use reliability methods and tools that match that system.  This may require some research, but using the wrong methods can produce misleading results.

What has been your experience in doing analysis of repairable systems compared to non-repairable systems?

Many of the “standard” reliability methods are intended for non-repairable systems. That is, when a component, sub-assembly or system fails, it is not repaired and returned to service. The Weibull distribution and other well-known distributions which effectively describe the time to failure assume the failures are “terminal”. That is, the whole system is replaced.

In contrast, repairable systems may fail multiple times during their lifetimes and this results in “recurrent events” in which system components may be repaired or replaced to bring the system back on line. In this case, a single system actually has multiple ages, i.e. components which have been repaired or replaced are “younger” than the rest of the system.

Reliability data consisting of recurrent events should be analyzed differently from time to failure data from non-repairable systems. In particular, it is important to recognize the sequence of the events for individual systems represented in the data. This is done by modeling the cumulative failures (or repairs, or costs) versus the system age (time). This model can then be used to predict the total failures (or repairs, or costs) at some future point in time.
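One common non-parametric way to build such a cumulative curve is the mean cumulative function (MCF). The sketch below is a minimal version: at each event age, the MCF increases by the number of events at that age divided by the number of systems still under observation. It assumes each system is observed from age zero up to a known end-of-observation age. The function name and the fleet data are hypothetical.

```python
def mean_cumulative_function(histories):
    """Non-parametric MCF estimate.

    histories: list of (event_ages, end_of_observation_age), one per system,
               where event_ages is a list of ages at which a repair occurred.
    Returns a list of (age, mcf) pairs, one per distinct event age.
    """
    event_ages = sorted({age for events, _ in histories for age in events})
    mcf = 0.0
    curve = []
    for age in event_ages:
        # Systems still being observed at this age.
        at_risk = sum(1 for _, end_age in histories if end_age >= age)
        # Events recorded across all systems at exactly this age.
        n_events = sum(events.count(age) for events, _ in histories)
        mcf += n_events / at_risk
        curve.append((age, mcf))
    return curve


# Hypothetical fleet: repair ages in hours, then the age at which observation ended.
fleet = [
    ([200, 650], 1000),
    ([400], 800),
    ([], 500),
    ([300, 700, 900], 1000),
]

for age, value in mean_cumulative_function(fleet):
    print(f"age {age:>4} h: {value:.3f} cumulative repairs per system")
```

The shape of the resulting curve carries the information: a roughly straight line suggests a constant rate of occurrence of failures, while a curve that bends upward suggests the rate is increasing with age.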

As an example, warranty data are a collection of recurrent events on many products in the field. Events include repair, replacement and preventive maintenance. Warranty data can be analyzed to estimate the cost of extending the time on a standard factory warranty. The resulting model can be used to estimate such things as cost per unit or number of repairs per unit. This information can then be used to decide whether revenue would increase sufficiently to make a longer warranty period beneficial.
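As a sketch of the warranty-extension calculation, suppose the cumulative repairs per unit have already been modeled as a function of age. The example below assumes a power-law form, M(t) = scale * t^shape, which is one common parametric choice and not necessarily the model used in practice. The parameter values, the cost per repair, and the function names are all hypothetical.

```python
def expected_repairs(age_months: float, scale: float, shape: float) -> float:
    """Expected cumulative repairs per unit by a given age (assumed power-law model)."""
    return scale * age_months ** shape


def extension_cost_per_unit(base_months: float, extended_months: float,
                            scale: float, shape: float,
                            cost_per_repair: float) -> float:
    """Expected added cost per unit of extending the warranty period."""
    extra_repairs = (expected_repairs(extended_months, scale, shape)
                     - expected_repairs(base_months, scale, shape))
    return extra_repairs * cost_per_repair


# Hypothetical values: extend a 12-month warranty to 24 months.
cost = extension_cost_per_unit(
    base_months=12, extended_months=24,
    scale=0.01, shape=1.2,     # assumed fitted model parameters
    cost_per_repair=150.0,     # assumed average cost of one repair
)
print(f"expected added cost per unit: {cost:.2f}")
```

Comparing this added cost per unit with the expected added revenue per unit is one way to frame the decision described above.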

Training and consulting are available for repairable systems applications.

Greg Larsen, MS, CRE
Senior Reliability Consultant
gregl@opsalacarte.com