Monthly Archives: July 2011

When the manufacturer of a material used across an entire product line informs you that it can no longer supply that material, finding a replacement that performs to the same specifications can be difficult.  Once you’ve identified a number of potential alternative suppliers and materials, what’s next?  How the material is used in the products, and the percentage of parts in the product line made from it, will largely determine how stringent and extensive the qualification process that follows should be.

My company recently faced exactly this situation.  A plastics manufacturer was having trouble meeting demand and informed us that the material would no longer be available in six months.  The typical qualification process consists of several destructive and non-destructive tests meant to evaluate the integrity of the candidate replacement material relative to the original.  The tests common to our methods include moldability tests, drop testing, tensile testing, UV exposure for color-shift effects, and modest weatherability tests (more at environmental testing) for plastic warping.  The moldability tests included 30-piece sample measurements of critical and overall dimensions, mold-flow testing, and visual inspection of all features to ensure there were no short shots or other molding issues.  To elaborate on the ‘weatherability tests’: the materials were subjected to temperature chambers and run through high-heat and thermal-cycling routines, but the only data collected were dimensional measurements taken before and after, to gauge dimensional drift.
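As a rough illustration of that dimensional-drift check, the sketch below compares a 30-piece sample of one critical dimension before and after thermal cycling. All numbers here (nominal, tolerance, and the simulated shift) are invented placeholders, not our actual measurement data:

```python
import numpy as np

# Hypothetical example: a 30-piece sample of one critical dimension (mm),
# measured before and after thermal cycling, to gauge dimensional drift.
rng = np.random.default_rng(0)
nominal, tol = 25.40, 0.10                     # assumed nominal and +/- tolerance
before = rng.normal(nominal, 0.02, 30)         # pre-cycling measurements
after = before + rng.normal(0.015, 0.01, 30)   # simulated post-cycling shift

drift = after - before
print(f"mean drift: {drift.mean():.4f} mm, std: {drift.std(ddof=1):.4f} mm")
print("all parts in spec after cycling:",
      bool(np.all(np.abs(after - nominal) <= tol)))
```

In practice the pass/fail call would be made against the drawing tolerance for each critical dimension, not a single blanket tolerance as in this sketch.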

The testing up to this point constitutes a relatively comprehensive platform and is a good overall gauge of the performance of the plastic as it is today.  However, we realized it was lacking one major element: the performance of the plastic in five years, near the end of its warranty period.  Only one test was designed to evaluate the “performance” of the plastic after five years, and that was the UV test.  That test, however, does not evaluate the mechanical properties or performance of the material, only its physical appearance.  Moreover, no acceleration factor was ever established for this test, only a subjective period of time for which the plastics are put into a UV chamber.  The only testing that evaluates material properties is the drop and tensile testing performed on newly molded plastic.

This is where reliability becomes useful.  Reliability forces you to consider the effects of the elements on the specimen, material, or device at hand not just at the present moment, but several hours, duty cycles, or years down the line.  Since the mechanical integrity of the material near the end of the warranty period had never been considered in this type of testing, it was up to us to determine how to do that.

We set out to determine what factors would accelerate the life of the material.  After debating whether high-temperature soaking would act as a stressor and age the material the equivalent of five years, we realized it probably would not: it would most likely only bring the plastic back up to its heat-deflection temperature and relieve residual stresses, perhaps allowing it to perform better in certain tests.

The next potential stressor was UV.  It was easily agreed that UV exposure would affect the mechanical properties of the plastic, but we needed a way to determine its equivalent five-year exposure.  The method we decided to implement was a three-point test with molded tensile bars: exposing the material to UV for 50 days, 75 days, and 100 days.  Once the material has finished the aging process, we intend to subject the aged bars to tensile testing alongside ‘virgin’ tensile bars.  As of this writing, the units are still aging in the UV chamber.  With 20 pieces at each exposure level, we should be able to determine the relationship between UV exposure and tensile strength with relatively high confidence.  Results pending.
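Once the aged bars come out of the chamber, the analysis could look something like the sketch below: a simple regression of tensile strength against exposure time across the four groups (0, 50, 75, 100 days, 20 bars each). Every number here is an invented placeholder, including the acceleration factor, since the real data are still in the chamber:

```python
import numpy as np

# Hedged sketch of the planned analysis: fit tensile strength against UV
# exposure time. Strength values below are simulated stand-ins for the
# pending test results.
rng = np.random.default_rng(1)
days = np.repeat([0, 50, 75, 100], 20)                 # 20 bars per level
strength = 65.0 - 0.04 * days + rng.normal(0, 1.0, days.size)  # MPa, assumed

slope, intercept = np.polyfit(days, strength, 1)
print(f"estimated degradation: {slope:.3f} MPa per chamber day of UV")

# Extrapolation to a 5-year field dose requires an acceleration factor,
# which must be established separately (placeholder: 1 chamber day ~ 18
# field days).
accel = 18.0
print(f"predicted strength at 5 field years: "
      f"{intercept + slope * (5 * 365 / accel):.1f} MPa")
```

With 80 data points the slope estimate is tight, but the field prediction is only as good as the assumed chamber-to-field acceleration factor.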

See the Ops A La Carte Seminar on Fundamentals of Climatic Testing

A great deal of development effort is going toward “gearless” wind turbines, because the gearbox remains one of the single largest reliability problems in the wind-turbine industry.

On wind turbines: “Current designs can’t be scaled up economically. Most of the more than 25,000 wind turbines deployed across the United States have a power rating of three megawatts or less and contain complex gearbox systems. The gearboxes match the slow speed of the turbine rotor (between 15 to 20 rotations per minute) to the 2,000 rotations per minute required by their generators. Higher speeds allow for more compact and less expensive generators, but conventional gearboxes—a complex interaction of wheels and bearings—need regular maintenance and are prone to failure, especially at higher speeds.”

“On land, where turbines are more accessible, gearbox maintenance issues can be tolerated. In rugged offshore environments, the cost of renting a barge and sending crews out to fix or maintain a wind-ravaged machine can be prohibitive. “A gearbox that isn’t there is the most reliable gearbox,” says Fort Felker, director of the National Renewable Energy Laboratory’s wind technology center.”

“To achieve the power output of a comparable gearbox-based system, a direct-drive system must have a larger internal diameter that increases the radius—and therefore the speed—at which its magnets rotate around coils to generate current. This also means greater reliance on increasingly costly rare-earth metals used to make permanent magnets.”

For more information, see:
and another post here: Wind Turbines Gearbox Reliability


While product reliability has become a major concern for most organizations, many have overlooked developing good reliability specifications. This oversight can result in ambiguous and purposeless reliability testing during the validation phase of product development. Effective reliability testing requires a well-defined reliability specification. After all, the prime objective of a reliability engineering program is to test and assess product reliability.

A common element that is widely ignored yet critical to a sound reliability specification is the definition of equipment failure. Even the most rigorous reliability-testing program is of little use if the product being tested has poorly defined failure parameters. This article discusses the essential requirements for establishing concise and effective reliability specifications, and proposes a method for defining equipment failure.


Among the requirements often used to specify equipment reliability is “mean time between failures” (MTBF), which is verified during subsystem and system reliability testing. Essential to reliability testing is an agreed-upon definition of equipment failure, and it must be established at the earliest stages of product development. It may seem fairly obvious whether a product has failed or not, but such a definition is necessary for a number of reasons.

One of the most important reasons is that different manufacturers may have different definitions of what sort of behavior actually constitutes a failure. Identical tests performed on the same equipment by different groups may produce radically different results simply because the groups define product failure differently. This can result in performance values that differ significantly even when the underlying equipment does not. There are cases where one manufacturer has claimed an MTBF of 2,000 hours while another has reported 100 hours for the same product. This discrepancy may not mean that one has a vastly superior product; rather, it may stem from differences in how each defines and assesses a “failure”.
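To make this concrete, here is a toy illustration with an invented event log showing how the same test run can yield very different MTBF figures under two failure definitions:

```python
# Illustration (invented event log): the same data yield different MTBF
# values depending on what counts as a "failure".
events = [
    # (runtime hours since last event, type of interruption)
    (180, "adjustment"), (220, "part_replaced"), (150, "adjustment"),
    (300, "adjustment"), (250, "part_replaced"), (190, "adjustment"),
]
total_hours = sum(h for h, _ in events)

def mtbf(counts_as_failure):
    """MTBF = total runtime / number of events counted as failures."""
    failures = sum(1 for _, kind in events if kind in counts_as_failure)
    return total_hours / failures if failures else float("inf")

# Supplier A: only part replacements are failures.
print("MTBF (replacements only):", mtbf({"part_replaced"}))      # 645.0
# Supplier B: every unplanned interruption is a failure.
print("MTBF (all interrupts):", mtbf({"part_replaced", "adjustment"}))  # 215.0
```

Same machine, same log, a threefold difference in the reported MTBF: exactly the kind of discrepancy a shared failure definition is meant to prevent.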

In an effort to normalize reliability criteria, standards such as SEMI E10 have been created to give customers and suppliers in semiconductor manufacturing a guideline for measuring reliability, availability, and maintainability (RAM). SEMI E10 defines an interrupt as the equipment’s inability to perform its intended function due to the occurrence of assists or failures; that is [1]:

  • Equipment Interrupt = Sum of all Failures + Sum of all Assists

It further defines assists and failures as:

Assist: Any unplanned interruption that occurs during equipment operation where all three of the following conditions apply:

  • Equipment operation is resumed through external intervention.
  • There is no replacement of a part, other than specified consumables.
  • There is no further variation from specifications of equipment operation.

Failure: Any unplanned interruption or variance from the specifications of equipment operation other than assists.

From the above definitions, one might conclude that a failure involves replacement of a part while an assist is any other external intervention. In practice, however, many customers view machine performance in terms of the cost of operation. A customer may not favor equipment that requires frequent external interventions (e.g., equipment adjustments), since they would need to allocate significant resources to operate it. For this reason, customers may tend to view equipment adjustments as failures too. Thus, even with standards, a struggle persists between suppliers and customers over how to classify an interrupt.

Equipment Interrupt (Failure/Assist) Classification

This article proposes classifying failures and assists by considering the mode of recovery; that is, categorizing them by the means through which the customer remedies the problem.

Just as reliability testing must simulate product usage in the field, failures and assists should be profiled according to the way customers repair them. In a semiconductor production environment, many customers categorize equipment-repair activities by the nature of the interrupt. They have repair policies that allocate recovery actions to machine operators and engineering technicians. This recovery plan is practical because different interrupts require different skill sets to repair. The plan requires that the customer have a good understanding of the equipment’s operating behavior, so they can accurately staff maintenance and repair personnel.

This recovery-plan technique can serve as a foundation for assist and failure classification. In other words, assists may be classified as machine-induced interrupts that are recovered by a machine operator, whereas failures are machine-induced interrupts that require skilled technicians for comprehensive troubleshooting and in-depth corrective action. Deploying this method also helps determine the cost of equipment ownership: it costs less to resolve an assist, which a machine operator can repair, and more to resolve a failure, since technician involvement is required.
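A minimal sketch of this recovery-mode classification is shown below; the event records and recovery assignments are invented for illustration:

```python
# Sketch of the proposed classification: an interrupt is an "assist" if a
# machine operator can recover it, and a "failure" if a skilled technician
# is required. The log entries are illustrative, not real equipment data.
from dataclasses import dataclass

@dataclass
class Interrupt:
    description: str
    recovered_by: str   # "operator" or "technician"

def classify(event: Interrupt) -> str:
    return "assist" if event.recovered_by == "operator" else "failure"

log = [
    Interrupt("wafer mis-seated, reloaded", "operator"),
    Interrupt("alignment drift, recalibrated", "operator"),
    Interrupt("drive belt replaced", "technician"),
]

counts = {"assist": 0, "failure": 0}
for event in log:
    counts[classify(event)] += 1
print(counts)   # failure count feeds MTBF; assist count feeds MTBA
```

Keeping the two tallies separate lets the same log drive both an MTBF figure (failures only) and a mean-time-between-assists figure, which together reflect the total cost of ownership.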


Product specifications are no longer limited to functionality measures (e.g., speed, capacity, range), because for a product with poor reliability that is seldom available for use, functionality measures are meaningless. The reliability specification is the backbone of a reliability program and a prerequisite for reliability testing. Without it, implementing a reliability program will be a difficult and frustrating process. A typical equipment reliability specification includes performance indices such as MTBF, and it must always be accompanied by a clear definition of failure. Effective reliability testing hinges on a clear definition of equipment failure; without this definition as a baseline, any reliability discussion becomes meaningless.

This article has stressed the importance of defining equipment failure as an integral part of the reliability specification, and proposed a method for classifying equipment interrupts.


1.     SEMI E10-99, “Standard for Definition and Measurement of Equipment Reliability, Availability, and Maintainability (RAM),” SEMI (Semiconductor Equipment and Material International), 805 East Middlefield Road, Mountain View, CA 94043, 1999.

Online dictionaries define the word simulation as:

• the act or an instance of simulating
• a representation of a problem, situation, etc, in mathematical terms, esp using a computer
• imitation or enactment, as of something anticipated or in testing.
• the act or process of pretending; feigning.

Simulation is an integral part of reliability testing.

On September 14th, Ops A La Carte and Tribal Engineering will take a look at simulation and how it can be used in reliability testing with their Integrating Simulation into a Reliability Program webinar.

Examples of simulation tools include techniques like Finite Element Analysis (FEA), Monte Carlo analysis, and Probabilistic Design System (PDS). At this webinar we will show you how to integrate tools like these into a reliability program to optimize your reliability results.
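As a small taste of the Monte Carlo technique, here is a minimal stress-strength interference sketch; all distribution parameters are invented for illustration:

```python
import numpy as np

# Minimal Monte Carlo sketch: estimate the probability that applied stress
# exceeds part strength under a stress-strength interference model. The
# normal distributions and their parameters are assumed, not real data.
rng = np.random.default_rng(42)
n = 100_000
strength = rng.normal(50.0, 4.0, n)   # assumed part-strength distribution (MPa)
stress = rng.normal(38.0, 5.0, n)     # assumed applied-stress distribution (MPa)

p_fail = np.mean(stress > strength)
print(f"estimated probability of failure: {p_fail:.4f}")
```

The same pattern scales to far richer models: swap in Weibull strength, correlated loads, or outputs from an FEA run, and the failure-probability estimate comes out the same way.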

If you have been involved in the field of reliability in the past several years, you have most likely heard the term “Design for Reliability” (DfR). More and more companies these days advertise DfR as part of their design process, and more and more web-based and live training is offered on DfR methods. But what really is “Design for Reliability”? The short answer: it hasn’t been defined yet. Terminology-wise, DfR is becoming the next Failure Rate, Durability, or MTBF: terms many engineers think they understand, but often don’t and consequently misapply.

The meaning of DfR is intuitively obvious: it is an approach to design which (in contrast to the “test-analyze-fix” philosophy) moves reliability-focused activities to the earlier phases of design and helps engineers design reliable products using science- and analysis-based methods. However, the devil is always in the details.

Various attempts to define Design for Reliability have been made in the past several years, including ReliaSoft’s:

Design for Reliability: Overview of the Process and Applicable Techniques

Ops A La Carte DfR Seminar:

Andre Kleyner and Mike Silverman’s presentation at the Applied Reliability Symposium 2011

New DfR chapter in Practical Reliability Engineering 5th Edition (P. O’Connor and A. Kleyner, to be published December 2011)
and several others.

However, the questions remain: What are the exact DfR tools? What are the specific activities? When are they best applied? Who is responsible? What metrics should be used? I believe the engineering community needs to continue defining DfR and building up its practices, in order to take the confusion out of this subject and eventually turn this loosely defined discipline into an essential ingredient of reliability science.

Andre Kleyner, Delphi Electronics & Safety, Purdue University.