Monthly Archives: February 2012

International Reliability Innovations Symposium (IRIS)

March 16, 2012, San Jose, CA and via WEBINAR

The IEEE Reliability Society of Silicon Valley is sponsoring a conference specifically devoted to innovation. We have just opened the "Call for Papers". Deadline for submission is January 20, 2012.

For more info, please contact us.

FREE WEBINAR on Tribology and Reliability

March 7, 2012 – 11:30am-12:30pm

This is part of our monthly FREE Webinar series

Tribology is the science and engineering of interacting surfaces in relative motion. In this seminar, we will introduce the basics of tribology and its impact on reliability from the perspective of machine applications. Lubrication, film thickness, loads, and Hertzian contact types all affect the reliability of a design. Weibull is frequently used to model wear characteristics, but is it always the best distribution for characterizing the reliability of machine elements? All of these areas will be explored.

For more info or to register, please contact us.

Traditional Mahalanobis distance is a generalized distance, which can be considered a measure of the degree of similarity (or divergence) in the mean values of different characteristics of a population, considering the correlation among the characteristics. It has been used for many years in clustering, classification, and discriminant analysis. Mahalanobis distance is attributable to Prof. P.C. Mahalanobis, founder of the Indian Statistical Institute some 60 years ago. Mahalanobis distance has been used for various types of pattern recognition, e.g. inspection systems, face and voice recognition systems, counterfeit detection systems, etc. The figure below displays data published by Fisher (1936) and cluster analysis, where classification into three predetermined categories is demonstrated.

Another generalized distance most engineers have encountered is the Euclidean distance between two multivariate points p and q. If p = (p1, p2, …, pn) and q = (q1, q2, …, qn) are two points in Euclidean n-space, then the distance from p to q, or from q to p, is given by:

d(p, q) = sqrt((q1 − p1)² + (q2 − p2)² + … + (qn − pn)²)

No consideration is given to the correlation between characteristics in Euclidean distance calculations.
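
To make the difference concrete, here is a minimal Python sketch (using NumPy) that compares the two distances for a pair of points. The covariance matrix and the two test points are made-up illustration values, not data from this article.

```python
import numpy as np

def euclidean(p, q):
    """Straight-line distance between two points in n-space."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sqrt(np.sum((p - q) ** 2)))

def mahalanobis(x, center, cov):
    """Distance from x to center, weighted by the inverse covariance."""
    d = np.asarray(x, dtype=float) - np.asarray(center, dtype=float)
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

# Hypothetical population: two characteristics with strong positive correlation.
center = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])

along = np.array([1.0, 1.0])     # both characteristics move together
against = np.array([1.0, -1.0])  # characteristics move in opposite directions

euclid_along = euclidean(along, center)
euclid_against = euclidean(against, center)
mhd_along = mahalanobis(along, center, cov)
mhd_against = mahalanobis(against, center, cov)

print(f"Euclidean:   {euclid_along:.2f} vs {euclid_against:.2f}")
print(f"Mahalanobis: {mhd_along:.2f} vs {mhd_against:.2f}")
```

Both points are the same Euclidean distance (about 1.41) from the center, but the point that violates the correlation has a Mahalanobis distance of about 4.47, versus about 1.03 for the point that follows it.
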

Dr. G. Taguchi of Ohken Associates, Japan, developed an innovative method for determining the generalized distance from the centroid of a reference group (of multivariate data) to a multivariate point. For example, if a doctor had a group of very healthy patients whose vital characteristics (blood pressure, body temperature, skin color, heart rate, respiration rate, etc.) were all considered exemplary, then he could define a Mahalanobis space, a reference space, with those healthy folks, use the centroid as the zero point, and define a unit distance for a continuous degree-of-health scale. If a not-so-healthy person came to the same doctor and the same characteristics were measured, that person would have a Mahalanobis distance (MHD) number much higher than the reference group. The MHD number would indicate the patient's generalized distance from the centroid of the healthy group. As time passed, the MHD number for the not-so-healthy patient could increase (or decrease), depending on whether his health was failing or improving, respectively. In general, very healthy people tend to look quite similar, while unhealthy people tend to look quite different from one another (and from the healthy group). In addition, the changes in correlation structure among the unhealthy patients' characteristics strongly affect their MHD numbers. If a person's MHD number reached a predetermined high threshold value, for example, the doctor might recommend hospitalization. If the MHD became similar to those of the reference group, the patient could be recommended for simple periodic doctor visits.
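
The reference-space idea can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the patient measurements are invented, and the scaling (squared distance divided by the number of characteristics, so that the reference group averages exactly 1) is one common convention that is assumed here, since the article does not give the formula.

```python
import numpy as np

def build_reference_space(reference):
    """Mean, standard deviation, and correlation matrix of the reference group."""
    reference = np.asarray(reference, dtype=float)
    mean = reference.mean(axis=0)
    std = reference.std(axis=0)          # population form (ddof=0)
    z = (reference - mean) / std         # standardized reference data
    corr = z.T @ z / len(reference)      # correlation matrix
    return mean, std, corr

def scaled_mhd(x, mean, std, corr):
    """Squared Mahalanobis distance divided by the number of characteristics."""
    z = (np.asarray(x, dtype=float) - mean) / std
    return float(z @ np.linalg.solve(corr, z)) / len(mean)

# Hypothetical healthy group: systolic pressure (mmHg), heart rate (bpm), temperature (deg C).
healthy = [
    [118, 62, 36.6], [121, 66, 36.8], [115, 60, 36.5], [124, 70, 36.9],
    [119, 64, 36.7], [122, 67, 36.6], [117, 61, 36.8], [120, 65, 36.7],
]
mean, std, corr = build_reference_space(healthy)

reference_scores = [scaled_mhd(row, mean, std, corr) for row in healthy]
patient_score = scaled_mhd([150, 95, 38.4], mean, std, corr)  # hypothetical unhealthy patient

print(f"average MHD of reference group: {np.mean(reference_scores):.3f}")
print(f"MHD of the new patient:         {patient_score:.1f}")
```

With this scaling the reference group averages 1 by construction, which gives the unit distance for the scale, and the invented unhealthy patient lands far above every member of the reference group.
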

From any number of multivariate characteristics measured, it is possible to readily identify those characteristics which are most important (in a Pareto sense). Reducing the cost of measurement is an important consideration for many enterprises. There is usually a subset of measurements that provides all the data necessary to make correct decisions. Strong correlations between measurements make it possible to eliminate measures that add little value. The information contained in a handful of multivariate measurements may be sufficient to identify abnormal conditions.

A medical trend chart of MHD illustrates the relative level of health of a person as a function of time. For example, daily collection of data for a patient, along with daily estimation of MHD, could be used to track overall health improvements (or deteriorations). Increasing trends could be used for prognostics, to initiate preventive countermeasures before a threshold condition is reached. The corrective effect of the countermeasure could be captured in the MHD number from the following days. Multivariate process control charts, like Shewhart and CUSUM charts, are similar, but these are based on probabilistic control limits derived from various statistical distribution assumptions. No such assumptions are made with MHD. Rather, cost considerations are used to set limits.

For manufactured products, multivariate measures from testing are typically collected following final assembly. If we assume that the health of a manufactured product is analogous to the health of a patient, we could use similar methods to identify abnormal conditions and calculate a continuous MHD number for the multivariate condition. By collecting a group of manufactured systems with exemplary performance, a Mahalanobis space could be constructed from the multivariate characteristics. A zero point and unit distance scale would be estimated as before. The system's health could be diagnosed at t=0, just after assembly, and later at intervals dictated by a data collection schedule. The manufactured product could easily be classified into normal and abnormal states at t=0, and the product's tendency to become abnormal could be tracked.
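
Tracking a tendency toward the abnormal state can be sketched as a simple trend extrapolation. The straight-line fit, the MHD readings, and the threshold below are all assumptions made for illustration; the article does not prescribe a particular trending method, and in practice the threshold would come from cost considerations.

```python
import numpy as np

def days_until_threshold(days, mhd, threshold):
    """Fit a straight line to MHD versus time and extrapolate to the threshold.

    Returns the number of days after the last reading at which the fitted
    line reaches the threshold, or None if the trend is flat or decreasing.
    """
    slope, intercept = np.polyfit(days, mhd, 1)
    if slope <= 0:
        return None
    crossing_day = (threshold - intercept) / slope
    return float(max(0.0, crossing_day - days[-1]))

# Hypothetical daily MHD readings for one unit, drifting away from the reference group.
days = [0, 1, 2, 3, 4]
mhd = [1.0, 1.5, 2.0, 2.5, 3.0]
threshold = 5.0  # hypothetical action limit

remaining = days_until_threshold(days, mhd, threshold)
print(f"estimated days until threshold: {remaining:.1f}")
```

For these invented readings the fitted line rises 0.5 per day, so the threshold is projected about four days after the last reading, which is the window available for a preventive countermeasure.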

The MHD measure can be utilized for many interesting industrial problems, including fault detection, fault isolation, degradation identification, and prognostics. For example, the decision to deploy an air bag relies on the ability to first establish a reference space for normal everyday driving, and then to release the air bags when multivariate shock loads and accelerations exceed a threshold value. This is fault detection. Fire alarms should actuate when various fire conditions exist over and above those expected from simple kitchen cooking or cigarette smoking. A multivariate reference space would be collected from normal cooking conditions, and an abnormal fire condition would be declared above some threshold value. The tendency to fail for a high-volume printer with multivariate sensor data could be inspected periodically, and a service agent could be dispatched, or an electronic countermeasure applied, before the customer ever noticed. Availability of the printer would be higher without the fault downtime, and customer satisfaction would be higher.

Warranty Chain Management (WCM) Conference

March 6-8, 2012, Orlando, Florida

Ops A La Carte’s Fred Schenkelberg will be presenting a workshop on "Five Ways To Reduce Warranty Costs".

Please contact Ops A La Carte for more information.

This is an application where two Zener diodes were placed in series, in a back-to-back configuration. They were placed across the primary winding of a transformer used to apply modulation to an AM transmitter. The modulation was in the form of a single frequency, and the modulation level was not to exceed 30% by specification. The modulation level showed some variation with temperature, so the diodes were selected to limit the voltage across the transformer primary and ensure that the 30% modulation limit would not be exceeded.

The drive for the transformer/limiter combination was from a low-impedance source, adjusted to provide 30% modulation peaks during final test. The pole-mounted transmitter was required to operate in all-weather, unsheltered conditions at any airport in the United States.

The designer, who was an outside consultant, made some assumptions:

  1. The peak voltage for 30% modulation was 12 Volts, so two 11 Volt diodes were placed in back to back series configuration where one would provide an 11 volt drop in the reverse direction, and the other would provide a 1 volt (approximately) drop in the forward direction, meeting the 12 Volt requirement;
  2. The diodes operate at low dissipation because they are non-conducting, except for excessive peak conditions;
  3. The Reliability and Component Engineering functions of the company could be bypassed because they found too many things wrong, and their input cost too much.

Then there was real life:

  1. Field returns with discoloration on the circuit boards under the diodes;
  2. Field returns with no modulation;
  3. Field returns with modulation intermittently greater than 30%;
  4. Field returns with modulation intermittently low;
  5. By the time the field returns were on the receiving dock, the consultant was long gone.

The returns were turned over to the Failure Analysis lab of the Components Engineering function, and the design was examined by a Reliability Engineer and a Components Engineer.

The diode characteristics were determined to be:

VZ = 11V ± 5% @ 23 mA and TJ = 25°C;

PD = 1 Watt, maximum;

Temperature coefficient of voltage = +0.06%/°C typical;

VF = 1.2 V @ 200 mA, maximum.


Since the diode forward voltage drop would be expected to be considerably lower at low current, the diode forward drop could reasonably be assumed to be approximately 1 volt making the total drop of the diode set approximately 12 volts in either polarity.  At first glance, the initial design assumptions appear to be reasonable.

A simple tolerance analysis begins to show the problem.  The ±5% tolerance on the zener voltage equals 550 mV, placing the zener voltage in the range of 10.45V to 11.55V.  If we assume that the diode forward drop remains constant at 1V, the series combination can have a total voltage drop range of 11.45V to 12.55V.
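
The tolerance arithmetic above can be reproduced in a few lines of Python. The constant names are chosen for this sketch; the values are the ones quoted in the text.

```python
VZ_NOMINAL = 11.0     # volts, nominal zener voltage at 25 deg C
VZ_TOLERANCE = 0.05   # +/-5 % tolerance on the zener voltage
VF_ASSUMED = 1.0      # volts, forward drop assumed constant

# Zener voltage at the low and high tolerance limits.
vz_min = VZ_NOMINAL * (1 - VZ_TOLERANCE)
vz_max = VZ_NOMINAL * (1 + VZ_TOLERANCE)

# One diode in reverse breakdown plus one forward-biased diode.
clip_min = vz_min + VF_ASSUMED
clip_max = vz_max + VF_ASSUMED

print(f"Zener voltage:    {vz_min:.2f} V to {vz_max:.2f} V")
print(f"Clipping voltage: {clip_min:.2f} V to {clip_max:.2f} V")
```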

A further complication is the temperature coefficient of the Zener voltage.  The operating temperature range over all US airports is on the order of -55 °C to +55 °C.  The temperature coefficient applies to the value of the zener voltage at the tolerance extremes, yielding two values of Zener voltage at each temperature extreme.

The calculation for the zener voltage over temperature is straightforward:

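VZ(at temp) = VZ + Tempco × VZ × ΔTemp

where VZ is the zener voltage at 25°C and ΔTemp is the difference between the operating temperature and 25°C.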

Calculation at low and high tolerance and low and high temperature yields four values:

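Zener Voltage over Tolerance and Temperature

Zener Voltage at 25°C       at -55°C     at +55°C
10.45 V (low tolerance)     9.948 V      10.638 V
11.55 V (high tolerance)    10.996 V     11.758 V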

The forward-biased diode is also affected by temperature. It has a temperature coefficient of voltage of -2 mV/°C, which yields a change in the forward voltage drop that is opposite in polarity to the change in the zener voltage.

Combining the forward voltage over temperature with the zener voltage over tolerance and temperature yields the clipping voltage:

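Clipping Voltage over Tolerance and Temperature

Zener Voltage at 25°C       at -55°C     at +55°C
10.45 V (low tolerance)     11.108 V     11.578 V
11.55 V (high tolerance)    12.156 V     12.698 V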

Based on this, the new clipping voltage range is 11.108V to 12.698V.
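
The corner calculation behind this range can be sketched in Python as follows. The function and constant names are chosen for this sketch. The 1 V forward drop at 25°C is the assumption stated earlier, and the temperature coefficients are applied linearly from the 25°C datasheet reference.

```python
T_REF = 25.0          # deg C, datasheet reference temperature
VZ_NOMINAL = 11.0     # volts
VZ_TOLERANCE = 0.05   # +/-5 %
VZ_TEMPCO = 0.0006    # +0.06 %/deg C, as a fraction of VZ
VF_REF = 1.0          # volts, assumed forward drop at 25 deg C
VF_TEMPCO = -0.002    # volts/deg C

def zener_voltage(vz_25, temp_c):
    """Zener voltage at temp_c, given its value at 25 deg C."""
    return vz_25 + VZ_TEMPCO * vz_25 * (temp_c - T_REF)

def forward_voltage(temp_c):
    """Forward drop of the conducting diode at temp_c."""
    return VF_REF + VF_TEMPCO * (temp_c - T_REF)

def clipping_voltage(vz_25, temp_c):
    """Total drop of the back-to-back pair: one zener drop plus one forward drop."""
    return zener_voltage(vz_25, temp_c) + forward_voltage(temp_c)

limits = {
    "low": VZ_NOMINAL * (1 - VZ_TOLERANCE),
    "high": VZ_NOMINAL * (1 + VZ_TOLERANCE),
}

# Evaluate the four corners: tolerance limit x temperature extreme.
corners = {}
for name, vz_25 in limits.items():
    for temp_c in (-55.0, 55.0):
        corners[(name, temp_c)] = clipping_voltage(vz_25, temp_c)
        print(f"{name:>4} tolerance at {temp_c:+.0f} C: {corners[(name, temp_c)]:.3f} V")

clip_min = min(corners.values())
clip_max = max(corners.values())
print(f"Clipping range: {clip_min:.3f} V to {clip_max:.3f} V")
```

The minimum occurs with a low-tolerance diode at -55°C and the maximum with a high-tolerance diode at +55°C, so the 12 V target sits inside a spread of roughly 1.6 V.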

Further conditions that affect the temperature range over which clipping begins are internal temperature rise in the box, and direct heating of the box by sunlight.  Considering these would unnecessarily complicate this discussion.

Production test:

In order to reduce costs, the production boxes were only tested at room temperature.  Two engineering units were successfully temperature tested for type approval, and on that basis, full temperature testing was waived on the production units.

Boxes with diodes that had clipping voltages of 12 V and higher sailed through test without problem, since the modulation voltage could be set to peak at 12 V, meeting the 30% modulation requirement.  In the field, boxes that operated at elevated temperatures operated normally, with no diode failures.  These same boxes, when operated at low temperature, depending upon the exact clipping voltage, frequently failed for low modulation because their temperature coefficients forced the clipping voltage below 12 V.  In a few cases, the self-heating of the diodes due to the dissipation from clipping allowed their clipping voltages to reach thermal equilibrium near 12 V, allowing the boxes to operate satisfactorily.  The modulation level was a function of the ambient temperature, causing intermittent failures on some boxes.

Boxes that contained diodes with clipping voltages below 12 V, for the most part, also sailed through test without problem. This seems unreasonable, since it should not have been possible for those boxes to be set at 12 V. Here, the positive temperature coefficient of the zener voltage came to the rescue. As the test technicians increased the modulation toward 30%, diodes that broke down below 12 volts began to conduct, thereby warming up and raising their breakdown voltage. With enough drive, many of the diodes could be driven hard enough to reach thermal equilibrium at 12 volts. Typically, most of these diodes were heavily over-dissipated. When these boxes reached the field, they operated for a time, but eventually the diodes began to get leaky, leading to increased dissipation and a subsequent short. Some of these boxes also displayed intermittent failure of the modulation level due to changes in ambient temperature.


The basic design approach to limiting the modulation level was flawed.  The diode clipper was intended to limit the temperature variability of the modulation source.  Instead, it induced additional temperature sensitivity and a high failure rate.  This approach was deemed to be less expensive than a design where the modulation level was actively sensed, and feedback applied to control the modulation source.

The initial assumptions did not take into account the zener diode voltage tolerance and temperature coefficient. This is the root cause of all of the failed units returned from the field. It is interesting to note that whether the symptom was low modulation, high modulation, or burned boards and shorted diodes, the failure is traceable to the same source: inattention to, and/or lack of understanding of, the basic operating parameters of the zener diode.

Lessons learned:

  1. Read and understand the datasheet;
  2. Bypassing oversight is more expensive in the end;
  3. Using engineering units for type approval carries the risk that the units may not represent the production items;
  4. Validation of the qualifications of your consultant is paramount to success.