Physical acceleration means that operating a unit at higher stress levels (e.g., higher temperature, voltage, humidity, or duty cycle) should produce the same failures that would occur at typical-use stresses, except that they are expected to occur much sooner.
Failures may be due to mechanical fatigue, corrosion, chemical reaction, diffusion, migration, etc. These are the same failure causes that act under normal stress conditions; the only difference is the time scale (the time to failure).
When there is true acceleration, changing stress is equivalent to transforming the time scale used to record when failures occur. The transformations commonly used are linear, which means that time-to-fail at high stress just has to be multiplied by a constant (the Acceleration Factor or AF) to obtain the equivalent time-to-fail at use stress.
For many engineers, this is where the biggest challenges arise: what is the preferred model? How do I fit the model to the conditions, materials, and physical attributes of the unit under test (UUT)?
Too many knowledgeable users default to the Arrhenius equation, but it is not always the preferred method, and care must be taken to apply the proper model.
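As a rough illustration of the linear time-scale transformation described above, the sketch below computes an acceleration factor with the Arrhenius model and applies it to a time-to-fail. The activation energy (0.7 eV) and the two temperatures are illustrative assumptions, not values from this post, and Arrhenius is only one candidate model: it is appropriate only when the failure mechanism is actually thermally activated.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor between use and stress temperatures (deg C)."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp((ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k))


def equivalent_use_time(time_to_fail_at_stress, af):
    """Linear time-scale transformation: t_use = AF * t_stress."""
    return af * time_to_fail_at_stress


# Assumed example values: Ea = 0.7 eV, use at 55 C, stress at 125 C.
af = arrhenius_af(ea_ev=0.7, t_use_c=55.0, t_stress_c=125.0)
print(f"AF = {af:.1f}")
print(f"1000 h at stress is roughly {equivalent_use_time(1000.0, af):.0f} h at use")
```

The same `equivalent_use_time` step applies to any linear acceleration model; only the function that produces the AF changes.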
What are the experiences with other models?
What are the success stories?
Mechanical failures can occur at any point in a product life cycle. They can be divided into infant mortality, constant failure rate, and wear-out. As shown in the diagram, which is a plot of failure rate as a function of time, the individual curves of these three different classes of failure mechanisms sum together to form the classic bathtub curve of observed failure rate.
Infant mortality occurs early during product use. The failure rate declines as a function of time, so reliability actually increases until a point is reached where the constant failure rate becomes dominant and the infant rate becomes negligible. The Weibull Distribution is used to model the infant mortality period. A wear-in or burn-in period may be used to screen out defective units. Infant mortality is typically caused by defects in manufacturing, handling, and storage. Examples of these causes are:
- workmanship and assembly errors
  - misalignment of belts, pulleys, gears, shafts and bearings
  - over-tightening stresses parts, causes excessive friction
  - under-tightening leaves parts loose to fall off, vibrating shafts
- parts are out of tolerance from design specifications
  - rubbing contact of moving parts
  - loose fits lead to vibration and galling
  - excess friction between mating, moving parts
- excess flash from molding
  - similar problems to tolerance issues above
  - flash breaks off, contaminating system with debris
- damaged parts from improper processing and handling
  - cracks from damage propagate under stress leading to failure
  - over-heating changes material properties
  - solvents and residues lead to stress corrosion cracking
  - environmental conditions cause swelling and warping
Constant Failure Rate
Most of the product life is spent in the random failure state, which has a constant failure rate. The reliability with a constant failure rate is predicted using the exponential function. Failures are caused by mechanisms inherent in the design.
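The exponential prediction mentioned above can be sketched in a few lines. With a constant failure rate λ, reliability is R(t) = exp(-λt) and the mean time between failures is 1/λ. The failure rate used below is a made-up example value.

```python
import math


def reliability(failure_rate_per_hour, hours):
    """Probability of surviving `hours` with a constant failure rate."""
    return math.exp(-failure_rate_per_hour * hours)


def mtbf(failure_rate_per_hour):
    """Mean time between failures for a constant failure rate."""
    return 1.0 / failure_rate_per_hour


rate = 1e-4  # assumed example: 1 failure per 10,000 operating hours
print(f"MTBF = {mtbf(rate):.0f} h")
print(f"R(1000 h) = {reliability(rate, 1000.0):.4f}")
# "No memory" property: surviving 1000 more hours is equally likely
# whether the unit is new or has already run 500 hours.
print(f"R(1500)/R(500) = {reliability(rate, 1500.0) / reliability(rate, 500.0):.4f}")
```

Note that reliability at t = MTBF is only about 37%, which is a common source of confusion when MTBF is read as an expected life.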
It should be noted that a preventative maintenance program during the constant failure rate phase can actually reduce reliability by reintroducing infant mortality into the system. The scheduled replacement of parts presents an opportunity for errors in workmanship as well as adding the possibility of failure of the parts themselves.
This has led to the current practice of Reliability Centered Maintenance. RCM determines the maintenance requirements of individual components to replace only those components which actually need replacing while monitoring the condition of all components which are prone to wear and eventual failure. Not all components in a system follow the bathtub curve. Reliability centered maintenance identifies the reliability curve for a component and provides an applicable maintenance strategy to match.
The wear-out phase precedes the end of the component or product life. At this point, the probability of failure increases with time. While the parts have no memory of previous use during the random failure phase, yielding a constant failure rate, when they enter wear out, the cumulative effects of previous use are expressed as a continually increasing failure rate. The normal distribution is often used to model wear out. Weibull may also be used to approximate this and every other period. Scheduled preventative maintenance of replacing parts entering the wear-out phase can improve reliability of the overall system.
Examples of failure mechanisms in wear out are:
- Fatigue – constant cycle of stress wears out material
- Corrosion – steady loss of material over time leads to failure
- Wear – material loss and deformation, especially loss of protective coatings
- Thermal cycling – not only fatigue, but change in chemical properties, alloyed metals can migrate to grain boundaries, changing properties
- Radiation – Ultraviolet, X-ray, nuclear bombardment in environment changes molecular structure of materials
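The three phases described above can be illustrated with the Weibull hazard (failure rate) function, h(t) = (β/η)(t/η)^(β-1). A shape parameter β below 1 gives a falling rate (infant mortality), β equal to 1 gives a constant rate (the exponential case), and β above 1 gives a rising rate (wear-out). The parameter values below are assumptions chosen only to show the three shapes.

```python
def weibull_hazard(t, beta, eta):
    """Weibull failure rate at time t > 0 (beta: shape, eta: scale)."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


# Assumed shape values, one per region of the bathtub curve.
phases = {"infant mortality": 0.5, "constant rate": 1.0, "wear-out": 3.0}
for name, beta in phases.items():
    early = weibull_hazard(10.0, beta, eta=1000.0)
    late = weibull_hazard(900.0, beta, eta=1000.0)
    print(f"{name:16s} beta={beta}: h(10)={early:.6f}  h(900)={late:.6f}")
```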
Reliability has several definitions, most of them expressing the likelihood of functioning under certain conditions over a given timeframe. Designers and manufacturers often put significant effort into preventing failure. But this raises the question: “why?”
The answer is that reliability is a potential benefit to the user and, through sales and profit, to the supplier. Benefits come in the form of safety, of the product confidently delivering its intent, and of lowered operating costs. However, there are also potential downsides: a higher up-front cost, changed aesthetics, lower performance, etc. And sales may be driven by these more visible attributes. Hence, the net benefit to the supplier of increased investment (to deliver assured high reliability) might be negative without increased sales and/or profit margin. And developing improved reliability could delay product launch and thus lose key marketing opportunities.
Hence, we must recognize that reliability is a competing attribute.
So, how to compete?
Compete on cost by presenting operating $ benefits. Compete on performance by factoring % success. (There’s no point in having a high-performance aircraft that never completes its mission.) Relate cost of warranty returns to direct loss of profits. Factor reliability and product assurance (including more generous warranties) into marketing activities, such as focus groups, to highlight potential gains in sales.
And remember that good reliability is often achieved simply by ensuring good quality. There are many instances where poor supplier and manufacturing quality generate far more warranty returns than “design” reliability issues. And good supplier and manufacturing quality can often be achieved at low cost, and has no negative impact on product performance.
So, to be effective, a reliability engineer should use the tools of marketing, field service, supply chain management, quality and accounting departments (as well as design), and be a positive agent in developing corporate strategy. Without these links, we would miss several opportunities to best optimize products for customers and supplier alike.
Question: Where does the reliability engineer go to learn those tools, beyond forging links with these several departments?
Commercial grade wind turbines generally have a target design life of 20 years. This goal is frequently elusive with key components like the gearbox often showing much shorter life in service.
Continuous condition monitoring and motion control are technologies which are beginning to improve the reliability of wind turbines. Condition monitoring can reduce downtime by giving advance warning of a problem, and in some cases it allows a simpler repair instead of a full replacement.
There are many ways for advanced monitoring and control technologies to improve the reliability of wind turbines. For more details, see the article at:
THEME: Why do this? An abbreviated view of how to do it
A recent Bloomberg Businessweek article (Dec 10-16, 2012), an interview by Josh Tyrangiel with Apple’s CEO Tim Cook, noted a key point in business practice/philosophy: “There are always things unknowable – if we are finding zero issues, our performance bar is in the wrong place.”
WHY THINK THIS WAY? You need to improve – it must be a way of business performance in all areas
- People, knowledge, & technology/information
There is a need to understand your business performance attitude (Change it or Perish)
HOW? (An abbreviated view)
- Understand Value Analysis
- Identify your Competitive Advantage
(Quality, Availability, Flexibility, Cost)
- Tool to employ (System Audit)
Need a champion/advocate
Use a standard (ISO 19011)
How do you compare? For more information or questions refer to firstname.lastname@example.org
Starting with low-cost, low-complexity design alternatives in large engineering programs and then systematically raising the cost and complexity, where warranted, is a far better approach than starting with higher-cost, complex designs and later doing cost-down and simplification activities. I have worked on many product teams where the latter was the norm. Subsystem teams would select costly, more complex technologies, more costly high-precision components and assemblies, and costly manufacturing and control systems, all in an effort to get a jump start on functional performance and time-to-market requirements. Early demonstration of performance, even though the costs were over allocations, was considered perfectly fine, as long as everyone tacitly understood that there would be cost-down and simplification activities at the end of development. Many engineering teams understood that their cost allocations would be waived initially to satisfy time-to-market and performance requirements.
The higher initial cost design approach was quite appealing, as it usually avoided the unwanted attention and pressures from engineering management. Company buyers, in turn, would prepare themselves to put undue amounts of pressure on suppliers to reduce costs for their deliverables. Manufacturing engineering would invest time and resources in higher precision and many processes with secondary operations, again with the idea that both design and manufacturing cost down would come later. Sales and marketing people, in turn, would prepare for a higher priced offering than originally planned. Extra pressure was put on sales teams to push the higher prices along to loyal customers.
The high-cost design approach many times proved difficult in later development stages, as the cost-down would negatively affect performance. This is really something one would like to deal with early on in the design cycle, not at the end. The rationale for the original design choices was sometimes lost or forgotten. The cost engineers tasked with the cost-down and simplification activities were usually not the original design engineers. This in turn created new difficulties, usually for downstream service engineers and manufacturing quality engineers.
When starting with low-cost design alternatives, it becomes imperative to quickly identify a set of robust technologies and robust manufacturing processes that simultaneously satisfy quality, cost, and delivery requirements. In selecting low-cost alternatives, engineers are tasked with exploring the available design space (using flexible fixtures) to identify first a working prototype condition and then directions for improvement without adding cost. Experimental design and parameter design methods have been widely used to find an optimal set of nominal values. The rule of thumb was that if you could get the functions to work just once with the low-cost approach, then you could begin the optimization process without adding cost. There would be many opportunities to capitalize on better combinations of control factors and signal factors. If the optimization efforts fell short, cost could be added incrementally until the trajectory to design maturity improved. Nevertheless, the initial low-cost approach would still end at a better place than starting with high cost and trying to drive the cost and complexity down late in the cycle.
When we introduce a new chip, we plan and execute a comprehensive reliability qualification plan. This plan is based on many different reliability stresses, covering infant mortality rate testing, early life failure rate testing, and long-term life test failure rate prediction, all based on a small population of samples pulled from early production lots.
Because device sample sizes are limited, we try to assign a confidence level to our failure rate predictions using the “industry standard” chi-square adjustment, in the hope that our prediction will be closer to real field failure rates.
This is a “standard” approach in the semiconductor industry because testing very large sample sizes of chips is not economically feasible, especially for small and fabless semiconductor companies.
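One common textbook form of the chi-square adjustment is the upper confidence bound λ_upper = χ²(CL, 2r + 2) / (2T), where r is the number of failures observed and T is the total equivalent device-hours at use conditions (sample size × test hours × acceleration factor). The sketch below assumes that form; the sample size, test duration, and acceleration factor are invented for illustration. It uses only the standard library, relying on the identity between the chi-square distribution with even degrees of freedom and a Poisson sum.

```python
import math


def chi2_cdf_even_df(x, df):
    """Chi-square CDF for even, positive df, via the Poisson-sum identity."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = math.exp(-half)
    total = term
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return 1.0 - total


def chi2_quantile_even_df(p, df):
    """Invert the CDF by bisection (0 < p < 1)."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even_df(hi, df) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even_df(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def failure_rate_upper_bound(failures, device_hours, confidence):
    """Upper bound on the failure rate (per device-hour) at the given confidence."""
    df = 2 * failures + 2
    return chi2_quantile_even_df(confidence, df) / (2.0 * device_hours)


# Assumed example: 77 devices, 1000 h of life test, acceleration factor 50, 0 failures.
equivalent_hours = 77 * 1000 * 50
lam = failure_rate_upper_bound(failures=0, device_hours=equivalent_hours, confidence=0.60)
print(f"Upper-bound failure rate at 60% confidence: {lam * 1e9:.0f} FIT")
```

With zero failures the bound depends only on the device-hours accumulated, which is why a small sample can only rule out gross failure mechanisms.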
IBM Corp.’s Semiconductor Division calls the above practice “finding the tip of the iceberg”: it only indicates whether there are major catastrophic failure mechanisms. IBM and other major semiconductor manufacturers stress large sample sizes in ongoing reliability testing of the outgoing device population.
This approach requires capabilities and facilities for ORT (ongoing reliability testing) of tens of thousands of devices per year. Only major dedicated manufacturers (like Intel, National Semiconductor, Micron, etc.) do this.
Only over the course of 2-3 years of intensive ongoing reliability testing of samples of the outgoing population, combined with field failure information, will one be able to make a reliability assessment and a meaningful prediction for the maturing semiconductor product.
The FMEA method is well known today. Many guides, articles, and standards have been written about it. However, much less has been written about FMEA links: the relationships between FMEAs and other processes. Are these links important? To have a really strong FMEA approach, we should take into consideration the links (outputs/inputs) between FMEAs. In most cases we develop and produce products within a supply chain, where our product is part of a higher-level system or customer application and consists of components co-developed or produced by our suppliers. We are located between the customer domain and the supplier domain. Each domain has its own FMEA (Application FMEA and Supplier FMEAs). The key point is to create interfaces between the domain FMEAs. When we neglect these interfaces, we develop “our product” rather than the “customer product”, and we can fail in the customer application. There are many real examples where a product failed because the customer voice was missing. We should keep this in mind and transfer the Voice of the Customer to suppliers. It is important to set up an FMEA communication platform among all supply chain entities, with the same risk evaluation criteria (Severity, Occurrence, Detection). The objective of such an approach is to identify all risks and their failure cause – failure effect chain from the supplier through us to the customer. This approach gives us a complete view of what happens in our domain if a parameter of a supplier component fails, and what the failure effect on the customer application will be.
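The idea of linking domain FMEAs can be sketched as a small data structure in which the failure effect recorded in one domain is the failure mode analyzed in the next. Everything below is hypothetical: the class, the field names, the example entries, and the ratings are invented for illustration, and the risk priority number (Severity × Occurrence × Detection) is a widely used convention rather than something this post prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FmeaEntry:
    domain: str        # "supplier", "own product", or "customer application"
    failure_mode: str  # what fails in this domain
    effect: str        # resulting effect, i.e. the failure mode in the next domain
    severity: int      # shared rating scale across the supply chain
    occurrence: int
    detection: int

    def rpn(self) -> int:
        """Risk priority number on the shared S/O/D scale."""
        return self.severity * self.occurrence * self.detection


def trace_chain(entries, start_mode):
    """Follow effect -> failure_mode links from the supplier to the customer."""
    by_mode = {e.failure_mode: e for e in entries}
    chain, seen, mode = [], set(), start_mode
    while mode in by_mode and mode not in seen:
        seen.add(mode)  # guard against circular links
        entry = by_mode[mode]
        chain.append(entry)
        mode = entry.effect
    return chain


entries = [
    FmeaEntry("supplier", "seal hardness out of spec", "seal leaks", 6, 3, 4),
    FmeaEntry("own product", "seal leaks", "pump loses pressure", 7, 3, 5),
    FmeaEntry("customer application", "pump loses pressure", "machine stops", 8, 2, 3),
]
for e in trace_chain(entries, "seal hardness out of spec"):
    print(f"{e.domain}: {e.failure_mode} -> {e.effect} (RPN {e.rpn()})")
```

A chain that stops before reaching the customer domain points to a missing interface between two FMEAs.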
Another aspect of a more robust FMEA approach is the links between FMEAs and other company processes within our domain. The benefit of such an approach is a view of FMEAs from various functional perspectives. The following items describe how these processes can strengthen FMEAs and how FMEAs can strengthen other processes.
Requirement Management – to understand what the customer really needs, how the application works, and what the failure effects and their severity on the customer application can be. It is the basic input for FMEA.
Quality Planning – FMEA is part of the quality planning process, and other quality tools and methods depend on it, such as the control plan, measurement system analysis, process capability analysis, verification and validation planning, etc.
Risk Management – FMEA is a source of product and process risks, which have to be evaluated from other risk perspectives such as financial effect, project timing, product portfolio, and technology roadmap.
Supplier Management – FMEA is a good communication platform for discussing component failures and their effects on the customer system. The customer learns from suppliers, and suppliers learn from customers.
Continual Improvement – FMEA is a good source for defining potential product or process improvement projects based on the highest risks.
Reliability Engineering – FMEA is an integral part of product reliability analysis. It helps engineers understand failure mechanisms and is a good source for reliability test planning and post-test failure analysis.
Change Management – When any change in a process or product is planned, we should analyze it with the support of FMEAs.
Problem Solving – FMEA can be a good reference for a team to learn from past failures. New failure events should be added to the FMEA.
Managing all these links is quite a complex task. When we think about these links, our FMEA can bring us more interesting results than before. It will no longer be a separate method but will become an integral part of our company processes. There is a very interesting tool to help a company manage all these links. See www.cognition.us
Function Point Analysis (FPA) has been proven as a reliable method for measuring the size of computer software. First made public by Allan Albrecht of IBM in 1979, the FPA technique quantifies the functions contained within software in terms that are meaningful to the software users. It can be readily applied across a wide range of development environments and throughout the life of a development project, from early requirements definition to full operational use. In addition to measuring output, Function Point Analysis is extremely useful in estimating projects, managing change of scope, measuring productivity, and communicating functional requirements. The Function Point Analysis technique provides an objective, comparative measure that assists in the evaluation, planning, management and control of software production.
The function point measure itself is derived in a number of stages. Using a standardized set of basic criteria, each of the functions is assigned a numeric index according to its type and complexity. These indices are totaled to give an initial measure of size, which is then normalized by incorporating a number of factors relating to the software as a whole. The end result is a single number called the Function Point index, which measures the size and complexity of the software product.
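The stages above can be sketched as follows. The weights and the adjustment formula (0.65 + 0.01 × the sum of 14 general system characteristic ratings, each 0 to 5) follow the commonly published IFPUG conventions; the function counts and ratings are hypothetical, and a real count should follow the current IFPUG counting manual.

```python
# Commonly published IFPUG weights by function type and complexity.
WEIGHTS = {
    "EI":  {"low": 3, "average": 4,  "high": 6},   # external inputs
    "EO":  {"low": 4, "average": 5,  "high": 7},   # external outputs
    "EQ":  {"low": 3, "average": 4,  "high": 6},   # external inquiries
    "ILF": {"low": 7, "average": 10, "high": 15},  # internal logical files
    "EIF": {"low": 5, "average": 7,  "high": 10},  # external interface files
}


def unadjusted_fp(counts):
    """Total the weighted counts; keys are (function type, complexity)."""
    return sum(WEIGHTS[ftype][cplx] * n for (ftype, cplx), n in counts.items())


def value_adjustment_factor(gsc_ratings):
    """Normalizing factor from 14 general system characteristics rated 0-5."""
    if len(gsc_ratings) != 14 or any(not 0 <= r <= 5 for r in gsc_ratings):
        raise ValueError("expected 14 ratings, each between 0 and 5")
    return 0.65 + 0.01 * sum(gsc_ratings)


def function_point_index(counts, gsc_ratings):
    return unadjusted_fp(counts) * value_adjustment_factor(gsc_ratings)


# Hypothetical count for a small application.
counts = {("EI", "average"): 6, ("EO", "low"): 4, ("EQ", "low"): 3,
          ("ILF", "average"): 2, ("EIF", "low"): 1}
ratings = [3] * 14
print(f"Unadjusted FP: {unadjusted_fp(counts)}")
print(f"Function Point index: {function_point_index(counts, ratings):.1f}")
```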
There are many benefits in using Function Point Analysis:
- Function Points can be used to communicate more effectively with user groups.
- Function Points can be used to reduce overtime.
- Function Points can be used to establish an inventory of all transactions and files of a current project or application. This inventory can be used as a means of financial evaluation of an application. If an inventory is conducted for a development project or enhancement project, then this same inventory can be used to help manage scope creep and to help control project growth. Even more important, this inventory helps in understanding the magnitude of the problem.
- Function Points can be used to size software applications. Sizing is an important component in determining productivity (outputs/inputs), predicting effort, understanding unit cost, and so on.
- Unlike some other software metrics, different people can count function points at different times, to obtain the same measure within a reasonable margin of error. That is, the same conclusion will be drawn from the results.
- FPA can help organizations understand the unit cost of a software application or project. Once unit cost is understood, tools, languages, and platforms can be compared quantitatively instead of subjectively.
Further information can be found at http://www.ifpug.org
How often have you cut a board, a piece of material, or even wrapping paper, and guess what? You come up short. Agh! Well, there may be a few factors behind not measuring the material effectively. It can be any of the following: 1) Your measurement system or rule is not accurate enough for the material you are measuring. 2) You need to measure in millimeters and not inches. 3) You don’t have repeatable measurements, and have too much variance in your system.
If number 3, repeatability, is the issue, we would suggest a Gage R&R study, and an Ops Ala Carte consultant can help you here.
As for the second cause: in a recent supplier quality issue, the manufacturing tech measured a high-tech German cable for medical applications and found that a couple of the pigtail wire lengths were out of spec!
We’ll this is an expensive error found, as the whole cable assembly would have to be shipped back to Germany to
Instead, I measured the cable assembly’s pigtails a second time, in mm, and guess what? They were all within the spec of +/- 1 mm.
So, measure twice, at least. It will save you money and time, and your company will be more productive with fewer rejects.
Greg Swartz, CQE