TOP EVENTS: HALT and HASS Labs Open House AND OALC Reliability Symposium

Winter-Spring 2007

Ops A La Carte LLC
(408) 472-3889
Fax: (408) 255-5789

Theme this newsletter:

COURSES and SEMINARS - Both public and in-house courses
EVENTS - Workshops and Symposiums
SPECIAL OFFERS - Free Admission to CMSE, Mar 12-15, Los Angeles
NEWS - Ops A La Carte in the news
FEATURED SERVICE - Reliability Prediction
PROBLEM SOLVER - Solve and you can win a prize
ADVERTISEMENTS - Advertise here and watch your business grow
JOB OPENINGS - Looking for a job? Check our new section here!


Certified Quality Engineer (CQE) Preparation Course (pdf) - Apr 3-May 22, 2007
To register, email us at
For details see the Ops A La Carte Schedule


2007 Reliability Symposium (click here for more details)
We will be holding our 2007 Reliability Symposium the week of May 7-11 and we are putting on the following seminars:

Design for Reliability (DfR), May 7-8
Design of Experiments (DoE), May 9-10
Design for Manufacturability (DfM), May 9
Design for Warranty Cost Reduction (DfW), May 10
Best Accelerated Reliability Test Methods: HALT, ALT, and RDT, May 7-8
Design for Climatic Conditions, May 9-10
Software Design for Reliability, May 11

Inquire if interested at

All our seminars, whether or not they are offered during the symposium, are also available as tailored in-house seminars. To view a list of all our seminars, go to Ops A La Carte Course List

For details see the Ops A La Carte Schedule


Conference on Components for Military and Space Electronics (CMSE) - March 12-15, 2007, L.A., CA

The annual CMSE is happening in March in Los Angeles. We will have a booth at the show and will be one of the presenters. If you are planning to go, please check out our booth.

Best Practices of Failure Analysis - March 27-28, 2007 - Milpitas, CA

Ops A La Carte partner DfR Solutions, through PTI Seminars, will present a two-day seminar focused on "Best Practices of Failure Analysis." This seminar will give reliability engineers and management a foundation through a comprehensive review of best practices in engineering and reliability assurance, with case studies that offer guidance on the practices most appropriate for a given design, use environment, desired lifetime, and available resources.

Open House for HALT and HASS Labs - April 12, 2007 - Santa Clara, CA

Ops A La Carte's test lab, HALT and HASS Labs, will hold an open house on April 12th from 11:30am to 6pm, featuring a few great presentations on HALT, ALT, HASS, and Traditional Reliability Testing. We will be giving tours of our new vibration machine and our new temp/humidity chamber. We will also have some great food and a few prizes and surprises. Come over and meet our team. Click on the attached flyer for more details. Open House Flyer

Address: 990 Richard Ave, Suite 101, Santa Clara, CA 95050

For details on any of these events, please email us at

SPECIAL OFFERS

Free 1 hour of Software Reliability Consultation to the first 3 people who respond to this offer. Our Software Reliability program director will be in town on March 22-23. The first 3 people who respond can set up a meeting to discuss their Software Reliability program and find out if they are on the right path. Often, just a little bit of advice is all that is needed to put a program back on course. If interested, please reply at

Free Admission (a $1195 value) to the seminar of your choice during our 2007 Reliability Symposium to the first individual who refers a new consultant to us. We are growing rapidly and are looking for the best technical operations consultants out there. If you know of anyone, please pass their name to us at


- February 28, 2007
Ops A La Carte lands major project in Singapore. As part of our effort to expand into Asia, we landed a large project with a major Singapore firm. Several other projects are currently in discussion.

- February 15, 2007
HALT and HASS Labs adds two more pieces of Reliability Test equipment to the lab. In addition to our two HALT chambers, we now have an electrodynamic shaker capable of two-axis sine and random vibration, and we also have a Combined Temperature/Humidity chamber. Both additions add versatility to the types of reliability tests we can perform.

- February 1, 2007
Ops A La Carte gets a tutorial spot at the annual Conference on Components for Military and Space Electronics (CMSE) in March. This will be our first year presenting. This year, the event will be held in Los Angeles. We will also be exhibiting at the conference and we will have free tickets to give away. See the Special Offers section as well as the Problem Solver section for details on how to win. Go to the following link for more details on this show: CMSE Link.

- January 26, 2007
Ops A La Carte gets two papers accepted at the annual Applied Reliability Symposium in June. For the third year in a row, Ops has a paper accepted at the annual Applied Reliability Symposium. This year, the event will be held in San Diego. The two papers we will be presenting are Fred Schenkelberg's "Trapped by MTBF - A Study of Alternate Reliability Metrics" and Doug Farel's "Using Competitive Analysis to get the Competitive Advantage". We will also be exhibiting at this symposium and we will have free tickets to give away. Stay tuned for details of give-aways in our next newsletter edition.

- December 15, 2006
Ops A La Carte performs first project with its new partner, the Asian branch of SGS. After signing an agreement with SGS, Ops and SGS worked on their first project together in Taiwan. Since then, we have worked on several more in Taiwan, marking our entrance into the Asian market.

Presentations for some of the above events are available for download on our Resources Page.

For more information on news, please visit our News Page or call (408) 472-3889.

Reliability Prediction Updates

Over the past few months, the two most popular Reliability Prediction guidelines were updated, and these updates added accuracy and versatility to the models. MIL-HDBK-217 was upgraded to 217Plus™ (this was the first major update in almost two decades) and Telcordia SR-332 was updated from Issue 1 to Issue 2 (this was the first major update in almost a decade).

217Plus Replaces MIL-HDBK-217 - Finally!

217Plus is the latest reliability prediction methodology available from the Department of Defense's Center of Excellence in reliability and is intended to replace MIL-HDBK-217. DoD-funded, DTIC-sponsored, and RiAC-developed, 217Plus is based on the electronic failure rate data contained in the RiAC (formerly RAC) databases as of September 2005. Because the RiAC is funded to continuously update this data, users of 217Plus will benefit from technology updates and current failure rate experience.

The 217Plus methodology uses failure rate equations in a similar fashion to MIL-HDBK-217. However, 217Plus has updated the failure rate equations and integrated additional system-level effects to overcome many of the limitations inherent in the outdated MIL-HDBK-217. Specific upgrades in the 217Plus methodology include:

Process Grading: Process grading is used to factor in the robustness of the various processes used to design and build the item. Separate process grading factors are applied to design, manufacturing, part quality, and system management processes, as well as others. Process grading is intended to model non-component failure causes and system-level effects by assessing the various processes used to build the item and applying a grading factor at the item or system level.

Use of Predecessor System Test and Field Data encouraged: Failure rate experience from a similar or predecessor item can be integrated into the calculated prediction to better determine new item reliability.

Includes Software Reliability Prediction factors: Software can now be factored into the final predicted value.

Includes non-operating and duty cycle effects: The non-operating and duty cycle contributions of each part are factored into the failure rate model equations.

Allows for detailed environmental condition effects: Environmental conditions such as temperature cycling are factored into the failure rate model equations.

Models based on latest failure rate data: Failure rate models are derived from the latest failure rate data and factor in the reliability growth in component technology since MIL-HDBK-217 was last updated.
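To make the duty cycle idea in the list above concrete, here is a minimal sketch of a time-weighted failure rate. It is a simplified illustration only and is not the 217Plus model equation; the function name, the linear weighting, and the example values are all assumptions chosen for illustration.

```python
def effective_failure_rate(lambda_operating, lambda_nonoperating, duty_cycle):
    """Time-weighted failure rate for a part that is powered only part of the time.

    Simplified illustration (assumption): a plain linear weighting of the
    operating and non-operating rates, not the 217Plus equations.
    """
    if not 0.0 <= duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be between 0 and 1")
    return duty_cycle * lambda_operating + (1.0 - duty_cycle) * lambda_nonoperating


# Hypothetical part: 100 FITs while operating, 10 FITs while dormant,
# powered 25% of the time.
rate = effective_failure_rate(100.0, 10.0, 0.25)
print(f"Effective failure rate: {rate:.1f} FITs")
```

The point of the sketch is simply that a part that spends most of its life unpowered should not be charged its full operating failure rate.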

217Plus will almost certainly be a blessing to those who use MIL-HDBK-217 to calculate the reliability of commercial products, as well as to those who stopped using that standard because it became outdated over the past 10-15 years. Because MIL-HDBK-217 is so outdated, it calculates an unrealistically pessimistic failure rate for modern commercial components. The 217Plus methodology takes into account the reliability growth in modern commercial components since MIL-HDBK-217 was last updated, so predictions redone with 217Plus should yield a more optimistic and more realistic failure rate.

New MIL-HDBK-217 Prediction Services from Ops A La Carte

Ops A La Carte has been performing MIL-HDBK-217 reliability predictions for many years (see our Reliability Predictions link for more details), and now, we are pleased to offer you predictions using this new methodology.

We can also perform a MIL-HDBK-217 to 217Plus upgrade service. And it doesn't matter what prediction software you used - ReliaSoft, Relex, Isograph, and others - we can use them all. Our experience has shown 217Plus will give you a much more realistic and usually greatly improved MTBF figure.

Telcordia SR332 Has Been Upgraded - What's New in Issue 2?

Reliability predictions are an important element in the process of selecting equipment. These predictions provide necessary input to system-level reliability models for predicting expected downtime per year and system availability.
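As a rough illustration of how a predicted MTBF feeds a system-level model, the sketch below uses the standard steady-state availability relation, availability = MTBF / (MTBF + MTTR), and converts it to expected downtime per year. The MTBF and MTTR values are hypothetical, and a real system model would also account for redundancy and other effects.

```python
HOURS_PER_YEAR = 8760.0


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability of a single repairable item."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(mtbf_hours, mttr_hours):
    """Expected downtime per year implied by the availability."""
    return (1.0 - availability(mtbf_hours, mttr_hours)) * HOURS_PER_YEAR


# Hypothetical inputs: a predicted MTBF of 50,000 hours and a 4-hour repair time.
mtbf = 50_000.0
mttr = 4.0
a = availability(mtbf, mttr)
downtime = downtime_hours_per_year(mtbf, mttr)
print(f"Availability: {a:.6f}")
print(f"Expected downtime: {downtime * 60:.0f} minutes per year")
```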

The Telcordia Reliability Prediction Procedure SR-332 has a long and distinguished history of use within and outside the telecommunications industry. It provides the only hardware reliability prediction procedure developed from the input and participation of a cross-section of major industrial companies. This lends the procedure and the predictions derived from it a high level of credibility free from the bias of any individual supplier or service provider.


Telcordia SR-332 was originally released under the old Bellcore documentation system in the late 80's/early 90's to give people an alternative to MIL-HDBK-217. By that time, MIL-HDBK-217 was already starting to become a bit obsolete, and it certainly did not address the needs of many commercial companies that had much more benign environments than those addressed by MIL-HDBK-217. The document started out as Bellcore TR-332 (Technical Reference) and was later changed to GR-332 (Generic Requirements). Through the 90's, Bellcore revised the document 6 times, and finally, when Bellcore changed its name to Telcordia, it re-released the document under a different part number system - Telcordia SR-332, Issue 1 (released as a Special Report, this became rev 7 of the original document).

Over the past 5-7 years, engineers have been complaining that even this document was starting to become out of date. There are many new families of components that have come about in the past decade and for a prediction document to be accurate, it must keep up with the times.

Finally, Telcordia released its latest version of this document last fall, Telcordia SR-332 Issue 2.

What's New in Issue 2?

1) This issue addresses many of these problems. It includes revised tables of generic device failure rates based on new data. For several devices, the range of complexity covered by the procedures has been extended (especially memory and microprocessors). In addition, over 40 new devices have been added (primarily for fiber optic communication devices).

2) Another big area of improvement is a new section providing techniques to calculate the upper confidence level of a failure rate prediction. Issue 1 only provided techniques for calculating the 90% upper confidence level. In addition, those techniques were overly conservative. The new techniques provide additional precision in the estimate, allow calculation at any desired level of confidence, and show how to add together failure rates with different confidence levels. This is something that no other prediction guideline has done to date.

New Telcordia Prediction Services from Ops A La Carte

Ops A La Carte has been performing Telcordia reliability predictions for many years (see our Reliability Predictions link for more details), and now, we are pleased to offer you predictions using this new methodology, and we can incorporate the new families and ranges of components.

We can also perform a Telcordia SR-332 Issue 1 to Issue 2 upgrade service.

And it doesn't matter what prediction software you originally used - ReliaSoft, Relex, Isograph, and others - we can use them all. Our experience has shown Telcordia SR-332 Issue 2 will give you a much more realistic and usually greatly improved MTBF figure.

Finally, we can incorporate the standard deviation numbers for each component failure rate to provide a range in the final answer. In the past, the final MTBF number was not very believable because all the inaccuracies of the prediction were additive, thus giving an unrealistic end result. Now, with standard deviations given with the failure rate for each component, we can give you lower and upper confidence bounds on the final MTBF numbers. This is a great addition to Issue 2.
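The sketch below shows one common way to turn per-component failure rates and standard deviations into a range on the final MTBF. It is an illustration under stated assumptions, not the SR-332 Issue 2 procedure itself: the components are assumed independent (so variances add), the bounds use a simple normal approximation, and the component values are hypothetical.

```python
import math


def combine(parts):
    """Sum failure rates; combine standard deviations by root-sum-square.

    parts: list of (failure_rate_fits, std_dev_fits) tuples.
    Assumes the component failure rates are statistically independent.
    """
    total_rate = sum(rate for rate, _ in parts)
    total_std = math.sqrt(sum(std ** 2 for _, std in parts))
    return total_rate, total_std


def mtbf_hours(fits):
    """Convert a failure rate in FITs (failures per 1e9 hours) to MTBF in hours."""
    return 1e9 / fits


# Hypothetical components: (failure rate in FITs, standard deviation in FITs).
parts = [(300.0, 40.0), (200.0, 30.0)]
total_rate, total_std = combine(parts)

# Normal-approximation bounds (assumption); z = 1.645 gives a two-sided 90% interval.
z = 1.645
rate_low = total_rate - z * total_std
rate_high = total_rate + z * total_std

print(f"Total failure rate: {total_rate:.0f} FITs, std dev {total_std:.0f} FITs")
print(f"MTBF range: {mtbf_hours(rate_high):,.0f} to {mtbf_hours(rate_low):,.0f} hours")
```

Note that the combined standard deviation grows more slowly than the sum of the individual standard deviations, which is why the range on the total is tighter than a worst-case stack-up would suggest.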


If I am performing a reliability prediction and have the following 3 assemblies with their respective failure rates and standard deviations (assume quantity of 1 for each),

a) what is the Total Failure Rate and Standard Deviation for this portion of the circuit?

b) how is it that the fan has a lower failure rate than either the microprocessor or the memory when we know fans to have a higher failure rate in the field?


This problem illustrates one of the new uses for the Telcordia Prediction guide SR-332 Issue 2. It contains standard deviations for all failure rate numbers in an effort to give bounds around the final result. In the past, many people saw this prediction standard as either inaccurate or outdated. This new version helps to address both of these problems.

Send Responses to:

You can email us at The first individual that emails us the correct solution shall receive Free Admission to the seminar of your choice during our 2007 Reliability Symposium. This is a $1195 value.

Solution to Last Quarter's Problem of the Month on Software Reliability:


1) A software defect is a logic error within the static source code, i.e., a defect can exist regardless of whether the software is executed.

2) A software fault is the run-time triggering of a logic error by executing the source code associated with a defect, i.e., a fault exists only at run-time when the associated code is triggered.

3) A software failure is a problem with the run-time software that is visible to a customer or connecting system(s), i.e., a failure is the visible aspect of one or more faults.

Based on these definitions, create an example to explain the following scenario: Two identical systems at different customer sites trigger the same software fault with the same input data. However, only one of the two systems experiences a software failure. How is this possible?


Several solutions were submitted. Below is a summary of the solution provided by Archana Pawse. Since Archana's solution was correct, she was offered free admission to the Reliability, Availability and Maintainability Symposium (RAMS) that was held in Orlando, FL, from January 22-25, 2007.

Summary of Archana Pawse's submitted Solution

Consider two software functions that can be invoked by a customer, function in_C and function in_D; let's assume these two functions are triggered by different user input screens. Both functions share a common source code module "mod_B" which outputs two values: val_F1 (for "Field 1") and val_F2 (for "Field 2"). Function in_C only uses the module mod_B output val_F1 for processing. Function in_D only uses the module mod_B output val_F2 for processing. When module mod_B is invoked, a fault will be triggered which always generates an invalid output value for val_F1 but generates a value for val_F2 that is valid.

Customer Alpha invokes function in_C on system A. Customer Beta invokes function in_D on system B. Since both systems invoke module mod_B, the same software fault is triggered. On system B, function in_D will only process the value for val_F2, which is valid; there is no failure that results from the fault in module mod_B. On system A, function in_C will only process the value for val_F1, which is invalid. This results in a failure from the software fault in module mod_B.
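The scenario can be sketched in a few lines of code. The names mod_B, in_C, in_D, val_F1, and val_F2 come from the solution above; the specific values and the validity rule (non-negative means valid) are hypothetical, added only so the sketch runs.

```python
def is_valid(value):
    """Hypothetical validity rule for this sketch: negative values are invalid."""
    return value >= 0


def mod_B():
    """Shared module containing the defect.

    Every call triggers the fault: val_F1 comes back invalid,
    while val_F2 is still valid.
    """
    val_F1 = -1   # invalid output caused by the defect
    val_F2 = 42   # valid output
    return val_F1, val_F2


def in_C():
    """Function used on system A: only processes val_F1."""
    val_F1, _ = mod_B()
    return "OK" if is_valid(val_F1) else "FAILURE"


def in_D():
    """Function used on system B: only processes val_F2."""
    _, val_F2 = mod_B()
    return "OK" if is_valid(val_F2) else "FAILURE"


# Both systems trigger the same fault in mod_B, but only system A sees a failure.
print("System A (Customer Alpha, in_C):", in_C())
print("System B (Customer Beta, in_D):", in_D())
```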

Comments

As Archana correctly assessed, this problem highlights the difference between run-time software faults and failures, which are frequently separated in time. Here are some likely software fault/failure behaviors:

o Multiple software faults are triggered before a failure occurs

o A software fault sets up the necessary conditions for a failure to occur, but the failure is delayed until a specific portion of software is executed or data is processed.

For many software engineers, the difference between software faults and failures demonstrates why debugging efforts can result in chasing symptoms and not the root cause defects. Ops A La Carte offers software reliability training seminars and services for software engineers to help them address these problems using:

o failure analysis techniques to statically detect software faults and failures,

o fault tolerance techniques to mitigate the effects of run-time faults, and

o unit testing techniques aimed at maximizing software module coverage for those sections that will be most frequently used by customers.

Contact us if we can assist your organization in these and other areas of software reliability.

Link Silicon Valley, LLC (LinkSV) is an online networking resource for researching, identifying and contacting the companies and people within the 6,000 plus active & inactive companies in the greater Bay Area. Our company records identify the senior team, board members, financing, key partners and customers. There are many features which allow you to view the information from different angles and "connect all the dots".

LinkSV helps you tap into previously scarce and extraordinarily hard-to-find information on early stage companies and the key people associated with them. This will improve your effectiveness without the hassle and time of trying to do it all on your own. LinkSV is ideally suited to help you quickly identify and leverage your own connections in career search, in identifying new business opportunities, new investors and Board members.

DfR Solutions has world-renowned expertise in applying the science of Reliability Physics to electrical and electronics technologies, and is a leading provider of quality, reliability, and durability (QRD) research and consulting for the electronics industry. The company's integrated use of Physics of Failure (PoF) and Best Practices provides crucial insights and solutions early in product design and development and throughout the product life cycle. DfR Solutions specializes in providing knowledge- and science-based solutions to maximize and accelerate the product integrity assurance activities of their clients in every marketplace for electronic technologies (consumer, industrial, automotive, medical, military, telecom, oil drilling, and throughout the electronic component and material supply chain). For more information visit

Ops A La Carte's newsletter goes out to over 8000 subscribers. If you would like to advertise in next quarter's "Reliability News", email us at or call us at (408) 472-3889.

Reliability Leader

KLA-Tencor: $2.5B company based in the SF Bay Area, global presence in the semiconductor industry, market leader.

Significant opportunity to contribute to net company results, commensurate rewards.

Broad, dynamic, demonstrated leadership skills are key.

Strong technical background, MS or Ph.D. in EE, Physics or other.

Significant and relevant system design engineering / engineering program management experience (10-15 years experience).

Demonstrated expertise in reliability engineering and achieving exceptional results required.

You will own (re)defining and deploying the program company-wide and driving results to breakthrough performance!

Contact: 408-875-7593

Sr. Reliability Engineer Needed!

Reliability engineering position for electromechanical medical devices. This person will work in conjunction with cross-functional product development teams and will lead efforts to improve reliability in new product development and on-market products. Please contact Mike Moriarty at 408-782-3252.

Major Duties and Responsibilities

1. Responsible for reliability analysis, testing, statistical analysis, modeling and prediction throughout the new product development process.

2. Develop and implement HALT/HASS, RDT and ORT protocols. Identify stress limits against product specifications and intended use to ensure design robustness.

3. Analyze reliability test data and generate suitable recommendations regarding design defects and weaknesses and latent product defects.

4. Analyze verification and validation test data and generate suitable recommendations regarding performance and/or limitations of product.

5. Lead risk management cross-functional teams in the development of hazards analysis, fault tree analysis, and failure modes, effects and criticality analysis (FMECA).

6. Review project plans, product requirement documents, software requirements documents and user interface specifications for appropriate reliability inputs.

Position interacts directly with electro-mechanical medical device and software development professionals and leads, program managers, directors of product development, quality assurance managers and engineers. Position influences overall considerations and tradeoffs associated with reliability and quality aspects of medical devices and software. Position may interact with customers in response to resolution of field problems and issues.


BS Engineering (EE preferred). Strong oral and written communication skills. Candidate must have direct experience and demonstrated skills in reliability theory, innovative reliability modeling, reliability analytical tools and testing for new product development. Candidate's experience must be focused on the practical implementation of using reliability tools to drive engineering solutions and they must be able to support their experiences through examples. Experience in medical device regulated product development a plus. Minimum of 9 years experience, with 5 years in engineering discipline. 1-3 years of project and/or department budget management experience preferred.

TESLA MOTORS: Reliability Engineer Needed!

About Tesla Motors

Tesla Motors is a small, aggressive, well-funded start-up in the San Francisco Bay Area that is making advanced electric vehicles, batteries, and drive systems a reality and working to change the future of the transportation industry. We are building a technically strong, fast-moving team that prides itself on superior execution. Please contact Ninh Le at 650-413-4045.

Our Mission

Tesla Motors designs and sells high-performance, highly efficient electric sports cars, with no compromises. Tesla Motors cars combine style, acceleration, and handling with advanced technologies that make them among the quickest and the most energy-efficient cars on the road.

Position Description and Objectives

1. The Reliability Engineering group is responsible for facilitating and ensuring the quality of the company's products. These responsibilities are accomplished through development, documentation, and deployment of the business processes which comprise the company's Quality Management System.

2. The successful candidate will manage, develop, and direct the activities associated with the reliability engineering function.

3. The reliability group originates and develops analysis methods for determining reliability of components, equipment, and processes. Acquires data and analyzes the data. Prepares diagrams, charts, drawings, calculations, and reports for defining reliability problems and makes recommendations for improvements. Conducts an analysis of reliability problems and investigates to determine the reliability required for the particular situation considering the cost limitations for equipment up/down time, repair/replacement costs, weight, size, and availability of materials/equipment.

4. Determines the cost advantages of alternatives for developing action plans to comply with internal/external customer demands for reliable processes/equipment to avoid failures.

5. The successful candidate must have the ability to meet tight deadlines, have strong interpersonal and communication skills, be detail oriented, and have the ability to work independently. The ability to establish and maintain a professional working relationship with external and internal teams at all levels across the company is a must.


4-year technical degree (BSME or BSEE preferred) and 5-8 years of quality systems and/or reliability experience. Experience with QS-9000, ISO/TS 16949, APQP, PPAP, FMEA, and DoE required. Applicants must possess strong written and verbal communication skills across all levels of the organization. Six Sigma Black Belt certification from a recognized Six Sigma organization a plus. Domestic and international travel required.

Senior Reliability Consultants Needed!

Ops A La Carte is looking for Senior Reliability Consultants around the world with their own consulting practice to join our team of consultants and work on some of the most exciting and challenging projects in the industry.


- Set your own hours
- Control your own future
- Work on fascinating projects in new industries
- Travel as little/much as you'd like
- Be looked upon as an expert
- Work with the best consultants in the industry
- Run your own business
- Eligible for free seminars and symposia
- Freedom to work on the projects you want

If interested, please email or call (408) 472-3889.

Ops A La Carte's newsletter goes out to over 8000 subscribers. If you would like to put a job opening in next quarter's "Reliability News", email us at or call us at (408) 472-3889.


ReliabilityNews is a service of Ops A La Carte. To subscribe yourself or a friend, send an email
with the word SUBSCRIBE in the subject line to
To remove yourself from this email, send an email with the word REMOVE in the subject line.
© Copyright 2005 Ops A La Carte, LLC. All rights reserved.