Using FMEA to Reduce Risk in a NASA Program

Due to the sensitive nature of many NASA programs, the client name and design details are not provided in this case study.

This case involves FMEA training that was funded by a Massachusetts Workforce Development grant. Quality Support Group (QSG) wrote and secured the grant for the client and engaged me to provide the training. The client in this case specializes in the development and production of instruments and assemblies for space and other airborne applications. They had been engaged by a major customer to develop a subsystem for an upcoming NASA program. That customer required that Failure Mode & Effects Analysis (FMEA) be deployed as a part of the project.

Major potential risks had to be identified and prevented to assure not only a robust design that could survive the stress and vibration of launch, but also long-term reliability and consistent performance over an extended period in space. The client reached out to Quality Support Group for assistance and, as noted above, QSG secured funding for the training through a state Workforce Development grant and assigned the project to me.

I planned the project and training as follows:

Heavy emphasis on preparation.
Several discussions with the client’s program manager prior to the training to get a solid understanding of their project, their needs, and who would be attending the training.
These preliminary steps are critical to assure a maximum return on the time the client invests in the training.

The solution in this case involved the delivery of my standard two-day FMEA workshop in January 2019. This included three rounds of applied Design and Process FMEA by three cross-functional teams of the client’s staff who were working on the project. Each “round” consisted of:

Identifying one failure mode that could affect their selected design or process.
Listing effects, causes, and design or process controls for that failure mode.
Ranking on a 1-10 scale the severity of the effects (S), probability of occurrence of each cause-failure mechanism (O), and the likelihood of detection of each cause-failure mechanism (D).
Calculating the risk priority number (RPN) of each cause-failure mechanism, determined by the product of the severity, occurrence, and detection rankings (S x O x D).
Listing recommended actions for each RPN greater than 100.
For each agreed-upon recommended action, identifying the person or group responsible and a target date for completion (because action plans without responsibilities and target dates aren’t plans!).
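
The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not the worksheet used in the workshop: the FmeaLine class, its field names, and the two sample rows are hypothetical and made up for the example. Only the 1-10 scales, the S x O x D product, and the "RPN greater than 100" action threshold come from the description above.

```python
from dataclasses import dataclass

# Threshold described in the workshop steps: list actions for each RPN > 100.
RPN_ACTION_THRESHOLD = 100


@dataclass
class FmeaLine:
    """One cause-failure mechanism on an FMEA worksheet (hypothetical structure)."""
    failure_mode: str
    cause: str
    severity: int    # S, ranked 1-10
    occurrence: int  # O, ranked 1-10
    detection: int   # D, ranked 1-10

    def __post_init__(self):
        # Reject rankings outside the 1-10 scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # Risk priority number: S x O x D.
        return self.severity * self.occurrence * self.detection

    @property
    def needs_action(self) -> bool:
        # Strictly greater than the threshold, matching "RPN greater than 100".
        return self.rpn > RPN_ACTION_THRESHOLD


# Illustrative rows only; these are NOT from the client's project.
rows = [
    FmeaLine("Connector loosens", "Torque not specified", 9, 4, 5),
    FmeaLine("Label peels", "Adhesive ages", 3, 3, 4),
]

# Print the worksheet with the highest RPN first.
for row in sorted(rows, key=lambda r: r.rpn, reverse=True):
    flag = "list recommended actions" if row.needs_action else "below threshold"
    print(f"{row.failure_mode}: RPN = {row.rpn} ({flag})")
```

The threshold is kept as a named constant because it is a project choice, not part of the RPN formula itself.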

One month later, I returned to the client’s site for a one-day “follow-up” session with the teams. The agenda for this type of follow-up session is very simple:

Morning

Review of progress on agreed-upon actions generated in the Design FMEA applications the previous month.
Two more rounds of Design FMEA (from failure mode to recommended actions) on the selected designs.

Afternoon

Review of progress on agreed-upon actions generated in the Process FMEA applications the previous month.
Two more rounds of Process FMEA (from failure mode to recommended actions) on the selected processes. Although the final design review had not yet occurred, there were still process risks facing the teams in an upcoming pilot production run.

Timeline and Results

Initial conversations were held with the project manager during November and December 2018. The two-day workshop was delivered in January 2019 and the follow-up session occurred one month later. Total timeline was about three months. The client’s teams completed thirteen rounds of Design FMEA and eight rounds of Process FMEA (from failure mode through recommended actions) on the current project. Recall that the client’s customer required that FMEA be used as a condition of the contract.

FMEA and other techniques, however, should never be applied with just a “compliance mentality” (i.e., we’re doing this because it’s required by contract, by customer, by regulation, or by our own quality management system). This too often results in a “check-the-box” approach that does not benefit the project.

In this case, although the initial reason for scheduling the training was that the customer required FMEA for the project, the applications resulted in an improved design with reduced risk, which is the real reason for applying FMEA!

One of the failure modes identified in January ended up with a risk priority number (RPN) greater than 700. If the failure mode were to occur, it would result in mission failure (i.e., loss of primary function). The team agreed on action items to consider a change to the design and to develop better tests and design controls.

One month later, the actions had been taken and documented. In accordance with the FMEA methodology, the team re-calculated the RPN and it came out to 64 – a clearly documented and substantial reduction in risk.
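
To put that change in perspective, the two reported figures give a lower bound on the relative drop in the RPN. This is only a worked calculation from the numbers above: the exact starting RPN is not stated beyond "greater than 700", and because S, O, and D are rankings rather than measurements, the percentage describes the change in the score, not a measured change in failure probability.

```latex
\[
\mathrm{RPN}_{\text{before}} > 700, \quad \mathrm{RPN}_{\text{after}} = 64
\;\Longrightarrow\;
1 - \frac{\mathrm{RPN}_{\text{after}}}{\mathrm{RPN}_{\text{before}}}
\;>\; 1 - \frac{64}{700} \approx 0.909
\]
```

In other words, the re-calculated RPN was more than 90% lower than the original.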

Beyond the impressive reduction in the RPN, the project manager also reported an unexpected and very high level of enthusiasm about FMEA among the engineers and others involved in the project. One team leader noted during the follow-up session that, if they had applied FMEA earlier in the project, he was confident that they would be further ahead of schedule. Therefore, if management supports the timely and effective application of FMEA in future projects, they can reasonably expect a reduction in design and development lead time and costs. Once again, this is the real reason for using FMEA – not merely to “comply.”