Steven Dam : 6/9/22 2:35 PM
“Failure Mode and Effects Analysis (FMEA) and Failure Modes, Effects and Criticality Analysis (FMECA) are methodologies designed to identify potential failure modes for a product or process, assess the risk associated with those failure modes, rank the issues in terms of importance, and identify and carry out corrective actions to address the most serious concerns.”[1]
FMECA is a critical analysis required for ensuring the viability of a system during the operations and support phase of the lifecycle. A major part of FMECA is understanding the failure process and its impact on the operations of the system. Figures 1 and 2 below show an example of how to model a process to include the potential of failure. Duration attributes, Input/Output, Cost, and Resource entities can be added to this model and simulated to begin estimating metrics. In particular, you can use this with real data to understand the values of existing systems or derive the needs of the system (thresholds and objectives) by including this kind of analysis in the overall system modeling.
Figure 1: Vehicle Failure Response
Figure 2: Failure Modes Expanded
Step one is to build these Action Diagrams (for details on how to do this, see the Functional Modeling white paper). We create a loop that periodically triggers the decision on whether a failure occurs. The time between these decisions can be adjusted through the number of loop iterations and the duration of the “F.4 Continue Normal Operations” action.
We adjust the number of iterations by selecting the loop action (“F.1 Continue to operate vehicle?”) and pressing the </>Script button (see Figure 3). A dialog appears asking you to edit the action’s script. The pull-down menu offers Loop Iterations, Custom Script, Probability (Loop), and Resource (Loop). In our case, we select “Loop Iterations” and then type in the number (we chose 200), as seen in Figure 4.
Figure 3: Vehicle Failure Process in Innoslate
Figure 4: Loop Decision Point Script for 100 Iterations
We also want to change the duration of this action (F.1) and of F.4, Continue Normal Operations. Since the loop decision is not a factor in this model, we can give it a nominally small time (1 minute as shown). For “F.4 Continue Normal Operations,” we chose 100 hours. With a 90% branch probability on this path, the loop takes the normal path nine times on average for every failure, which gives roughly 900 operating hours between failures. That is not unusual for a vehicle in a suburban environment. We could provide a more accurate estimate, for example by using a distribution for the normal operating hours.
The 90% branch probability comes from the script for the OR action (“F.2 Failure?”). Opening that action’s script brings up the dialog shown in Figure 5.
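The figure of roughly 900 operating hours between failures can be checked with a few lines of arithmetic. The sketch below assumes each pass through the loop is an independent trial with the 90% normal branch and the 100-hour duration given in the text; the function name is hypothetical and is not part of Innoslate.

```python
def mean_hours_between_failures(p_continue: float, hours_per_pass: float) -> float:
    """Expected normal operating hours accumulated before a failure branch is taken.

    Each loop pass independently takes the normal branch with probability
    p_continue, so the expected number of normal passes before the first
    failure is p_continue / (1 - p_continue).
    """
    if not 0.0 <= p_continue < 1.0:
        raise ValueError("p_continue must be in [0, 1)")
    expected_normal_passes = p_continue / (1.0 - p_continue)
    return expected_normal_passes * hours_per_pass


# Values from the text: 90% normal branch, 100 hours per normal pass.
hours = mean_hours_between_failures(p_continue=0.90, hours_per_pass=100.0)
print(f"{hours:.0f} hours")  # prints "900 hours"
```

Changing either the branch probability or the duration of F.4 shifts this estimate, which is why both are worth calibrating against real data when it is available.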
Figure 5: OR Decision Point Probability Script Dialog
If we assume a failure occurs approximately 10% of the time, the next question is which failure mode occurred. The failure modes are also probabilistic, so the path taken must be selected based on their probabilities. The second OR action (“F.3.1 Failure Mode?”) shows three possible failure modes. You can add more by selecting F.3.1 and using the “+Add Branch” button, for example to represent other failure modes such as “Driver failure,” “Hit obstacle,” “Guidance System Loss,” etc.
To change the default branch names (Yes, No, Option) to the names of the failure modes, click on the branch name and it will appear on the sidebar, as in Figure 6. Then type in the name you prefer.
Figure 6: Changing the Branch Name
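Selecting a branch of the failure-mode OR action amounts to a weighted random choice. The following is a minimal sketch of that idea, not Innoslate's implementation. The branch names match the failure modes in this example, but the probabilities are illustrative assumptions and the function name is hypothetical.

```python
import random
from collections import Counter

# Branch names come from the example model; the probabilities are
# illustrative assumptions, not values taken from the Innoslate model.
FAILURE_MODE_PROBABILITIES = {
    "Flat Tire": 0.55,
    "Out of Gas": 0.40,
    "Engine Failure": 0.05,
}


def pick_failure_mode(rng: random.Random) -> str:
    """Choose one failure-mode branch according to its probability."""
    modes = list(FAILURE_MODE_PROBABILITIES)
    weights = list(FAILURE_MODE_PROBABILITIES.values())
    return rng.choices(modes, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed so the run is repeatable
counts = Counter(pick_failure_mode(rng) for _ in range(10_000))
for mode, n in counts.most_common():
    print(f"{mode}: {n / 10_000:.1%}")
```

Adding a branch in the diagram corresponds to adding an entry to the dictionary, with the probabilities adjusted so they still sum to 1.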
To finish off this model, we added durations to the various other actions that may result from individual failures. The collective times represent the impact of the failure on the driver’s time. Since we do not have any data at this time for how long each of these steps would take, we can estimate them by using Triangular distributions of time (see sidebar in Figure 7).
Figure 7: Adding Time to the Action
This shows an estimate ranging from a minimum of ½ hour to a maximum of 1 hour, with a mean of ¾ of an hour. If we do this for the other actions, we can execute the model to determine the impacts on time.
We can also accumulate costs by adding a related Cost entity to each of the actions. Create an overall cost entity (e.g., “Failure Costs”) and then decompose it into the various repair costs. You can then assign the costs to the actions by using a Hierarchical Comparison matrix. Select the parent process action (“F Vehicle Failure Process”) and use the Open menu to select the Traceability Matrix (at the bottom of the menu). A sidebar asks for the “Target Entity,” which is the “Failure Costs” entity we just created. Select the “Target Relationship” (between actions and costs the only option is “incurs”), then push the blue “Generate” button to obtain the matrix. Select the intersections between the process steps and the costs to create relationships between the actions and the costs. The result is shown in Figure 8.
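To see what a triangular estimate of ½ hour to 1 hour implies, it can be sampled directly. This sketch uses Python's standard `random.triangular` and assumes the peak (most likely value) sits at ¾ of an hour; because the distribution is then symmetric, the peak and the mean coincide. The variable names are illustrative only.

```python
import random
import statistics

# Hours: minimum and maximum from the text; the peak at 0.75 h is an assumption.
LOW_H, MODE_H, HIGH_H = 0.5, 0.75, 1.0

# Analytic mean of a triangular distribution: (low + mode + high) / 3
analytic_mean = (LOW_H + MODE_H + HIGH_H) / 3

rng = random.Random(7)  # fixed seed so the run is repeatable
# Note the argument order of random.triangular: (low, high, mode)
samples = [rng.triangular(LOW_H, HIGH_H, MODE_H) for _ in range(50_000)]

print(f"analytic mean: {analytic_mean:.3f} h")
print(f"sample mean:   {statistics.fmean(samples):.3f} h")
print(f"sample range:  {min(samples):.3f} h to {max(samples):.3f} h")
```

If the most likely duration were closer to one end of the range, the mean would shift away from the peak, which is worth keeping in mind when replacing these estimates with real data.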
Figure 8: Traceability Matrix for Assigned Costs
If you have not already added the values of the costs, you can do it from this matrix. Just select one of the cost entities and its attributes show up on the sidebar (see below).
Figure 9: Adding Cost Values to Matrix
Note how we can add distributions here as well.
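The cost decomposition and the “incurs” relationships can be pictured as a small data structure. The sketch below is an illustration under stated assumptions: the `Cost` class, the child cost names, the repair action names, and all dollar amounts are hypothetical, and fixed amounts are used here instead of distributions to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cost:
    """A cost entity that may be decomposed into child costs."""
    name: str
    amount: float = 0.0
    children: list[Cost] = field(default_factory=list)

    def total(self) -> float:
        """Own amount plus the rolled-up total of all children."""
        return self.amount + sum(child.total() for child in self.children)


# Hypothetical decomposition of the overall "Failure Costs" entity.
tire = Cost("Tire repair", 150.0)
fuel = Cost("Fuel delivery", 75.0)
engine = Cost("Engine repair", 2500.0)
failure_costs = Cost("Failure Costs", children=[tire, fuel, engine])

# Each selected matrix intersection becomes one action -> cost "incurs" link.
incurs: dict[str, list[Cost]] = {
    "Repair flat tire": [tire],
    "Refuel vehicle": [fuel],
    "Repair engine": [engine],
}


def cost_of(action: str) -> float:
    """Total cost incurred by a single action (0 if it has no links)."""
    return sum(cost.total() for cost in incurs.get(action, []))


for action in incurs:
    print(f"{action}: ${cost_of(action):,.2f}")
print(f"{failure_costs.name} total: ${failure_costs.total():,.2f}")  # $2,725.00
```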
Finally, we want to see the results of the model. We do this by executing it using the discrete event and Monte Carlo simulators. To access them, select “Simulate” from the Action Diagram for the main process (“F Vehicle Failure Process”). You can see the results of a single discrete event simulation in Figure 10. The gray boxes in the “Action Trace 3D” window mark actions that were never executed. Here they represent the rarer failure mode of an engine failure (we assume that you change your oil regularly, or this would occur much more often). The Resource Radar window likewise shows only two failure modes (Flat Tire and Out of Gas), indicating that no engine failures occurred.
Figure 10: A Single Discrete Event Simulation Run
To see the impact of many executions, we use the Monte Carlo simulator. The results of this simulation for 1000 runs are shown in Figure 11.
Figure 11: Monte Carlo Calculations for 1000 Iterations
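The logic behind a Monte Carlo run of this process can be sketched outside the tool. The code below is a simplified stand-in, not Innoslate's simulator. The 10% failure probability, 100-hour normal operation, and 1-minute loop decision come from the text; the loop count, failure-mode probabilities, downtime ranges, and cost ranges are illustrative assumptions, so the printed numbers will not match Figure 11.

```python
import random
import statistics

LOOP_ITERATIONS = 100        # assumed loop count; set to match your loop script
P_FAILURE = 0.10             # failure branch probability from the text
NORMAL_OPERATION_H = 100.0   # duration of "F.4 Continue Normal Operations"
LOOP_DECISION_H = 1 / 60     # nominal 1-minute loop decision

# name: (probability, (low, mode, high) downtime hours, (low, mode, high) cost $)
# All values in this table are hypothetical.
FAILURE_MODES = {
    "Flat Tire": (0.55, (0.5, 0.75, 1.0), (20.0, 100.0, 250.0)),
    "Out of Gas": (0.40, (0.5, 1.0, 2.0), (30.0, 60.0, 120.0)),
    "Engine Failure": (0.05, (4.0, 8.0, 24.0), (500.0, 1500.0, 4000.0)),
}
MODE_NAMES = list(FAILURE_MODES)
MODE_WEIGHTS = [FAILURE_MODES[name][0] for name in MODE_NAMES]


def run_once(rng: random.Random) -> tuple[float, float]:
    """One pass through the whole process; returns (elapsed hours, total cost)."""
    hours = 0.0
    cost = 0.0
    for _ in range(LOOP_ITERATIONS):
        hours += LOOP_DECISION_H
        if rng.random() < P_FAILURE:
            mode = rng.choices(MODE_NAMES, weights=MODE_WEIGHTS, k=1)[0]
            _, (t_low, t_mode, t_high), (c_low, c_mode, c_high) = FAILURE_MODES[mode]
            hours += rng.triangular(t_low, t_high, t_mode)  # repair downtime
            cost += rng.triangular(c_low, c_high, c_mode)   # repair cost
        else:
            hours += NORMAL_OPERATION_H
    return hours, cost


def monte_carlo(runs: int, seed: int = 1) -> list[tuple[float, float]]:
    """Repeat the single run many times with one seeded generator."""
    rng = random.Random(seed)
    return [run_once(rng) for _ in range(runs)]


results = monte_carlo(runs=1000)
all_hours = [h for h, _ in results]
all_costs = [c for _, c in results]
print(f"mean elapsed time: {statistics.fmean(all_hours):,.0f} h")
print(f"mean cost:         ${statistics.fmean(all_costs):,.2f}")
print(f"max cost:          ${max(all_costs):,.2f}")
```

Calling `run_once` on its own corresponds to a single discrete event run, while `monte_carlo` gathers the spread across many runs, which is where the gap between the average and the worst case becomes visible.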
As a result, you can see that over about a year of operation, the owner of this vehicle can expect to spend an average of a little over $3,956. However, the owner could spend more than $9,000 over two years!
For a more detailed analysis, you can click the download button in each window for reports that detail these runs (see Figure 12).
Figure 12: A Variety of Reports Are Available From Each Window
This is a simple problem, but you can analyze much more complex problems using this approach. FMECA is a very important part of systems engineering. The key is to build in the failure modes as you develop your model!
[1] From http://www.weibull.com/hotwire/issue46/relbasics46.htm accessed 1/18/201