
Space product assurance
Failure modes, effects (and criticality) analysis (FMEA/FMECA)
Foreword
This Standard is one of the series of ECSS Standards intended to be applied together for the management, engineering and product assurance in space projects and applications. ECSS is a cooperative effort of the European Space Agency, national space agencies and European industry associations for the purpose of developing and maintaining common standards. Requirements in this Standard are defined in terms of what shall be accomplished, rather than in terms of how to organize and perform the necessary work. This allows existing organizational structures and methods to be applied where they are effective, and for the structures and methods to evolve as necessary without rewriting the standards.
This Standard has been prepared by the ECSS-Q-ST-30-02 Working Group, reviewed by the ECSS Executive Secretariat and approved by the ECSS Technical Authority.
Disclaimer
ECSS does not provide any warranty whatsoever, whether expressed, implied, or statutory, including, but not limited to, any warranty of merchantability or fitness for a particular purpose or any warranty that the contents of the item are error-free. In no respect shall ECSS incur any liability for any damages, including, but not limited to, direct, indirect, special, or consequential damages arising out of, resulting from, or in any way connected to the use of this Standard, whether or not based upon warranty, business agreement, tort, or otherwise; whether or not injury was sustained by persons or property or otherwise; and whether or not loss was sustained from, or arose out of, the results of, the item, or any services that may be provided by ECSS.
Published by: ESA Requirements and Standards Division
ESTEC,
2200 AG Noordwijk
The Netherlands
Copyright: 2009 © by the European Space Agency for the members of ECSS
Change log
| ECSS-Q-30-02A | First issue |
| ECSS-Q-30-02B | Never issued |
| ECSS-Q-ST-30-02C | Second issue |
Introduction
The Failure Mode and Effects Analysis (FMEA) and Failure Mode, Effects, and Criticality Analysis (FMECA) are performed to systematically identify potential failures in:
products (functional and hardware FMEA/FMECA);
or processes (process FMECA)
and to assess their effects in order to define mitigation actions, starting with the highest-priority ones related to failures having the most critical consequences. The failure modes identified through the Failure Mode and Effect Analysis (FMEA) are classified according to the severity of their consequences. The Failure Mode, Effects, and Criticality Analysis (FMECA) is an extension of FMEA, in which the failure modes are classified according to their criticality, i.e. the combined measure of the severity of a failure mode and its probability of occurrence.
The FMEA/FMECA is basically a bottom-up analysis considering each single elementary failure mode and assessing its effects up to the boundary of the product or process under analysis. The FMEA/FMECA methodology is not suited to assessing combinations of failures within a product or a process.
The FMEA/FMECA is an effective tool in the decision-making process, provided it is a timely and iterative activity. Late implementation or restricted application of the FMEA/FMECA dramatically limits its use as an active tool for improving the design or process.
The FMEA/FMECA is initiated as soon as preliminary information is available at high level, and is extended to lower levels as more details become available. The integration of analyses performed at different levels is addressed in a specific clause of this Standard.
The level of the analysis refers to the level at which the failure effects are assessed. In general a FMEA/FMECA need not be performed below the level necessary to identify critical items and requirements for design improvements. Therefore a decision on the most appropriate level is dependent upon the requirements of the individual programme.
The FMEA/FMECA of complex systems is usually performed by using the functional approach followed by the hardware approach when design information on major system blocks becomes available. These preliminary analyses are carried out with no or minor inputs from lower level FMEAs/FMECAs and provide outputs to be passed to lower level analysts. After performing the required lower level FMEAs/FMECAs, their integration leads to the updating and refinement of the system FMEA/FMECA in an iterative manner.
The Software (S/W) is analysed only using the functional approach (functional FMEA/FMECA) at all levels.
The analysis of S/W reactions to Hardware (H/W) failures is the subject of a specific activity, the Hardware-Software Interaction Analysis (HSIA).
When any design or process changes are made, the FMEA/FMECA is updated and the effects of new failure modes introduced by the changes are carefully assessed.
Although the FMEA/FMECA is primarily a reliability task, it provides information and support to safety, maintainability, logistics, test and maintenance planning, and failure detection, isolation and recovery (FDIR) design.
The use of FMEA/FMECA results by several disciplines assures consistency and avoids the proliferation of requirements and the duplication of effort within the same programme.
Scope
This Standard is part of a series of ECSS Standards belonging to the ECSS-Q-ST-30 “Space product assurance - Dependability”.
This Standard defines the principles and requirements to be adhered to with regard to failure modes, effects (and criticality) analysis (FMEA/FMECA) implementations in all elements of space projects in order to meet the mission performance requirements as well as the dependability and safety objectives, taking into account the environmental conditions.
This Standard defines requirements and procedures for performing a FMEA/FMECA.
This Standard applies to all elements of space projects where FMEA/FMECA is part of the dependability programme.
Complex integrated circuits, including Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs), and software are analysed using the functional approach. Software reactions to hardware failures are addressed by the Hardware-Software Interaction Analysis (HSIA).
Human errors are addressed in the process FMECA. Human errors may also be considered in the performance of a functional FMEA/FMECA.
The extent of the effort and the sophistication of the approach used in the FMEA/FMECA depend upon the requirements of a specific programme and should be tailored on a case by case basis.
The approach is determined in accordance with the priorities and ranking afforded to the functions of a design (including operations) by risk analyses performed in accordance with ECSS-M-ST-80, beginning during the conceptual phase and repeated throughout the programme. Areas of greater risk, in accordance with the programme risk policy, should be selectively targeted for detailed analysis. This is addressed in the RAMS and risk management plans.
This Standard may be tailored for the specific characteristics and constraints of a space project in conformance with ECSS-S-ST-00.
Normative references
The following normative documents contain provisions which, through reference in this text, constitute provisions of this ECSS Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this ECSS Standard are encouraged to investigate the possibility of applying the more recent editions of the normative documents indicated below. For undated references, the latest edition of the publication referred to applies.
| ECSS-S-ST-00-01 | ECSS system – Glossary of terms |
| ECSS-E-ST-32-02 | Space engineering – Structural design and verification of pressurized hardware |
| ECSS-Q-ST-10-09 | Space product assurance – Nonconformance control system |
| ECSS-Q-ST-30 | Space product assurance – Dependability |
Terms, definitions and abbreviated terms
Terms from other standards
For the purpose of this Standard, the terms and definitions from ECSS-S-ST-00-01 apply.
For the purpose of this Standard, the following term from ECSS-E-ST-32-02 applies:
leak-before-burst
Terms specific to the present standard
active redundancy
redundancy wherein all means for performing a required function are intended to operate simultaneously
[IEC 60050-191]
area analysis
study of man-product or man-machine interfaces with respect to the area where the work is performed
criticality
combined measure of the severity of a failure mode and its probability of occurrence
end effect
consequence of an assumed item failure mode on the operation, function, or status of the product under investigation and its interfaces
failure cause
presumed causes associated to a given failure mode
failure effect
consequence of an assumed item failure mode on the operation, function, or status of the item
failure propagation
physical or logical event caused by failure within a product which can lead to failure(s) of products outside the boundaries of the product under analysis
failure mode and effects analysis (FMEA)
analysis by which each potential failure mode in a product (or function or process) is analysed to determine its effects.
The potential failure modes are classified according to their severity.
[IEC 60050-191]
failure mode, effects and criticality analysis (FMECA)
FMEA extended to classify potential failure modes according to their criticality
[IEC 60050-191]
functional description
narrative description of the product functions, and of each lower level function considered in the analysis, to a depth sufficient to provide an understanding of the product and of the analysis
Functional representations (such as functional trees, functional block diagrams and functional matrices) are included of all functional assemblies to a level consistent with the depth of the analysis and the design maturity.
functional FMEA
FMEA in which the functions, rather than the items used in their implementation, are analysed
functional FMECA
FMECA in which the functions, rather than the items used in their implementation, are analysed
hardware FMEA
FMEA in which the hardware used in the implementation of the product functions is analysed
hardware FMECA
FMECA in which the hardware used in the implementation of the product functions is analysed
hardware-software interaction analysis
analysis to verify that the software is specified to react to hardware failures as required
process FMECA
FMECA in which the processes are analysed, including the effects of their potential failures
Examples of processes are manufacturing, assembly and integration, and pre-launch operations.
protection device
device designated to perform a specific protective function
[adapted from “protection equipment” in IEC 60050-191]
Abbreviated terms
For the purpose of this Standard, the abbreviated terms from ECSS-S-ST-00-01 and the following apply:
| Abbreviation | Meaning |
|---|---|
| ASIC | application specific integrated circuit |
| CDR | critical design review |
| CIDL | configuration item data list |
| CIL | critical item list |
| CN | criticality number |
| DN | detection number |
| EEE | electronic, electrical, electromechanical |
| FDIR | failure detection, isolation and recovery |
| FESL | failure effect severity list |
| FMEA | failure modes and effects analysis |
| FMECA | failure modes, effects and criticality analysis |
| FPGA | field programmable gate array |
| HSIA | hardware-software interaction analysis |
| H/W | hardware |
| PCB | printed circuit board |
| PN | probability (of occurrence) number |
| RAMS | reliability, availability, maintainability and safety |
| RB | requirements baseline |
| RBD | reliability block diagram |
| SEP | single event phenomena |
| SN | severity number |
| SOW | statement of work |
| S/W | software |
| TS | technical specification |
FMEA requirements
General requirements
The FMEA shall be initiated for each design phase as indicated in clause 6 and updated to reflect design changes along the project life cycle.
The FMEA is an integral part of the design process as one tool to drive the design along the project life cycle.
The FMEA shall be used for the development of the product architecture, design justification and for the definition of test and operation procedures.
The FMEA shall be used for the identification of critical items.
NOTE 1 Refer to clause 4.3 for the identification of critical items.
NOTE 2 For each critical item the FMEA identifies recommendations for risk reduction if appropriate.
The FMEA shall be used in the definition of:
- failure tolerance design provisions (i.e. redundancy, inhibits, FDIR),
- special test considerations,
- maintenance actions (preventive or corrective),
- operational constraints.
All recommendations which result from the FMEA shall be evaluated, dispositioned and documented as part of the Dependability Recommendations in conformance with ECSS-Q-ST-30, clause 5.7.
The FMEA shall be performed according to the following steps:
- Describe the product (i.e. function or hardware) to be analysed, by providing:
  - functional descriptions,
  - interfaces,
  - interrelationships and interdependencies of the items which constitute the product,
  - operational modes,
  - mission phases.
The functional analysis, functional block diagram and reliability block diagram can be used as input for product definition.
- Identify all potential failure modes for each item and investigate their effect on the item under analysis and on the product and operation to be studied.
- Assume that each single item failure is the only failure in the product.
This implies that combinations of failures are not considered.
- Evaluate each failure mode in terms of the worst potential consequences and assign a severity category.
- Identify failure detection methods.
- Identify existing preventive or compensating provisions for each failure mode.
- Provide for identified critical items (clause 4.3) corrective design or other actions (such as operator actions) necessary to eliminate the failure or to mitigate or to control the risk.
- Document the analysis and summarize the results and the problems that cannot be solved by the corrective actions.
- Record all critical items into a dedicated table as an input to the overall project critical item list (CIL).
Critical item control is described in ECSS-Q-ST-10-04.
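As an illustration of the steps above, the information gathered for each failure mode can be held in a simple record. The following Python sketch is not part of this Standard: the class name `FmeaEntry`, its field names and the helper `undetectable` are hypothetical, and the worksheet layout actually required is the one defined by the DRD in Annex A.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FmeaEntry:
    """One analysed failure mode (illustrative fields, not the Annex A layout)."""
    item: str
    failure_mode: str
    local_effect: str          # effect at the level of the item under investigation
    end_effect: str            # effect at the level of the product under analysis
    severity_level: int        # 1 (catastrophic) to 4 (minor or negligible)
    detection_methods: List[str] = field(default_factory=list)
    compensating_provisions: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)

    def __post_init__(self):
        # Reject severity levels outside the four defined categories.
        if self.severity_level not in (1, 2, 3, 4):
            raise ValueError("severity level must be 1, 2, 3 or 4")


def undetectable(entries):
    """Return the entries for which no detection method has been recorded yet."""
    return [e for e in entries if not e.detection_methods]
```

A helper such as `undetectable` can be used during reviews to list failure modes whose detection method is still to be identified.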
Severity categories
A severity category classification, based on failure consequences, shall be assigned to each identified failure mode.
Severity categories shall be assigned without consideration of existing compensating provisions.
NOTE 1 The compensating provision is highlighted by the suffix.
NOTE 2 The objective is to provide a qualitative measure of the worst potential consequences resulting from item failure.
For analyses lower than system level the severity level due to possible failure propagation shall be identified as level 1 for dependability.
For example, for analysis at subsystem and equipment levels.
The number identifying the severity category shall be followed by a dedicated suffix as follows:
- the suffix SH to indicate safety hazards;
- the suffix R to indicate redundancy;
- the suffix SP to indicate single point failures.
NOTE 1 For example, while 3SP indicates that the item failure mode under consideration can lead to the consequences listed in category 3, 3R indicates that the consequences listed in category 3 can occur only after the failure of all of the redundant items.
NOTE 2 The suffix SH is used before the other suffixes.
The severity categories shall be applied as indicated in Table 4-1.
The customer can tailor the severity categories to suit the programme specific needs.
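The suffix rules above can be illustrated with a short Python sketch. It is not part of this Standard: the function name `severity_code` and its parameters are hypothetical, the suffixes are simply concatenated without separator (as in the examples 3SP and 3R), and the sketch assumes that R and SP are never applied to the same failure mode.

```python
def severity_code(level, safety_hazard=False, redundant=False, single_point=False):
    """Build a severity identifier such as '3SP' or '3R' from a severity level."""
    if level not in (1, 2, 3, 4):
        raise ValueError("severity level must be 1, 2, 3 or 4")
    if redundant and single_point:
        # Assumption of this sketch: R and SP are treated as mutually exclusive.
        raise ValueError("a failure mode cannot be both redundant and single point")
    code = str(level)
    if safety_hazard:
        code += "SH"  # SH is placed before the other suffixes
    if redundant:
        code += "R"
    if single_point:
        code += "SP"
    return code
```

For instance, `severity_code(3, single_point=True)` yields `3SP` and `severity_code(3, redundant=True)` yields `3R`.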
Table 4-1: Severity of consequences
| Severity category | Severity level | Description of consequences (failure effects): Dependability effects | Description of consequences (failure effects): Safety effects |
|---|---|---|---|
| Catastrophic | 1 | Failure propagation | Loss of life, life-threatening or permanently disabling injury or occupational illness. Loss of an interfacing manned flight system. Severe detrimental environmental effects. Loss of launch site facilities. Loss of system. |
| Critical | 2 | Loss of mission | Temporarily disabling but not life-threatening injury, or temporary occupational illness. Major detrimental environmental effects. Major damage to public or private properties. Major damage to interfacing flight systems. Major damage to ground facilities. |
| Major | 3 | Major mission degradation | |
| Minor or Negligible | 4 | Minor mission degradation or any other effect | |
The customer shall define the criteria for mission loss and mission degradation (major and minor).
NOTE 1 An example of such criteria is the loss of one or more essential mission objectives.
NOTE 2 For analyses performed at subsystem, assembly or equipment level, the term “mission” is understood as functionality (i.e. the capability of meeting the specification requirements).
Identification of critical items
An item shall be considered a critical item if:
- a failure mode is identified as a single-point failure with a failure consequence severity classified as catastrophic, critical or major, or
- a failure mode has failure consequences classified as catastrophic.
The customer can tailor the criteria for critical item identification defining a failure mode as critical according to programme specific needs.
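The untailored criteria above reduce to a simple predicate on the severity level and the single-point-failure flag. The Python sketch below is illustrative only: the function name and its arguments are hypothetical, and severity levels 1, 2 and 3 are taken to correspond to catastrophic, critical and major.

```python
def is_critical_item_fmea(severity_level, single_point):
    """Apply the default FMEA critical item criteria to one failure mode."""
    if severity_level not in (1, 2, 3, 4):
        raise ValueError("severity level must be 1, 2, 3 or 4")
    catastrophic = severity_level == 1
    # Single point failure with catastrophic, critical or major consequences.
    severe_single_point = single_point and severity_level in (1, 2, 3)
    return catastrophic or severe_single_point
```

An item is then a critical item if the predicate holds for at least one of its failure modes.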
Level of analysis
The supplier shall analyse all failure modes leading to consequences with severity level 1, 2 and 3 down to a level that allows all single point failures to be identified.
A different level of analysis to which failure modes are assessed can be agreed between the customer and the supplier.
The analysis shall provide failure effects on interfaces, emphasizing propagation of failure effects to redundant, cross-strapped, or interfacing assemblies.
For electronic equipment the FMEA shall include the analysis of part failure modes on interface circuitries.
A list of part failure modes is provided in Annex G.
Integration requirements
FMEAs of each level shall be integrated into their associated FMEA performed at one level higher.
The customer shall specify to the supplier the critical failure conditions (failure modes at customer level) which need to be focused on in the analyses at the level of the supplier.
In his FMEA, the supplier shall use the critical failure conditions identified by his customer as failure effects, when provided.
End effects identified by FMEA of each level shall become failure modes of their associated FMEA performed at one level higher.
Failure modes identified by FMEA of each level shall become failure causes of their associated FMEA performed at one level higher.
Additional failure modes shall be introduced at any level if missing (as failure effects) from lower level FMEAs.
At any level, additional failure causes, which cannot be assessed at lower level as failure modes, shall be introduced into the FMEA.
Additional failures can be induced by physical layout or accommodation.
The effect of operational and failure behaviour of specific parts or equipment on other parts or equipment shall be assessed with regard to the physical layout of their mechanical, electrical and thermal interface.
NOTE 1 Examples of effects are temperature, vibration, movement, power demand and heat flow.
NOTE 2 A graphical representation of requirements 4.5a to 4.5h is given in Figure 4-1.
Dotted arrows present the flow down of critical failure conditions from upper level to lower level (see requirements 4.5b and 4.5c).
Line arrows present the bottom-up failure analysis integration process (see requirements 4.5a, 4.5d, 4.5e).
Figure 4-1: Graphical representation of integration requirements
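The bottom-up part of the integration rules (end effects of one level become failure modes of the next level up, and failure modes become failure causes) can be sketched as a simple grouping operation. This Python example is illustrative only: the function `roll_up`, the tuple layout and the sample data are hypothetical and are not taken from this Standard.

```python
from collections import defaultdict


def roll_up(lower_level_entries):
    """Group lower-level failure modes by the end effect they produce.

    Each input tuple is (item, failure_mode, end_effect). In the result, each
    key is a failure mode of the higher-level FMEA and each value lists the
    lower-level failure modes that act as its failure causes.
    """
    higher_level = defaultdict(list)
    for item, failure_mode, end_effect in lower_level_entries:
        higher_level[end_effect].append(f"{item}: {failure_mode}")
    return dict(higher_level)


# Hypothetical equipment-level entries for a heater line.
rows = [
    ("heater", "open circuit", "loss of heating"),
    ("thermostat", "stuck open", "loss of heating"),
    ("thermostat", "stuck closed", "overheating"),
]
subsystem_failure_modes = roll_up(rows)
```

Failure modes and failure causes that are not visible at the lower level, such as those induced by physical layout, still have to be added by the analyst at the higher level.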
Detailed requirements
All mission phases and related operational modes (including “safe mode”), unless otherwise agreed with the customer, shall be addressed by the FMEA.
The failure effects resulting from each failure mode shall be determined at the level of the item under investigation (local effect) and at the level of the product under analysis (end effect).
Failure modes that can propagate to interfacing functions, elements or functions and elements shall be identified.
The analysis shall indicate how each failure mode can be detected.
At a given level of analysis not all detection means and observable symptoms can be known. In the upper level analysis, the list of available detection means and observable symptoms is then completed.
Complex integrated circuits, including Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs), shall be analysed using the functional approach (functional FMEA).
Failures induced by physical layout or accommodation are considered for the complex integrated circuit.
At all levels S/W shall be analysed using only the functional approach (functional FMEA).
Software reactions to hardware failures shall be analysed by the Hardware-Software Interaction Analysis (HSIA) as specified in clause 7.
If requested by the customer and when human performance is a significant contributor to mission success or safety, possible human errors shall be highlighted and documented.
NOTE 1 The FMEA should invoke the requirement for the performance of a human error effects analysis and a task analysis.
NOTE 2 Requirement 4.6h is generally applied to manned systems.
Failures requiring failure detection and recovery action in a time interval greater than the time to an irreversible consequence shall be identified and subjected to recommendation for corrective action.
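The timing condition in the requirement above can be expressed as a comparison of durations. The Python sketch below is illustrative only: the function name, the parameter names and the use of seconds are hypothetical, and it assumes that detection and recovery happen one after the other so that their durations add up.

```python
def needs_corrective_action(detection_time_s, recovery_time_s, time_to_irreversible_s):
    """Return True if detection plus recovery takes longer than the time
    available before the consequence becomes irreversible."""
    if min(detection_time_s, recovery_time_s, time_to_irreversible_s) < 0:
        raise ValueError("durations must not be negative")
    return detection_time_s + recovery_time_s > time_to_irreversible_s
```

Failure modes for which the function returns `True` are the ones to be identified and given a recommendation for corrective action.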
For electromechanical and electrical equipment, assembly or subsystem, additional product design aspects shall include:
- failure modes resulting from the location of the components, causing failure propagation due to components being mounted too close to each other;
The location of the components is considered for external failure propagation or internal failure propagation in case of internal redundancy.
- failure modes resulting from multi-application of individual components;
An example of multi-application is the use of one integrated circuit for two redundant paths.
- failure of grounding or shielding or insulation.
Annex H gives examples of checklist items for electromechanical and electrical equipment, assembly or subsystem.
FMEA report
The results of the FMEA shall be documented in a FMEA report in conformance with the DRD in Annex A.
FMECA requirements
General requirements
The customer shall determine the applicability of the FMECA requirements according to the specific project characteristics.
NOTE 1 The FMECA is a FMEA extended to classify potential failure modes according to their criticality, i.e. the combined measure of the severity of failure modes and their probability of occurrence.
NOTE 2 Typically FMECA is not performed for Telecommunication, Earth Observation & Scientific spacecraft and for ground segments.
All requirements reported in clause 4 shall apply with the exception of clause 4.3.
The acronym FMECA replaces FMEA.
Criticality ranking
The criticality number (CN) for a specific failure mode shall be derived from the severity of the failure effects and the probability of the failure mode occurrence.
A severity number (SN) shall be given to each assumed failure mode.
The existence of redundancy does not affect the severity classification and therefore the relevant severity number. The highest numbers indicate the most severe categories.
The SNs shown in Table 5-1 shall be used.
Table 5-1: Severity Numbers (SN) applied at the different severity categories with associated severity level
| Severity level | Severity category | SN |
|---|---|---|
| 1 | Catastrophic | 4 |
| 2 | Critical | 3 |
| 3 | Major | 2 |
| 4 | Negligible | 1 |
An assessment of the probability of occurrence of the assumed failure mode during the specific mission shall be made.
In case of redundancy, the probability of failure of all redundant items is assessed with the support of the reliability analysis. The approach used for the assessment can be either qualitative or quantitative.
The qualitative approach based on engineering judgment shall be used if specific failure rate data are not available.
Failure mode probabilities of occurrence shall be grouped into defined levels which establish the qualitative failure probability level for entry into the FMECA worksheet column.
The probability levels and limits shall be approved by the customer.
Each level shall be identified by a probability number (PN).
NOTE 1 The probability of occurrence levels, limits of the levels and relevant PNs are shown in Table 5-2 as an example.
NOTE 2 The customer can tailor the probability levels to the individual programme through specific requirements and allocate the probability limits to the lower levels.
Table 5-2: Example of probability levels, limits and numbers
| Level | Limits | PN |
|---|---|---|
| Probable | P > 1E-1 | 4 |
| Occasional | 1E-3 < P ≤ 1E-1 | 3 |
| Remote | 1E-5 < P ≤ 1E-3 | 2 |
| Extremely remote | P ≤ 1E-5 | 1 |
The quantitative approach shall be used when specific failure rates and probability of occurrence data are available.
Data sources, approved by the customer, shall be listed.
The data sources shall be the same as those used for the other dependability analyses performed for the programme.
The failure probabilities shall be ranked as per Table 5-2 and the relevant entry (the PN) listed in the FMECA worksheet column.
The CN for a specific failure mode shall be developed from the severity of the failure effects and the probability of the failure mode occurrence.
The CN shall be calculated as the product of the ranking assigned to each factor: CN = SN x PN.
Failure modes having a high CN shall be given a higher priority in the implementation of the corrective actions than those having a lower CN.
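As a worked illustration of CN = SN x PN, the following Python sketch maps a severity level to its SN (Table 5-1) and a probability of occurrence to a PN, then multiplies the two. It is not part of this Standard: the function and constant names are hypothetical, the probability limits are the example values of Table 5-2 (which the customer can tailor), and the upper limit of each probability band is assumed to be inclusive.

```python
# Severity level -> severity number (SN), as in Table 5-1.
SN_BY_SEVERITY_LEVEL = {1: 4, 2: 3, 3: 2, 4: 1}


def probability_number(p):
    """Return the PN for a probability of occurrence, using the example limits."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p > 1e-1:
        return 4  # probable
    if p > 1e-3:
        return 3  # occasional
    if p > 1e-5:
        return 2  # remote
    return 1      # extremely remote


def criticality_number(severity_level, p):
    """Return CN = SN x PN for one failure mode."""
    return SN_BY_SEVERITY_LEVEL[severity_level] * probability_number(p)
```

For example, a failure mode with severity level 2 (critical, SN = 3) and a probability of occurrence of 5E-3 (occasional, PN = 3) has CN = 9.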
Identification of critical items
An item shall be considered a critical item if:
- a failure mode has failure consequences classified as catastrophic, or
- a failure mode is classified as CN greater than or equal to 6 in conformance with Table 5-3.
The customer can tailor the criteria for critical item identification defining a failure mode as critical according to programme specific needs.
Table 5-3: Criticality matrix
| Severity category | SN | Probability level PN 1 | Probability level PN 2 | Probability level PN 3 | Probability level PN 4 |
|---|---|---|---|---|---|
| catastrophic | 4 | 4 | 8 | 12 | 16 |
| critical | 3 | 3 | 6 | 9 | 12 |
| major | 2 | 2 | 4 | 6 | 8 |
| negligible | 1 | 1 | 2 | 3 | 4 |
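The criticality matrix and the untailored FMECA critical item criteria can be reproduced with a few lines of Python. This sketch is illustrative only: the function names are hypothetical, and the threshold of 6 is the default stated in this clause, which the customer can tailor.

```python
def criticality_matrix():
    """Return {SN: [CN for PN = 1, 2, 3, 4]} for SN = 4 down to 1."""
    return {sn: [sn * pn for pn in (1, 2, 3, 4)] for sn in (4, 3, 2, 1)}


def is_critical_item_fmeca(sn, pn, threshold=6):
    """Apply the default FMECA criteria: catastrophic (SN = 4) or CN >= threshold."""
    if sn not in (1, 2, 3, 4) or pn not in (1, 2, 3, 4):
        raise ValueError("SN and PN must each be 1, 2, 3 or 4")
    return sn == 4 or sn * pn >= threshold
```

With the default threshold, a major failure mode (SN = 2) becomes critical only at PN 3 or 4, whereas a catastrophic failure mode is critical at every probability level.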
FMECA report
The results of the FMECA shall be documented in a FMECA report in conformance with the DRD in Annex A.
FMEA/FMECA implementation requirements
General requirements
Formal delivery of the FMEA/FMECA shall be in accordance with the SOW.
Generally the report is presented at all design reviews.
In each phase, the FMEA/FMECA shall be reviewed, updated and changes recorded on a continuous basis to maintain the analysis current with the design evolution.
For the project phase definition refer to ECSS-M-ST-10.
The means of recording the FMEA/FMECA shall be agreed by the customer.
Phase 0: analysis or requirements identification
In this phase the FMEA/FMECA is, typically, not performed.
Phase A: Feasibility
The FMEA/FMECA shall assist the trade-off among the various possible design concepts by assessing their impact on the project dependability and safety requirements.
The analysis contributes to the overall risk evaluation of each design concept. The functional approach is generally used.
The FMEA/FMECA shall make use of, as a minimum, the following inputs:
- the mission requirements, in particular the dependability and safety requirements;
- the design documentation of the different product concepts identified in phase 0;
- the hierarchical decomposition of the product functions.
The function decomposition is generally derived from the functional analysis.
The FMEA/FMECA shall be performed to provide the following results:
- evaluation of the conformance of each design concept function to the system dependability and safety requirements;
- identification of critical failure scenarios;
- identification of needs of focused analyses;
For example: fault tree.
- identification of the features to be implemented for each analysed function in order to meet the system dependability and safety requirements.
NOTE 1 Examples of the identified features are: functional redundancies or inhibits, possible alternative implementations.
NOTE 2 A report for FMEA/FMECA is, typically, not required for phase A.
Phase B: Preliminary definition
The FMEA/FMECA shall be performed either according to the functional approach (functional FMEA/FMECA) or to the hardware approach (hardware FMEA/FMECA).
A list of part failure modes is provided in Annex G.
Rationale for selection of the approach shall be provided considering the following criteria:
- available design data;
- product complexity and level of integration;
- criticality of the product or function;
- segregation of function.
The FMEA/FMECA shall:
- support the trade-offs from the dependability and safety point of view;
- support the definition of the requirements to be implemented in the product as redundancies, inhibits, operations to be followed to avoid hazards or loss of mission, and others, such as fail-safe, leak-before-burst, and maximum time allowable before compensation activation.
The FMEA/FMECA shall make use of, as a minimum, the following inputs:
- The mission requirements and the mission profile.
- The product specification, considering in particular the dependability and safety requirements.
Examples of product specifications are: system or subsystem specification and performance specification.
- The current hierarchical decomposition of the product functions.
The function decomposition is generally derived from the functional analysis.
- The design of the product architecture.
Examples of product architecture are: design description, drawings and interfaces description.
- Available information from the product safety analyses relevant to hazard causes and controls.
- When applicable, available information from maintenance analysis relevant to replaceable unit definition.
- When available, FMEA/FMECAs performed at lower integration levels.
- For lower level FMEA/FMECAs, agreed list of parts failure modes.
- For FMECA, item failure rates from data sources agreed by the customer.
The FMEA/FMECA shall provide the following results:
- Inputs for dependability and safety requirements to be allocated for implementing the prevention and compensation methods and for minimizing the single point failures and the identified critical failure scenarios.
The dependability and safety requirements are allocated in priority to the product and lower levels. Recommendations to higher levels can also be raised.
- Input to safety analyses: identification of hazardous consequences due to failures at lower levels and relevant identified prevention and compensation methods.
- When applicable, input to maintainability analyses.
An example of the input is the identification of replaceable units for meeting the dependability and safety requirements.
- Input to software criticality analysis.
An example of the input is the identification of software functional failure consequences.
- Input to the critical function list or critical item list.
An example of these inputs is the identification of the critical items as defined in clause 4.3 or 5.3.
- Inputs for developing the FDIR system.
- For each hardware or function failure mode, the detection parameters that are generated following the occurrence of the failure as observable symptoms.
Examples of observable symptoms are: warning signal, sensor information, equipment status, and current and voltage monitors.
- When available as design information, the precise monitor in terms of acquisition channel name.
- The monitor lists, as input for the FDIR development.
The objective is to allow the definition of algorithms that detect any failure that has occurred on the basis of the registered detection signals.
- Identification of failures requiring failure detection and recovery action in a time interval greater than the time to an irreversible consequence, together with recommendation for corrective action.
- The propagation time (Tp) between the occurrence of the failure and the manifestation of the irreversible consequences.
- Input to operation definition activity.
An example of this input is the identification of crew and system operations to be implemented to prevent or control critical dependability and safety events.
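As an illustration of the FDIR input listed above, the following minimal Python sketch flags failure modes for which detection plus recovery takes longer than the propagation time Tp. The record layout, field names and sample values are assumptions made for this example only; they are not defined by this Standard.

```python
from dataclasses import dataclass


@dataclass
class FailureTiming:
    """Timing data for one failure mode (hypothetical field names)."""
    failure_mode: str
    detection_time_s: float    # time needed by the monitors to detect the failure
    recovery_time_s: float     # time needed by the recovery action to complete
    propagation_time_s: float  # Tp: occurrence to irreversible consequences


def needs_corrective_action(t: FailureTiming) -> bool:
    """True when detection plus recovery takes longer than Tp."""
    return t.detection_time_s + t.recovery_time_s > t.propagation_time_s


# Illustrative values only.
timings = [
    FailureTiming("heater stuck on", 2.0, 5.0, 60.0),
    FailureTiming("valve fails open", 4.0, 10.0, 8.0),
]
flagged = [t.failure_mode for t in timings if needs_corrective_action(t)]
print(flagged)
```

Failure modes returned in `flagged` are candidates for a recommendation for corrective action.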
The FMEA/FMECA report shall be issued according to the SOW.
Phase C: Detailed definition
The FMEA/FMECA shall be performed according to the hardware approach (hardware FMEA/FMECA).
- 1 In this phase the hardware can be uniquely identified from the engineering design data. In some cases the functional approach or a combination of the two approaches can be used (rationale for selection to be provided and agreed by the customer).
- 2 A list of part failure modes is provided in Annex G.
The FMEA/FMECA shall allow verification that the design fulfils the dependability and safety requirements allocated to all project levels (system, subsystem and lower levels) in phase B.
The FMEA/FMECA shall review all of the following inputs and use those applicable:
- The detailed mission and performance requirements and the environmental conditions.
- The dependability and safety requirements from the technical specification.
- The hierarchical decomposition of the product functions as derived from the updated functional analysis.
- The detailed mission profile (definition of the mission phases or modes).
- The detailed design architecture (design description, drawings, interfaces description).
- The detailed description of hazard causes and hazard control implementation in the design architecture from the relevant safety analysis.
- Definition of the Replaceable Units from the maintenance analysis.
- FMEA/FMECAs performed at lower integration level.
- For lower level FMEA/FMECAs, agreed list of parts failure modes.
- For FMECA, item failure rates from data sources agreed by the customer.
- Definition of the crew and product operations.
- Definition of the embedded monitors available for discovering any anticipated failure mode and of the automatic sequences to react to any malfunction from the FDIR analysis.
- Definition of the remote and man controlled (crew or ground operators) monitors available for discovering any anticipated failure mode and of the procedures to react to any malfunction from the FDIR analysis.
The FMEA/FMECA shall provide the following results:
- Identification of the methods for preventing or compensating failure effects of critical items.
Examples of these methods are: redundancies and inhibits.
- Verification that the anticipated actions are able to prevent or control the consequences.
- Identification of remaining single point failures and identification of compensating features if their elimination is not possible or is impractical.
- Input to safety analyses.
An example of this input is the identification of the implemented preventing or compensating methods for each identified hazardous consequence.
- Input to the critical function list or critical item list.
An example of this input is the identification of the items (component or equipment) to be considered critical according to the provided criticality definition.
- Input to the FDIR system activity:
- list of specific monitor parameters that allow the failure to be detected;
- verification of the effectiveness of the recovery methods or proposal of alternative methods;
- identification of failure modes that are not monitored.
- Input to operation definition activity.
Examples of these inputs are the identification of crew and system operations implemented to prevent or control critical dependability and safety events and verification of their capability to effectively control the failure consequences.
- Input to test definition activity (if required at the analysed integration level).
Examples of input to test definition activity are:
- List of failure modes with relevant effects and observable symptoms provided for generating test requirements and procedures.
- Identification of functional paths and redundancies that cannot be tested.
- Identification of tests to verify the assumptions used within the FMEA/FMECA that the system reacts according to the anticipated manner.
- Input to user manual and operation procedures.
For example, at system level the list of failure modes with relevant effects and observable symptoms is provided for establishing data recording requirements and for determining the required frequency of monitoring in testing, checkout and mission use.
- Input to contingency analysis.
The FMEA/FMECA provides input such as failure detection means, observable symptoms and compensating provisions for the implementation of the contingency analysis.
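One of the FDIR inputs listed above is the identification of failure modes that are not monitored. A minimal Python sketch of that check follows, assuming the analyst keeps a mapping from each failure mode to the monitor parameters that can detect it. All identifiers and parameter names are hypothetical.

```python
# Illustrative data only: failure modes mapped to the monitor parameters
# that allow them to be detected (all names are hypothetical).
monitors_by_failure_mode = {
    "FM-001 pump stalls": ["PUMP_SPEED_TM", "PUMP_CURRENT_TM"],
    "FM-002 sensor drift": [],
    "FM-003 relay stuck open": ["BUS_VOLTAGE_TM"],
}


def unmonitored(failure_modes: dict) -> list:
    """Return the failure modes that have no monitor parameter assigned."""
    return sorted(fm for fm, monitors in failure_modes.items() if not monitors)


print(unmonitored(monitors_by_failure_mode))
```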
The FMEA/FMECA report shall be issued according to the SOW.
Phase D: Production or ground qualification testing
The FMEA/FMECA performed in phase C shall be updated with regard to design changes decided after the critical design review (CDR) and according to test results.
The FMEA/FMECA shall be utilized as a diagnostic tool in order to support the failure diagnosis during the qualification and the elimination of potential failures.
Phase E: Utilization
The FMEA/FMECA performed at system level in phase C/D shall be utilized as support to diagnostic activities (in flight and on ground) in order to support system maintenance and restoration.
In case of design evolution (mainly for ground segment) the FMEA/FMECA shall be updated.
Phase F: Disposal
In this phase the system level FMEA/FMECA shall be used together with the system safety analysis to support the identification of potential hazardous characteristics of used items (items at the end of their utilization phase) or of the design, in order to define system disposal activities.
Examples of potential hazardous characteristics are material and radiation.
Hardware-software interaction analysis (HSIA)
Overview
HSIA is an activity performed to ensure that the software reacts in an acceptable way to hardware failures. Particular attention is paid to each failure mode of hardware used in compensatory provisions (redundancy, protection) and controlled by software.
The HSIA can be performed with the aid of the checklist shown in Annex I. The questions can be tailored to the project.
Technical requirements
The HSIA shall be performed concurrently with the FMEA/FMECA to influence the hardware design and the software requirements.
The HSIA shall be used to verify that the software specifications as expressed in the requirements baseline (RB) or the technical specification (TS) cover the hardware failures according to the applicable FDIR requirements.
For more details on RB and TS, see ECSS-E-ST-40
Suppliers of products combining H/W and S/W shall perform a HSIA covering all hardware failures which can interact with internal S/W.
In the performance of the System HSIA, the supplier shall integrate the HSIAs performed at one level lower than the level of the supplier.
For each failure mode the following information shall be used:
- Symptoms triggering the software action (observable symptoms from FMEA/FMECA).
Refer to the RB or TS relevant section for justification.
- Action of the software.
Refer to the RB or TS relevant section for justification.
- Effect of the software action on the product functionality (through possible induced sequences of software-hardware effects).
The HSIA shall be performed to provide the following results:
- inputs to the list of critical items;
For example: no or nonconforming software action and software action having adverse effects on hardware.
- inputs for FDIR policy;
- recommendations.
For example: hardware or software to be added or modified.
Nonconforming cases shall be identified and formally dispositioned in conformance with ECSS-Q-ST-10-09.
The HSIA shall be documented by completing a form in conformance with the DRD in Annex D.
Findings and recommendations arising from the HSIA shall be recorded and tracked together with the ones coming from FMEA/FMECA.
Implementation requirements
The HSIA shall be performed in an early phase of the design, typically in phase B, to support the definition of software requirements (RB).
No formal documentation of the analysis is necessary.
In phases C/D of the design, the HSIA shall be used to verify software requirements (RB/TS).
The HSIA shall be released according to clause 7.2.
Process FMECA
Purpose and objective
Process FMECA is the application of the FMECA methodology to processes. Its purpose is to identify potential critical process steps and to determine their effects on:
- safety;
- product;
- process itself;
- programmatic aspects.
Possible typical weak points are human errors, failures of related hardware, or environmental stress in existing or planned processes, such as:
- manufacturing;
- assembly or integration;
- ground operations (e.g. mating a satellite to the launcher, filling or draining of tanks, precooling of cryogenic equipment);
- tests;
- in-orbit operations.
The objective of the process FMECA is to initiate measures to eliminate the potential critical process steps or to reduce their criticality to an acceptable value.
The process FMECA should be performed by the process specialist, supported if required by the dependability and safety specialist.
Selection of processes and inputs required
Process FMECA shall be performed on processes agreed with the customer.
- 1 These processes are those considered to have effects as reported in Table 8-1.
- 2 The inputs needed to start the work depend strongly on the process to be analysed.
Typical inputs are:
- working and control plan;
- assembly procedure;
- integration procedure;
- test procedure;
- handling procedure (manual).
General process FMECA requirements
A Process FMECA report shall be issued, in conformance with Annex E.
The documentation of the process FMECA shall be accomplished by completing the columns of the Process FMECA worksheet in conformance with the DRD in Annex F.
The severity of failure effects shall be identified by assigning a severity number (SN) according to a table agreed with the customer.
Table 8-1 gives an example of definitions for Severity Numbers (SN) for some categories of failure effects. It can be customised or completed depending on the process analysed and on the purpose of the analysis.
The probability of occurrence of failure modes shall be identified by assigning a probability number (PN) according to a table agreed with the customer.
Table 8-2 gives an example of Probability numbers (PN) for probability of occurrence.
The probability of detection of failure modes shall be identified by assigning a detection number (DN) according to a table agreed with the customer.
Table 8-3 gives an example of Detection numbers (DN) for probability of detection.
The criticality number (CN) shall be defined as the product of the numbers assigned to failure mode severity, probability of occurrence, and probability of detection according to:
CN = SN × PN × DN
Since a failure mode can have more than one failure effect, the highest SN shall be considered.
The values of SN, PN, and DN are based on engineering judgement and previous experience.
The CN value is in the range from 1 to 64, whereby the meaning of the extremes is:
- negligible, i.e. there is no risk if CN = 1;
- extremely critical, i.e. there is an extremely high risk if CN = 64.
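The calculation above can be sketched in a few lines of Python. The function name and the range check are assumptions made for this example; the 1 to 4 range follows the example tables and the stated CN range of 1 to 64, and a project using different agreed tables would adjust it.

```python
def criticality_number(effect_sns, pn, dn):
    """CN = SN x PN x DN, using the highest SN among the failure effects."""
    for value in (*effect_sns, pn, dn):
        if value not in (1, 2, 3, 4):
            # Assumption: SN, PN and DN follow the 1-to-4 example tables.
            raise ValueError("SN, PN and DN are expected in the range 1 to 4")
    return max(effect_sns) * pn * dn


# A failure mode with two effects (SN 2 and SN 3), PN = 2, DN = 4.
print(criticality_number([2, 3], pn=2, dn=4))  # 3 * 2 * 4 = 24
```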
Table 8-1: Example of Severity numbers (SN) for severity of failure effects
| SN | Safety related effects | Process result (i.e. product) related effects | Process related effects | Programmatic related effects |
|---|---|---|---|---|
| 4 | Loss of life, life threatening or permanently disabling injury or occupational illness | N/A | The process is not recoverable and needs to be modified | Financial loss > 50 % of overall programme cost |
| 3 | Temporary disabling but not life threatening injury, or temporary occupational illness | Loss of the product | Repetition of several process steps or of the complete process | Financial loss between 50 % and 30 % of overall programme cost |
| 2 | | Major degradation of the product | Repetition of the faulty step | Financial loss between 30 % and 10 % of overall programme cost |
| 1 | | No or minor degradation of the product | No or minor impact on the analysed process | Financial loss < 10 % of overall programme cost |
Table 8-2: Probability numbers (PN) for probability of occurrence
| PN | Definition |
|---|---|
| 4 | Very likely |
| 3 | Likely |
| 2 | Unlikely |
| 1 | Extremely unlikely |
Table 8-3: Detection numbers (DN) for probability of detection
| DN | Definition |
|---|---|
| 4 | Extremely unlikely |
| 3 | Unlikely |
| 2 | Likely |
| 1 | Very likely |
Identification of critical process steps
A process step shall be considered critical if:
- the severity number SN ≥ 3, or
- the probability number PN = 4, or
- the detection number DN = 4, or
- the criticality number CN ≥ 12.
The customer can tailor the criteria for critical process step identification according to specific needs.
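The criteria above can be sketched as a small Python check. The sketch reads the SN and CN criteria as "greater than or equal to" thresholds and exposes them as parameters so that tailored limits can be passed in. The function and parameter names are illustrative assumptions.

```python
def is_critical(sn, pn, dn, sn_limit=3, cn_limit=12):
    """Apply the critical process step criteria to one failure mode.

    sn is the highest severity number among the failure effects.
    """
    cn = sn * pn * dn
    return sn >= sn_limit or pn == 4 or dn == 4 or cn >= cn_limit


print(is_critical(2, 2, 2))  # CN = 8, no single criterion met
print(is_critical(2, 2, 3))  # CN = 12 reaches the CN limit
print(is_critical(1, 4, 1))  # PN = 4 alone makes the step critical
```

A process step with CN well below the limit is still critical if any single number reaches its own threshold, which is why the four criteria are combined with "or".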
Recommendations for improvement
If the process step is regarded as critical (according to the criteria in clause 8.4) a recommendation shall be given.
The relevant failure modes shall then be analysed again on the same process FMECA worksheet to show the improvement, i.e. to show how the Criticality Number is reduced.
This is done by assuming that the recommendation is already implemented, so that it can be entered as an existing provision. If, as a result of this second analysis run, the acceptance criteria of clause 8.4 are still not met, a second recommendation is made and analysed, and so on, until the acceptance criteria are met, or it can be shown and justified that no further risk reduction is feasible.
If no further risk reduction is feasible a justification for acceptability shall be given.
A case example is when the severity of a failure effect cannot be modified.
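The iterative re-analysis described above can be sketched as a loop, shown here in Python under stated assumptions: each recommendation is represented only by the (SN, PN, DN) values re-assessed as if it were already implemented, and the clause 8.4 criteria are read as SN ≥ 3, PN = 4, DN = 4 or CN ≥ 12. All names and values are illustrative.

```python
def is_critical(sn, pn, dn):
    """Clause 8.4 criteria, read as 'greater than or equal to' thresholds."""
    return sn >= 3 or pn == 4 or dn == 4 or sn * pn * dn >= 12


def reanalyse(initial, reassessments):
    """Apply successive re-assessments until the step is no longer critical.

    Returns (final values, recommendations used, justification needed).
    """
    current = initial
    used = 0
    for values in reassessments:
        if not is_critical(*current):
            break
        current = values
        used += 1
    return current, used, is_critical(*current)


# Two recommendations are needed before the criteria are met.
print(reanalyse((2, 3, 4), [(2, 3, 2), (2, 2, 2)]))
# Severity cannot be modified: a justification for acceptability is needed.
print(reanalyse((4, 1, 1), [(4, 1, 1)]))
```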
Follow-on actions
General
Decisions of the Project Management after consideration of the recommendations for improvement shall be:
- Case 1: the recommendation is implemented, or
- Case 2: the recommendation is rejected, or
- Case 3: an alternative recommendation is made.
Decisions on recommendations always involve the assessment of the impact on safety.
In case 1:
An actionee and a due date shall be entered for the implementation.
The analysis result of the implementation shall be compared with the results leading to the original recommendation.
In case of discrepancies, a clarification shall be entered and the relevant analysis steps repeated.
In case of no discrepancy, a close-out reference shall be entered.
For example, the reference to the change notice.
In case 2:
The term “rejected” shall be entered (as close-out reference) together with the rationale for rejection.
The rationale is within the responsibility of the project.
In case 3:
An actionee and a due date shall be entered for the implementation of the alternative recommendation.
The modified situation shall be treated on the same process FMECA worksheet to identify the improvements.
The final closing of the action by the project can only be:
- acceptance according to case 1, or
- rejection according to case 2.
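The three cases and their worksheet entries can be modelled as a small record with consistency checks. The Python sketch below is illustrative only: the class, field and function names, the sample actionee and the sample date are assumptions, not part of the process FMECA worksheet definition.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Decision(Enum):
    IMPLEMENT = 1    # case 1: the recommendation is implemented
    REJECT = 2       # case 2: the recommendation is rejected
    ALTERNATIVE = 3  # case 3: an alternative recommendation is made


@dataclass
class FollowOnAction:
    decision: Decision
    actionee: Optional[str] = None
    due_date: Optional[str] = None
    closeout_reference: Optional[str] = None
    rejection_rationale: Optional[str] = None


def missing_entries(action: FollowOnAction) -> List[str]:
    """List the worksheet entries still missing for the chosen case."""
    missing = []
    if action.decision in (Decision.IMPLEMENT, Decision.ALTERNATIVE):
        if not action.actionee:
            missing.append("actionee")
        if not action.due_date:
            missing.append("due date")
    if action.decision is Decision.REJECT:
        if action.closeout_reference != "rejected":
            missing.append("close-out reference 'rejected'")
        if not action.rejection_rationale:
            missing.append("rationale for rejection")
    return missing


def can_be_closed(action: FollowOnAction) -> bool:
    """Final closing is only possible as case 1 or case 2."""
    if action.decision is Decision.ALTERNATIVE:
        return False
    return bool(action.closeout_reference) and not missing_entries(action)


rejected = FollowOnAction(Decision.REJECT, closeout_reference="rejected",
                          rejection_rationale="no feasible design change")
alternative = FollowOnAction(Decision.ALTERNATIVE, actionee="process engineer",
                             due_date="2025-06-30")
print(can_be_closed(rejected), can_be_closed(alternative))
```

An alternative recommendation (case 3) stays open until its re-analysis ends in an acceptance or a rejection.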
Annex A (normative) FMEA/FMECA report – DRD
DRD identification
Requirement identification and source document
This DRD is called from ECSS-Q-ST-30-02, requirement 4.7a.
Purpose and objective
The purpose of the FMEA/FMECA report is to document the results of the FMEA/FMECA.
Expected response
Scope and content
Cover sheet
The FMEA/FMECA report shall include the title of the analysis and reference number, issue, revision and date, supplier sign-off date, and the names and signatures of the analyst(s) and the approval authority.
Introduction
The FMEA/FMECA report shall provide concise statements on the objectives of the analysis including definition of the level of the analysis.
Documents
The FMEA/FMECA report shall list the applicable and reference documents, including design reference, analyses performed by lower level suppliers, used in the preparation of the FMEA/FMECA.
Acronyms and abbreviations
The FMEA/FMECA report shall list the acronyms, abbreviations and definitions of special terms used.
Product
The FMEA/FMECA report shall include a narrative description of the product functions to provide an understanding of the analysis.
The FMEA/FMECA report shall address the functional partition in the design between hardware and software, including a reference to the corresponding HSIA.
Block diagrams and schematics
To assist in describing the product, the FMEA/FMECA report shall include schematic diagrams, functional block diagrams and reliability block diagrams (RBDs) to a level consistent with the depth of the analysis and with design maturity.
A functional tree is also useful for describing functional relationships.
An appropriate identification number shall be used to provide consistent identification and complete visibility of the relationship between each block and the applicable failure modes.
Design
The FMEA/FMECA report shall provide the definition of the status of the design of the product under analysis by reference to a configuration document.
For example, CIDL.
If the design is not mature enough to provide this document, then the design shall be defined by reference to reports used to perform the analysis.
Any incomplete design areas shall be listed and described.
Basic rules and assumptions
The FMEA/FMECA report shall include the description of the ground rules adopted for the analysis (including list of items omitted from the analysis) and all the assumptions made regarding mission phases and times, operational modes, environmental conditions, failure modes and failure criteria.
This list is not exhaustive.
All rules and assumptions shall be approved by the customer.
For information, a list of failure modes for EEE parts is provided in Annex G.
The list of failure modes of each part may be amended or additional modes included, depending on specific applications.
Failure detection or isolation criteria
The FMEA/FMECA report shall describe the FDIR policy and criteria including reference to relevant documents and to detection reference.
The detection reference includes telemetry, housekeeping data, and health check.
Results and recommendations
The FMEA/FMECA report shall provide results and recommendations based upon the detailed analysis presented by the FMEA/FMECA worksheets.
Critical items
The FMEA/FMECA report shall provide a list of all the critical items identified (as per 4.3 or 5.3, respectively) including item identification and cross-reference with FMEA/FMECA worksheets.
Failure Effect Summary List (FESL)
The FMEA/FMECA report shall provide a list of the failure effects leading to consequences classified in severity category 1, 2 and 3 and identify all relevant failure modes including item identification and cross-reference with FMEA/FMECA worksheets.
Status on recommendations
The FMEA/FMECA report shall make reference to the document providing the status of the recommendations.
FMEA/FMECA Worksheets
The FMEA/FMECA report shall include the FMEA/FMECA worksheets in conformance with Annex B/Annex C, respectively.
Special remarks
None.
Annex B (normative) FMEA worksheet – DRD
DRD identification
Requirement identification and source document
This DRD is called from ECSS-Q-ST-30-02, requirement Annex A.2.1<14>a.
Purpose and objective
The purpose of the FMEA worksheet is to document the analysis performed in a tabular form.
Expected response
Scope and content
Header information
The FMEA worksheet shall contain the identity of the product (hardware or function) and the identity of corresponding equipment, subsystem, and system (as applicable).
Identification number
The FMEA worksheet shall contain the identification number for traceability purposes.
Item/block
The FMEA worksheet shall contain
- the name of the item or function being analysed, and
- the block of the reliability block diagram that is applicable to the analysis entry.
Function
The FMEA worksheet shall contain a concise statement of the function performed by the item.
Failure mode
The FMEA worksheet shall contain the identification and description of all potential failure modes of the item or function under analysis.
With reference to 4.5d, end effects of lower level FMEA are failure modes of the higher level FMEA.
Failure cause
When requested, the FMEA worksheet shall contain the identification and description of the most probable causes associated with the assumed failure mode.
- 1 With reference to 4.5e failure modes of lower level FMEA are failure causes of the higher level FMEA.
- 2 Failure causes are generally not identified when components are analysed (equipment level FMEA).
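The relationship between levels described in the notes above (lower level end effects become higher level failure modes, and lower level failure modes become higher level failure causes) can be sketched as a simple roll-up. The Python example below is a minimal illustration; the dictionary keys and sample entries are assumptions, not worksheet content prescribed by this Standard.

```python
from collections import defaultdict

# Illustrative lower level FMEA entries (hypothetical content).
lower_level = [
    {"failure_mode": "output short circuit", "end_effect": "loss of 5 V supply"},
    {"failure_mode": "switching transistor open", "end_effect": "loss of 5 V supply"},
    {"failure_mode": "regulation loop drift", "end_effect": "5 V supply out of tolerance"},
]


def roll_up(entries):
    """Build higher level entries from lower level FMEA entries.

    Each lower level end effect becomes a higher level failure mode, and the
    lower level failure modes producing it become its failure causes.
    """
    causes = defaultdict(list)
    for entry in entries:
        causes[entry["end_effect"]].append(entry["failure_mode"])
    return [
        {"failure_mode": effect, "failure_causes": modes}
        for effect, modes in causes.items()
    ]


higher_level = roll_up(lower_level)
for row in higher_level:
    print(row["failure_mode"], "<-", row["failure_causes"])
```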
Mission phase/Operational mode
The FMEA worksheet shall contain a concise statement of the mission phase and operational mode in which the failure is assumed to occur.
These elements can be addressed in the header of the worksheet. Although all of the different mission phases or operational modes are taken into account, the record of results is limited to the phase or mode in which the worst failure effects occur.
Failure effects
The FMEA worksheet shall contain the identification of the consequences of each assumed failure mode at local effects and end effects levels.
- 1 Local effects
Local effects concentrate specifically on the impact of the failure mode on the operation, function, or status of the item identified in the second column of the worksheet. The local effects are recorded when different from the failure modes.
The purpose of defining local effects is to provide a basis for evaluating compensating provisions and for recommending corrective actions.
- 2 End effects
End effects define the effect that the analysed failure mode has on the operation, function, or status of the product under investigation and its interfaces, such that it allows integration into the next higher level FMEA.
Severity classification
The FMEA worksheet shall contain the severity classification category assigned to each failure mode according to the worst potential end effect of the failure (see clause 4.1 and 4.2).
Failure detection method - Observable symptoms
The FMEA worksheet shall identify the failure detection method and the observable symptoms.
The failure detection means include telemetry (exact label), visual or audible warning devices, sensing instrumentation, other unique indications (e.g. the failure effect itself), or none.
Compensating provisions
The FMEA worksheet shall identify the existing compensating provisions, such as design provisions or operator actions, which circumvent or mitigate the effect of the failure.
- 1 Design provisions
Compensating provisions are considered design provisions when they feature a design that nullifies the effects of a malfunction or failure, controls or deactivates product items to halt generation or propagation of failure effects, or activates backup or standby items. Design compensating provisions include:
- redundant items or alternative modes of operation that allow continued and safe operation, and
- safety or relief devices which allow effective operation or limit the failure effects.
- 2 Operator actions
Compensating provisions are considered operator actions when the operator circumvents or mitigates the effect of the postulated failure mode.
Recommendations
Recommendations for corrective actions shall be noted. Each recommendation shall have an unambiguous identifier for tracking purposes.
Remarks
The FMEA worksheet shall contain any pertinent remarks relevant to and clarifying any other column in the worksheet line.
Example of FMEA worksheet
Figure B-1 gives an example of FMEA worksheet.
Failure Modes and Effects Analysis (FMEA)
| Product: | System: | Subsystem: | Equipment: |

| Ident. number | Item/block | Function | Failure mode | Failure cause | Phase/Op. mode | Failure effects: a. Local effects, b. End effects | Severity classification | Failure detection method/observable symptoms | Compensating provisions | Recommendations | Remarks |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | |
Figure B-1: Example of FMEA worksheet
Annex C (normative) FMECA worksheet – DRD
DRD identification
Requirement identification and source document
This DRD is called from ECSS-Q-ST-30-02, requirement Annex A.2.1<14>a.
Purpose and objective
The purpose of the FMECA worksheet is to document the analysis performed in a tabular form.
Expected response
Scope and content
General
The FMECA worksheet shall provide the data elements identified in the FMEA worksheet of Annex B.2.1.
Severity number
The FMECA worksheet shall contain the severity number (SN) assigned to each assumed failure mode.
The SNs applied to the different severity categories are given in Table 5-1.
Failure mode probability
The FMECA worksheet shall contain an assessment of the probability of occurrence of the assumed failure mode and the relevant probability number (PN).
The PNs applied to the different probability levels are given in Table 5-2.
Criticality number
The FMECA worksheet shall contain a criticality number (CN) assigned to each assumed failure mode, as per 5.2n.
Example of FMECA worksheets
Figure C-1 and Figure C-2 give examples of FMECA worksheets.
Failure Modes Effects and Criticality Analysis (FMECA)
| Product: | System: | Subsystem: | Equipment: |

| Ident. number | Item/block | Function | Failure mode | Failure cause | Phase/Op. mode | Failure effects: a. Local effects, b. End effects | Severity classification | Failure detection method/observable symptoms | Compensating provisions | Severity Number SN | Probability and PN | Criticality Number CN | Recommendations | Remarks |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | | | | | | | | | | | |
Figure C-1: Example 1 of FMECA worksheet
Failure Modes Effects and Criticality Analysis (FMECA)
| Product: | System: | Subsystem: | Equipment: |
| Number: | Item/block: |
| Function: |
| Failure mode: |
| Failure cause: |
| Phase/Operational mode: |
| Failure effects: a. Local effects |
| Severity classification |
| Failure detection method/Observable symptoms |
| Compensating provisions: |
| Severity Number SN: | Probability and PN: | Criticality Number CN: |
| Recommendations: |
| Remarks: |
Figure C-2: Example 2 of FMECA worksheet
Annex D (normative) HSIA form – DRD
DRD identification
Requirement identification and source document
This DRD is called from ECSS-Q-ST-30-02, requirement 7.2h.
Purpose and objective
The purpose of the HSIA form is to document the analysis performed in a tabular form.
The HSIA check list is an aid for performing the analysis, see Annex I.
Expected response
Scope and content
Subsystem or equipment
The HSIA form shall contain the identification of subsystem or equipment submitted to HSIA.
HSIA sheet number
The HSIA form shall contain the HSIA running sheet number.
FMEA/FMECA reference
The HSIA form shall contain the identification of the reference number of the failure mode in the design FMEA/FMECA.
Failure mode
The HSIA form shall contain a summary of the failure mode description.
RB/TS reference
The HSIA form shall contain a reference to the software specification used for the HSIA (number, issue).
Identification of parameters used to trigger the software
The HSIA form shall contain identification of the information processed by the software to notify the presence of the failure or initiate an isolation or corrective action in response.
The HSIA form shall contain the identification of corresponding health signal (health signal = result of comparison between detected and reference values).
RB/TS requirement number for S/W triggering
The HSIA form shall contain the requirement number in the RB/TS corresponding to the information at D.2.1<6>.
Description of software (S/W) action
The HSIA form shall contain a summary of the actions specified in RB/TS which are provided to negate the effects of or isolate the failure (isolation/recovery).
RB/TS requirement number for S/W action
The HSIA form shall contain the requirement number in the RB/TS corresponding to the information at D.2.1<8>.
Description of the effect of the S/W action on the product functionality
The HSIA form shall contain a summary of the effects of the actions taken by S/W (as described in RB/TS) on the functions of the product and on interfacing items.
Identified adverse effects on hardware (H/W)
The HSIA form shall contain a description of any identified adverse effect.
Examples of adverse effects are overstress of H/W, or failure propagation.
Assessment of the S/W action
The HSIA form shall contain an assessment of the S/W action.
The answer “yes” summarizes that the S/W action on the product functionality is conforming to the FDIR requirements (where applicable) and the S/W action is acceptable for the product functioning. In case of answer “no”, recommendations are reported in D.2.1<13>.
Recommendations
The HSIA form shall contain recommendations in case of insufficient S/W actions or in case of adverse effect on H/W.
Remarks:
The HSIA form shall contain any additional remark where relevant.
Example of HSIA form
Figure D-1 gives an example of HSIA form.
HARDWARE-SOFTWARE INTERACTION ANALYSIS (HSIA)
| 1. Subsystem/Equipment: | 2. HSIA sheet number: |
| 3. FMEA/FMECA reference: | 4. Failure mode: |
| 5. RB/TS reference: |
| 6. Identification of parameters used to trigger the S/W action: | 7. RB/TS requirement number for S/W triggering: |
| 8. Description of S/W action: | 9. RB/TS requirement number for S/W action: |
| 10. Description of the effects of the S/W action on the H/W: | 11. Identified adverse effect on H/W |
| 12. Assessment of S/W action: Is the S/W action as expected? yes/no |
| 13. Recommendations: |
| 14. Remarks |
Figure D-1: Example of HSIA form
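For tool support, an HSIA sheet can be held as a simple record with a completeness check. The Python sketch below assumes a reduced set of fields and hypothetical names and sample values. It encodes only the rule that recommendations are expected when the S/W action is assessed "no" or when an adverse effect on H/W is identified.

```python
from dataclasses import dataclass


@dataclass
class HsiaSheet:
    """Reduced HSIA record (field names are assumptions for this sketch)."""
    sheet_number: int
    fmea_reference: str
    failure_mode: str
    trigger_parameters: str
    sw_action: str
    adverse_effects_on_hw: str = ""
    sw_action_as_expected: bool = True  # point 12 of the form: yes/no
    recommendations: str = ""


def open_points(sheet: HsiaSheet) -> list:
    """Return the open points found on one HSIA sheet."""
    points = []
    needs_recommendation = (
        not sheet.sw_action_as_expected or bool(sheet.adverse_effects_on_hw)
    )
    if needs_recommendation and not sheet.recommendations:
        points.append("recommendation missing")
    return points


sheet = HsiaSheet(
    sheet_number=1,
    fmea_reference="FM-003",           # hypothetical FMEA identifier
    failure_mode="relay stuck open",
    trigger_parameters="BUS_VOLTAGE_TM below limit",
    sw_action="switch to redundant relay",
    sw_action_as_expected=False,
)
print(open_points(sheet))
```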
HSIA integrated in FMEA/FMECA worksheet
In case the HSIA is provided inside the FMEA/FMECA, the FMEA/FMECA worksheet shall be completed as follows:
- add S/W Specification reference in the Reference document;
- in each completed column: for each failure mode where software is involved enter “S/W”;
- local/end effect: add points 10 and 11 of HSIA form;
- failure detection: add points 6 and 7 of HSIA form;
- recovery or compensation: add points 8 and 9 of HSIA form;
- recommendation: add points 12 and 13 of HSIA form.
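The mapping above between HSIA form points and FMEA/FMECA worksheet columns can be expressed as a lookup table. The following Python sketch merges HSIA entries into a worksheet row, marking the added text with "S/W". The dictionary keys, the separator and the sample requirement number are assumptions made for illustration.

```python
# HSIA form points to add to each FMEA/FMECA worksheet column.
HSIA_TO_FMEA_COLUMN = {
    "local/end effect": (10, 11),
    "failure detection": (6, 7),
    "recovery or compensation": (8, 9),
    "recommendation": (12, 13),
}


def merge_hsia(fmea_row: dict, hsia_points: dict) -> dict:
    """Return a copy of the worksheet row with the HSIA points added."""
    merged = dict(fmea_row)
    for column, points in HSIA_TO_FMEA_COLUMN.items():
        extra = [hsia_points[p] for p in points if hsia_points.get(p)]
        if extra:
            parts = [merged[column]] if merged.get(column) else []
            parts.append("S/W: " + "; ".join(extra))
            merged[column] = " | ".join(parts)
    return merged


row = {"failure detection": "TM out of range"}
hsia = {6: "HTR_TEMP_TM above limit", 7: "RB-123"}  # hypothetical entries
print(merge_hsia(row, hsia)["failure detection"])
```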
Annex E (normative) Process FMECA report – DRD
Requirement identification and source document
This DRD is called from ECSS-Q-ST-30-02, requirement 8.3a.
Purpose and objective
The purpose of the Process FMECA report is to document the analysis performed.
Expected response
Scope and content
Cover sheet
The Process FMECA report shall contain the title of the analysis and reference number, issue, revision and date, supplier sign-off date, and the names and signatures of the analyst(s) and the approval authority.
Documents
The Process FMECA report shall list the applicable and reference documents, including applicable procedure, design reference, lower level supplier analyses, used in the preparation of the process FMECA.
Description of the analysed process
The Process FMECA report shall describe the analysed process.
Process FMECA worksheets
The Process FMECA report shall contain the Process FMECA worksheets in accordance with Annex F.
List of recommendations for improvement
The Process FMECA report shall list recommendations for improvement.
Follow-on actions
The Process FMECA report shall include the follow-on actions to be presented to the project team responsible for final decisions.
- 1 The follow-on actions (references for implementation, rejection, or analysis of alternative recommendations) apply to the updates of the report.
- 2 In the case where company “CONFIDENTIAL” processes are documented, the report can be split into:
- a summary report including recommendations and unacceptable points (to be submitted to the customer);
- the detailed process FMECA worksheets (company confidential).
- 3 See also clause 8.6 about “Follow-on actions”.
Special remarks
None
ANNEX (normative) Process FMECA worksheet – DRD
Requirement identification and source document
This DRD is called from ECSS-Q-ST-30-02, requirement 8.3b and Annex E.2.1<4>a.
Purpose and objective
The purpose of the Process FMECA worksheet is to document the analysis performed in a tabular form.
Expected response
Scope and content
Worksheet header
The Process FMECA worksheet shall identify the:
- Analysed process,
- System, subsystem, and equipment.
Identification number
The Process FMECA worksheet shall contain the identification number for traceability purpose.
Item
The Process FMECA worksheet shall contain the identification of the individual process step.
Description
The Process FMECA worksheet shall contain the description of the process step.
Failure mode/failure cause
The Process FMECA worksheet shall contain the description of the assumed process step failure mode together with its causes.
Failure effects
Depending on the process analyzed and on the purpose of the analysis, the Process FMECA worksheet shall contain the description of all possible effects of the assumed failure modes on:
- Safety,
- Product (i.e. final result of the process),
- Process,
- Programmatic
For example, impact on costs or schedule.
- Others, if any
For example, company image.
Detection means
The Process FMECA worksheet shall contain the description of the existing means and methods by which the effects can be detected.
Existing preventive or compensatory provisions
The Process FMECA worksheet shall contain the description of the existing preventive or compensatory provisions to prevent the failure mode, to reduce its effects, or to reduce its probability of occurrence.
Severity
The Process FMECA worksheet shall contain the identification of the severity of failure effect by assigning a severity number (SN) according to a table agreed with the customer.
Occurrence
The Process FMECA worksheet shall contain the identification of the probability of occurrence of the failure mode by assigning a probability number (PN) according to a table agreed with the customer.
Detection
The Process FMECA worksheet shall contain the identification of the probability of detection of the failure mode by assigning a detection number (DN) according to a table agreed with the customer.
Criticality
The Process FMECA worksheet shall contain the criticality number (CN).
Recommendations and remarks
The Process FMECA worksheet shall contain a description of the recommended preventive or compensatory provisions to eliminate the failure mode, to reduce its effects, to reduce its probability of occurrence, or to improve its detectability, as well as any additional information being useful.
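For illustration, one worksheet line with the fields described above can be sketched as a Python data class. The class and attribute names are assumptions made for this example. The sketch also assumes that the criticality number is the product CN = SN × PN × DN, a common process FMECA convention; the applicable definition and the SN, PN and DN tables are those agreed with the customer.

```python
from dataclasses import dataclass


@dataclass
class ProcessFmecaRow:
    """One line of a Process FMECA worksheet (illustrative field names)."""

    ident_number: str             # identification number, for traceability
    item: str                     # individual process step
    description: str              # description of the process step
    failure_mode_and_cause: str   # assumed failure mode together with its causes
    failure_effects: dict         # e.g. keys: safety, product, process, programmatic, others
    detection_means: str          # existing means and methods of detection
    existing_provisions: str      # existing preventive or compensatory provisions
    severity_sn: int              # severity number (SN)
    occurrence_pn: int            # probability number (PN)
    detection_dn: int             # detection number (DN)
    recommendations: str = ""     # recommendations and remarks

    @property
    def criticality_cn(self) -> int:
        # Assumption: CN is the product of SN, PN and DN.
        return self.severity_sn * self.occurrence_pn * self.detection_dn
```

Deriving CN from the three numbers, rather than storing it, keeps the worksheet consistent when a recommendation changes SN, PN or DN.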
Example of Process FMECA worksheet
An example of a Process FMECA worksheet is given in Figure F-1.
|
Process Failure Modes, Effects and Criticality Analysis (FMECA) |
|||||||||||
|
Analysed process:
|
System:
|
Subsystem:
|
Equipment:
| ||||||||
|
Ident. number |
Item |
Description |
Failure mode/ Failure cause |
Failure effects: a) safety b) product c) process d) programmatic e) others |
Detection means |
Existing preventive or compensatory provisions |
Severity SN |
Occurrence PN |
Detection DN |
Criticality CN |
Recommendations and remarks |
Figure: Example of process FMECA
ANNEX (informative) Parts failure modes (space environment)
Table: Example of parts failure modes
|
01. CAPACITORS (family/group 01 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
01 01 ceramic; 01 02 ceramic chip; 01 04 tantalum non-solid; 01 06 glass; 01 07 mica; 01 09 aluminium solid; 01 10 feedthrough; 01 11 semiconductor
|
OC; SC
|
|
||||
|
01 03 tantalum solid
|
OC; SC
|
Depending on leakage value, final effect can be either short circuit or open circuit (in case of overheating and burst)
| ||||
|
01 05 plastic metallized
|
(epsilon)
|
For self-healing capacitor (typical PM94, PM96, PM90, …) the short circuit is considered in the FMEA/FMECA (for traceability aspects). The minimum self-healing energy is indicated in the FMEA/FMECA.
| ||||
|
02. CONNECTORS (family/group 02 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
02 01 circular; 02 02 rectangular; 02 03 printed circuit board; 02 05 RF coaxial; 02 06 glassfibre; 02 07 microminiature; 02 08 RF filter; 02 09 rack and panel
|
Any single pin OC; Connector disconnection (1)
|
(1): The number of critical connectors (i.e. the demating of which has critical effects on the mission) is minimized by design. A specific analysis is performed for identifying critical connectors. Connector disconnection is considered as a not credible failure in flight providing a locking device exists and verification of locking is performed during AIT. An appropriate justification is provided.
| ||||
|
03. PIEZO-ELECTRIC DEVICES (family/group 03 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
03 01 crystal resonator
|
OC (no clock signal); Frequency drift
|
Drift means a deviation beyond the specified worst-case range
| ||||
|
04. DIODES (family/group 04 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
04 01 switching; 04 02 rectifier; 04 03 voltage regulator; 04 04 voltage reference/zener; 04 05 RF/microwave schottky (Si); 04 06 pin; 04 07 hot carrier; 04 08 transient suppression; 04 09 tunnel; 04 10 high voltage rectifier; 04 11 microwave varactor (GaAs); 04 12 step recovery; 04 13 RF/microwave varactor (Si); 04 14 current regulator; 04 15 microwave schottky (GaAs); 04 16 RF/microwave pin; 04 17 microwave gunn (GaAs)
|
Any single pin Any single terminal SC to structure
|
Terminal means pin or component case (if any). It is important to consider SC between terminal and structure according to technology for diodes directly mounted on the structure.
| ||||
|
05. FILTERS (family/group 05 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
05 01 feedthrough; 05 02 diplexers
|
Any single pin Any single terminal SC to structure
|
It is important to consider SC between terminal and structure according to technology
| ||||
|
06. FUSES (family/group 06 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
06 01 all
|
OC
|
Glass fuses are generally forbidden with the exception of wire link fuses
| ||||
|
07. INDUCTORS (family/group 07 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
07 01 RF coil; 07 02 cores; 07 03 chip
|
between terminals; SC between turns; Any single terminal SC to core or structure
|
SC between terminals or turns to be considered except where specific provisions other than enamel are taken (e.g. specifically insulated wire, kapton layer or specific design rules). It is important to consider SC between terminal and core or structure according to technology for inductors mounted directly on the structure. Breaking of the magnetic core is assimilated to SC and is considered except where specific provisions are taken (e.g. potting).
| ||||
|
08. MICROCIRCUITS (family/group 08 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
08 10 microprocess/microcontrol/peripher; 08 20 memory SRAM; 08 21 memory DRAM; 08 22 memory PROM; 08 23 memory EPROM; 08 24 memory EEPROM; 08 29 memory others; 08 30 programmable logic; 08 40 ASIC technologies digital
|
Any single output SC to V+/V-; Any single output stuck to 0/1; Any single output in high impedance; Any single input SC to V+/V-; Any single input SC to 0/1; OC of any single power supply; V+ to V- SC; SEP; Any single functional failure
|
OC of any single power supply including ground pin. For complex ICs (ASIC, FPGA, µP, …), a functional FMEA/FMECA is performed taking into account the physical implementation.
| ||||
|
09. RELAYS (family/group 09 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
09 01 non latching; 09 02 latching
|
Relay stuck in one position
|
See details in Figure G-1, Figure G-2, Figure G-3 hereafter.
| ||||
|
10. RESISTORS (family/group 10 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
10 01 metal oxide; 10 05 composition; 10 07 shunt; 10 08 metal film; 10 10 network (all); 10 11 heater, flexible
|
OC
|
For film network the open circuit of the common connection is considered
| ||||
|
10 09 chip (all)
|
OC
|
No short circuit is considered possible for sizes 1206 or larger
| ||||
|
10 02 wirewound precision (including surface mount); 10 03 wirewound chassis mounted
|
between terminals (epsilon)
|
|
||||
|
10 04 variable (trimmer)
|
between terminals
|
|
||||
|
11. THERMISTORS (family/group 11 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
11 01 temperature compensating; 11 02 temperature measuring; 11 03 temperature sensor
|
between terminals
|
|
||||
|
12. TRANSISTORS (family/group 12 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
12 01 low power, NPN (< 2 W); 12 02 low power, PNP (< 2 W); 12 03 high power, NPN (> 2 W); 12 04 high power, PNP (> 2 W); 12 05 FET N channel; 12 06 FET P channel; 12 08 multiple; 12 09 switching; 12 10 RF/microwave NPN low power/low noise; 12 11 RF/microwave PNP low power/low noise; 12 12 RF/microwave FET N-channel/P-channel; 12 13 RF/microwave bipolar power; 12 14 RF/microwave FET power (Si); 12 15 microwave power (GaAs); 12 16 microwave low noise (GaAs); 12 17 chopper
|
Any single terminal between any two terminals
|
SC between terminal and structure are considered according to technology for transistors mounted directly on the structure
| ||||
|
13. WIRES AND CABLES (family/group 13 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
13 01 low frequency; 13 02 coaxial; 13 03 fiber optic
|
OC; SC
|
SC to be considered except in case of double insulation
| ||||
|
14. TRANSFORMERS (family/group 14 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
14 01 power; 14 02 signal
|
Any single terminal primary/secondary SC +/- primary
|
SC between terminals or turns to be considered except where specific provisions other than enamel are taken (e.g. specifically insulated wire, kapton layer or specific design rules). SC between terminal and core or structure are considered according to technology for transformers mounted directly on the structure. Breaking of the magnetic core is assimilated to SC and is considered except where specific provisions are taken (e.g. potting).
| ||||
|
16. SWITCHES (family/group 16 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
16 01 standard DC/AC power toggle; 16 02 circuit breaker; 16 03 RF-switch; 16 04 microswitch; 16 05 reed switch
|
between terminals
|
Failure modes considered are reported and justified along with a description of the component and of its application
| ||||
|
18. OPTO-ELECTRONICS (family/group 18 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
18 01 optocoupler; 18 03 phototransistor; 18 06 charge-coupled device (CCD); 18 07 LCD display/screen
|
Diode OC; Transistor OC; SC between diode terminals; SC between transistor terminals; SC between any two diode and transistor terminals
|
SC between diode and transistor terminals are considered according to technology (epsilon for 3C91). This information should be contained in the optocoupler procurement specification.
| ||||
|
18 02 LED; 18 04 photo diode/sensor; 18 05 laser diode
|
between terminals
|
|
||||
|
19. THYRISTORS (family/group 19 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
19 01 all
|
between any two terminals; SC between any single terminal and structure
|
SC between terminal and structure are considered according to technology
| ||||
|
20. THERMOSTAT (family/group 20 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
20 01 all
|
Blocked Open
|
It is important to consider SC between contact terminal and structure according to technology
| ||||
|
23. LAMP (family/group 23 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
23 01 all
|
TBD
|
It is important to report the considered failure modes and justify them along with a description of the component and of its application
| ||||
|
27. FIBEROPTIC COMPONENTS (family/group 27 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
27 01 fibre/cable; 27 02 connector; 27 03 isolator; 27 04 switch
|
OC; Transmission performance drift
|
|
||||
|
30. RF PASSIVE COMPONENTS (family/group 30 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
30 01 coaxial couplers; 30 06 waveguide components; 30 07 isolator/circulator; 30 09 coaxial power dividers; 30 10 coaxial attenuators/loads
|
- Open Circuit of an access or connection
|
It is important to report the considered failure modes and justify them along with a description of the component and of its application
| ||||
|
31. (family/group 31 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
31 01 all
|
Cell between terminals of any single cell; Cell rupture; Cell leakage
|
|
||||
|
32. PYROTECHNICAL DEVICES (family/group 32 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
32 01 initiators; 32 02 cutters
|
between terminals; Any single terminal SC to structure
|
Failure modes considered are reported and justified along with a description of the component and of its application
| ||||
|
40. HYBRIDS (family/group 40 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
40 01 thick film; 40 02 thin film
|
OC
|
Failure modes of components when viewed as discrete parts
| ||||
|
40 03 crystal oscillators
|
OC; Frequency drift
|
|
||||
|
99. MISCELLANEOUS PARTS (family/group 99 xx)
| ||||||
|
Type
|
Failure modes
|
Remarks
| ||||
|
99 01 all
|
TBD
|
Failure modes considered are reported and justified along with a description of the component and of its application
| ||||
|
Heater
|
OC, including heater delamination (for thermofoil); SC between terminals; Any single terminal SC to structure; SC between any two terminals of redundant lines
|
SC between terminal and structure are considered according to technology. SC between redundant line terminals are considered according to technology. SC between redundant lines at intermediate points not considered because of application of specific design rules. Specific design rules to be formulated or referred.
| ||||
|
Heat pipe
|
Rupture; Leakage; Insufficient thermal transfer
|
|
||||
|
Solar Cell (Si or GaAs)
|
- Short Circuit
|
- Total or partial surface loss; low probability of occurrence
| ||||
|
All pressurized elements (tank, tubing, welded & screwed connections, filter, valve, regulator, pressure transducer, ...)
|
- Rupture
|
Failure mode to be confirmed by the supplier. The stuck open failure and leakage of both propellants have a very low probability of occurrence
| ||||
|
Pressure transducer
|
- Incorrect measurement
|
|
||||
|
Filter
|
- Clogging
|
|
||||
|
Pyrotechnic valve, Electro valve (isolation)
|
- Internal leakage
|
|
||||
|
Bi-propellant thruster valve
|
- Internal leakage
|
|
||||
|
Pressure regulator
|
- High output pressure
|
- Compared to normal pressure
| ||||
|
Non-return valve
|
- Internal leakage
|
|
||||
|
Fill and Drain valve
|
- Rupture
|
|
||||
|
Non Explosive Actuators
|
between terminalsAny single terminal SC to structure
|
Failure modes considered are reported and justified along with a description of the component and of its application
| ||||
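When the table above is used as input to an analysis tool, it can be held as a simple lookup keyed by family/group code. The following Python sketch is illustrative only: the names `PARTS_FAILURE_MODES` and `failure_modes_for` are assumptions, and only a few rows are transcribed as an excerpt. The complete table remains the reference.

```python
# Excerpt of the parts failure modes table, keyed by family/group code.
# Names and layout are illustrative assumptions; only a few rows are shown.
PARTS_FAILURE_MODES = {
    "03 01": ["OC (no clock signal)", "Frequency drift"],  # crystal resonator
    "06 01": ["OC"],                                        # fuses, all
    "09 01": ["Relay stuck in one position"],               # relay, non latching
    "09 02": ["Relay stuck in one position"],               # relay, latching
    "10 09": ["OC"],                                        # resistor, chip (all)
    "40 03": ["OC", "Frequency drift"],                     # hybrid crystal oscillators
}


def failure_modes_for(part_code):
    """Return the failure modes to analyse for a code such as '06 01'."""
    key = " ".join(part_code.split())  # normalise whitespace between the two fields
    try:
        return PARTS_FAILURE_MODES[key]
    except KeyError:
        # An unknown code is reported explicitly rather than treated as "no failure modes".
        raise KeyError(f"no failure modes recorded for part code {key!r}") from None
```

Raising an error for an unlisted code avoids silently omitting a part from the analysis.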
The following table and figures identify the failure modes that are analysed for relays.
Table: Example of relay failure modes
|
Failure modes
|
Mono-stable relays (type J412, T12, GP5 or equivalent)
|
Bi-stable relays (type J422, TL12, GP250 or equivalent)
|
Bi-stable relays (type EL210 or equivalent)
|
Bi-stable relays (type GP3 or equivalent)
|
|
Relay stuck in OFF position:
|
|
|
|
|
|
- coil Open Circuit
|
A
|
A
|
A
|
A
|
|
- contact stuck OFF
|
A
|
A
|
A
|
A
|
|
Relay stuck in ON position:
|
|
|
|
|
|
- coil Open Circuit
|
N/A
|
A
|
A
|
A
|
|
- contact stuck ON
|
A
|
A
|
A
|
A
|
|
Coil short circuit
|
N/A
|
N/A
|
N/A
|
N/A
|
|
2 open contacts (relay stuck in intermediate position)
|
N/A
|
A (2)
|
N/A
|
A (1)
|
|
2 contacts in opposite positions
|
A (1)
|
A (1)
|
N/A
|
A (1)
|
|
Short circuit between fixed contacts
|
A (1)
|
A (1)
|
N/A
|
A (1)
|
|
Short circuit between coil and one contact
|
A (1)
|
A (1)
|
N/A
|
A (1)
|
|
(1): Negligible probability of occurrence. To be considered in the FMECA for traceability aspects.
| ||||
|
A: applicable
| ||||
Figure: Two open contacts (relay stuck in intermediate position)
Figure: Two contacts in opposite positions
Figure: Short circuit between fixed contacts
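The relay table above can be read as an applicability matrix. The Python sketch below encodes it as transcribed and returns the failure modes to be analysed for a given relay type. The short relay type keys and the function name `modes_to_analyse` are assumptions made for this example. Entries marked “A (1)” are kept as applicable because they are still recorded in the FMECA for traceability.

```python
# Column order follows the table: mono-stable (J412, T12, GP5), bi-stable (J422, TL12, GP250),
# bi-stable (EL210), bi-stable (GP3). The short keys are illustrative assumptions.
RELAY_TYPES = ("mono_stable", "bi_stable_j422", "bi_stable_el210", "bi_stable_gp3")

# Cell values are copied from the table: "A" applicable, "N/A" not applicable,
# "A (1)" applicable with negligible probability of occurrence.
RELAY_FAILURE_MODES = {
    "stuck OFF: coil open circuit": ("A", "A", "A", "A"),
    "stuck OFF: contact stuck OFF": ("A", "A", "A", "A"),
    "stuck ON: coil open circuit": ("N/A", "A", "A", "A"),
    "stuck ON: contact stuck ON": ("A", "A", "A", "A"),
    "coil short circuit": ("N/A", "N/A", "N/A", "N/A"),
    "2 open contacts (intermediate position)": ("N/A", "A (2)", "N/A", "A (1)"),
    "2 contacts in opposite positions": ("A (1)", "A (1)", "N/A", "A (1)"),
    "short circuit between fixed contacts": ("A (1)", "A (1)", "N/A", "A (1)"),
    "short circuit between coil and one contact": ("A (1)", "A (1)", "N/A", "A (1)"),
}


def modes_to_analyse(relay_type):
    """Return the failure modes marked applicable for the given relay type."""
    column = RELAY_TYPES.index(relay_type)  # raises ValueError for an unknown type
    return [
        mode
        for mode, cells in RELAY_FAILURE_MODES.items()
        if cells[column].startswith("A")
    ]
```

For example, `modes_to_analyse("bi_stable_el210")` returns only the four “stuck” modes, since every other row is N/A for that relay type.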
ANNEX (informative) Product design failure modes check list
Table: Example of a product design failure modes checklist for electromechanical electrical equipment or assembly or subsystems
|
Design failure modes
|
yes/no
|
|
Pin, wire sizing and PCB tracks not compatible with the overcurrent protection.
|
|
|
Mismating of adjacent connectors.
|
|
|
Connectors not used in flight configuration do not have flight qualified protection covers.
|
|
|
Power supply lines and data lines mixed in the same connector or harness.
|
|
|
Pyrotechnic lines and other lines mixed in the same connector or harness.
|
|
|
More than one wire per crimped connection.
|
|
|
Connectors not clearly labelled.
|
|
|
Harness, connectors and tie points shared in common by otherwise redundant paths.
|
|
|
Not every box or assembly has an external safety grounding stud.
|
|
|
Vent hole sizing not adequate.
|
|
|
Inadequate hermeticity for sealed devices.
|
|
|
Box or assembly attachment foot and bolt are not freely accessible for the associated tools.
|
|
|
PCB traces not properly derated.
|
|
|
Excessive fanout and fanin between interfacing PCBs or components.
|
|
|
Multiple functions performed by a single EEE part (e.g. redundant paths in one IC, a single multipole relay carrying redundant functions, redundancy paths integrated into a common multilayer PCB).
|
|
|
A sensing element is used in both control and monitoring.
|
|
|
Adjacent parts not spaced enough to preclude short circuit, stray capacitance or excessive thermal conduction.
|
|
|
Insufficient thermal isolation between redundant parts.
|
|
|
Thermal coupling between high dissipation and heat sensitive elements.
|
|
|
Hot spots.
|
|
|
Not all conductive surfaces are grounded.
|
|
|
Contact between metals with electrochemical potentials > 0,5 V.
|
|
|
Telecommands and telemetries are mapped so their sets of addresses are separated by at least two bits (critical telecommands or telemetries).
|
|
ANNEX (informative) HSIA check list
|
HARDWARE-SOFTWARE INTERACTION ANALYSIS (HSIA)
| |||
|
Subsystem:
|
FMEA/FMECA number:
| ||
|
Item:
|
Failure mode:
| ||
|
No.
|
Question
|
yes/no
| |
|
1a
|
Does the information provided to the software and its processing cause the presence of a failure to be passed to the software or initiate a corrective action in response?
|
|
|
|
1b
|
If the answer to 1a is “no”, does the hardware provide the information that the software can use to detect the failure?
|
|
|
|
1c
|
Are the answers to 1a and 1b consistent with the FMEA/FMECA analysis of observable symptoms?
|
|
|
|
2a
|
Does the software take action to negate the effects of the failure?
|
|
|
|
2b
|
If the answer to 2a is “no”, does the capability exist for the software to compensate for this failure mode?
|
|
|
|
3
|
As a result of this failure mode, can the software cause the hardware to be overstressed, or induce another failure?
|
|
|
|
4
|
Can this failure mode, in combination with software logic, adversely affect other functions?
|
|
|
|
5
|
What are the failure tolerance characteristics of the design regarding this failure mode (take into account ground or crew intervention, or software compensation); how many failures can be tolerated? (1 2 3)*
|
|
|
|
6
|
If ground or crew action is required to respond to this failure mode, is telemetry, or signal, provided to indicate the need for intervention?
|
|
|
|
7
|
Is the response time limited by mission success factors?
|
|
|
|
Change/Retention rationale summary
| |||
|
1. No H/W or S/W issues:
|
2. H/W accepts risk:
| ||
|
(crew or ground operators)
|
4. Detection during check-out:
| ||
|
5. Acceptance rationale:
|
6. Recommendations:
| ||
|
7. FMEA/FMECA change recommended:
| |||
|
Comments:
| |||
|
* circle number
| |||
Figure: Example of HSIA checklist
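Questions 1b and 2b of the checklist are only relevant when the answer to 1a or 2a, respectively, is “no”. As an illustration, this dependency can be sketched in Python as a completeness check. The function name `open_questions` and the representation of the answers as a dictionary are assumptions made for this example.

```python
def open_questions(answers):
    """Return the HSIA checklist questions that still need an answer.

    answers: dict mapping question number (e.g. "1a") to the recorded answer,
    for example "yes" or "no" (question 5 records the number of tolerated failures).
    """
    required = ["1a", "1c", "2a", "3", "4", "5", "6", "7"]
    # 1b is asked only if the failure is not passed to the software (1a is "no").
    if answers.get("1a") == "no":
        required.append("1b")
    # 2b is asked only if the software takes no action against the failure (2a is "no").
    if answers.get("2a") == "no":
        required.append("2b")
    return sorted(question for question in required if question not in answers)
```

An empty result indicates that every applicable question has been answered for the failure mode under review.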