
Space product assurance
Availability analysis
Foreword
This Standard is one of the series of ECSS Standards intended to be applied together for the management, engineering and product assurance in space projects and applications. ECSS is a cooperative effort of the European Space Agency, national space agencies and European industry associations for the purpose of developing and maintaining common standards. Requirements in this Standard are defined in terms of what shall be accomplished, rather than in terms of how to organize and perform the necessary work. This allows existing organizational structures and methods to be applied where they are effective, and for the structures and methods to evolve as necessary without rewriting the standards.
This Standard has been prepared by the ECSS-Q-ST-30-09C Working Group, reviewed by the ECSS Executive Secretariat and approved by the ECSS Technical Authority.
Disclaimer
ECSS does not provide any warranty whatsoever, whether expressed, implied, or statutory, including, but not limited to, any warranty of merchantability or fitness for a particular purpose or any warranty that the contents of the item are error-free. In no respect shall ECSS incur any liability for any damages, including, but not limited to, direct, indirect, special, or consequential damages arising out of, resulting from, or in any way connected to the use of this Standard, whether or not based upon warranty, business agreement, tort, or otherwise; whether or not injury was sustained by persons or property or otherwise; and whether or not loss was sustained from, or arose out of, the results of, the item, or any services that may be provided by ECSS.
Published by: ESA Requirements and Standards Division
ESTEC,
2200 AG Noordwijk
The Netherlands
Copyright: 2008 © by the European Space Agency for the members of ECSS
Change log
| ECSS-Q-30-09A | First issue |
| ECSS-Q-30-09B | Never issued |
| ECSS-Q-ST-30-09C | Second issue |
Scope
This Standard is part of a series of ECSS Standards belonging to ECSS-Q-ST-30, Space product assurance – Dependability. The present standard defines the requirements on availability activities and provides, where necessary, guidelines to support, plan and implement the activities.
It defines the requirement typology that is followed, with regard to the availability of space systems or subsystems in order to meet the mission performance and needs according to the dependability and safety principles and objectives.
This Standard also describes the process that is followed and the most significant methodologies for the availability analysis to cover such aspects as
- evaluation of the space element or system availability figure,
- allocation of the requirement at lower level, and
- outputs to be provided.
This Standard applies to all elements of a space project (flight and ground segments), where availability analyses are part of the dependability programme, providing inputs for the system concept definition and design development.
The on-ground activities and the operational phases are considered, for availability purposes, in order to
- acquire additional information essential for a better system model finalization and evaluation, and
- monitor the system behaviour to optimize its operational performance and improve the availability model for future applications.
This Standard may be tailored for the specific characteristics and constraints of a space project in conformance with ECSS-S-ST-00.
Normative references
The following normative documents contain provisions which, through reference in this text, constitute provisions of this ECSS Standard. For dated references, subsequent amendments to, or revisions of any of these publications do not apply. However, parties to agreements based on this ECSS Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references the latest edition of the publication referred to applies.
| ECSS-S-ST-00-01 | ECSS system – Glossary of terms |
Terms, definitions and abbreviated terms
Terms from other standards
For the purpose of this Standard, the terms and definitions from ECSS-S-ST-00-01 apply.
Terms specific to the present standard
achieved availability
probability that a system, subsystem or equipment, when used under stated conditions in an ideal support environment operates satisfactorily at a given time
The downtime is associated only with the active preventive and corrective maintenance.
active redundancy
every entity is operating and the system can continue to operate without downtime or defects despite the loss of one or more entities
corrective maintenance
maintenance performed to restore system hardware integrity following anomalies or equipment problems encountered during system operations
flight segment
product or a set of products intended to be operated in space
ground segment
all ground infrastructure elements that are used to support the preparation activities leading up to mission operations, the conduct of mission operations and all post-operational activities
hot redundancy
redundancy entity is “ON”, but not necessarily in the right configuration to accomplish the function
instantaneous availability
<intrinsic or inherent> probability that an item is in a state to perform a required function under given conditions at a given instant in time, assuming that the required external resources are provided
Preventive maintenance is generally not taken into account for intrinsic availability.
instantaneous availability
<operational> probability that an item is in a state to perform a required function under given conditions at a given instant of time, taking into account the maintenance strategy (spares policy and related logistic delays and constraints)
lead time (supplier delay)
mean time for supplier to provide spares (including shipping time)
logistic delay
mean time for human and material maintenance means to be available (callout time)
mean availability
<intrinsic or inherent> percentage of time that a system, subsystem or equipment, used under stated conditions, without any scheduled or preventive action and with ideal logistical support, operates satisfactorily for a defined time period
mean availability
<operational> percentage of a defined time period in which a system, subsystem or equipment operates satisfactorily when used under stated conditions in an actual support environment
The down time is relevant to the corrective maintenance, preventive maintenance, logistic and administrative delays.
mean down time
mean time between service interruption and service resumption
See Figure 3-1.
Figure 3-1: Relations between the various values that characterize the reliability, maintainability and availability of equipment
mean time between failures
mean time between two consecutive failures
mean time between outages
mean time of operation of an entity between two consecutive non-operational phases caused by corrective or preventive maintenance activities
mean time to failure
mean time of working of an entity before its first failure
Also known as “mean time to first failure” (MTTFF).
mean time to outage
mean time of working of an entity before its first outage
mean time to repair
mean duration to repair equipment with human and material maintenance means being available
mean up time
mean time of working of an entity after corrective maintenance (covering repair and replacement)
outage
state of an item of being unable to perform its required function
[IEC Multilingual Dictionary:2001 edition]
- 1 Causes of outages can be failures, upsets or planned and unplanned events.
- 2 The failures can be due to cataleptic intrinsic events or external events.
passive redundancy
redundancy not activated before necessary
Also known as “standby redundancy” or “cold redundancy”.
preventive maintenance
scheduled or on-condition maintenance actions performed on equipment to reduce its probability of failure or degradation
Preventive maintenance is performed to keep the system at designed reliability and safety levels before failure occurrence.
steady-state availability (asymptotic availability)
limit, if any, of the instantaneous availability as time approaches infinity
Abbreviated terms
For the purpose of this Standard, the abbreviated terms from ECSS-S-ST-00-01 and the following apply:
| Abbreviation | Meaning |
| FMECA | failure modes, effects and criticality analysis |
| GPS | global positioning system |
| LD | logistic delay |
| MDT | mean down time |
| MTBF | mean time between failures |
| MTBO | mean time between outages |
| MTTF | mean time to failure |
| MTTFF | mean time to first failure |
| MTTO | mean time to outage |
| MTTR | mean time to repair |
| MUT | mean up time |
| NRB | nonconformance review board |
| PDF | probability density function |
| RAM | reliability, availability and maintainability |
| SOW | statement of work |
| TWT | travelling wave tube |
| w.r.t. | with respect to |
Objectives of availability analysis
The availability analysis is developed in order to
- verify the conformance of the selected system design with the applicable availability requirements, and
- provide inputs to estimate the life cycle cost of the system.
The above design activity leads to the optimization of the system concept definition with respect to design baseline, operations and logistics provisions.
The availability analysis identifies the unavailability contributors in order to quantify their impact in supporting the
- decision making process, and
- risk evaluation, reduction and control (see ECSS-M-ST-80).
The availability activity is fully integrated into the development programme to ensure the correct support to the other disciplines (e.g. engineering, operations and logistics).
Specifying availability and the use of metrics
General
Introduction
The mission success criteria, from a probabilistic point of view, can be established in different ways. As a consequence, the selection of the most adequate dependability requirement depends on all of the operational constraints and mission objectives.
Availability requirements
Availability requirements shall respect the mandatory characteristics defined by the system engineering process.
E.g. traceable, identified, unique, or unambiguous.
For each availability requirement, a verification method shall exist.
Each availability requirement shall be a quantitative requirement.
The process leading to the definition of the availability requirement shall be user oriented (availability of mission service) and not design focused.
The process leading to the definition of the availability requirement shall include the following aspects necessary to characterize the project under development:
- Functional and performance objectives.
For example, what is the “threshold” between nominal behaviour and failure mode? What are the contributors to mission success under system visibility and responsibility?
- “Environmental” conditions.
For example, for which environment, interfaces and provisions shall the above objectives be met?
- Operational time frame.
For example, for which period, at what date.
- Unavailability contributors to be taken into account in the analysis on the basis of the supplier’s visibility and responsibility for the logistic scenario or support.
For example, detection, logistics, and administrative delays.
Availability requirements shall be specified according to one or several of the following classes of availability specifications detailed in clause 5.2.
Different ways of specifying availability
Probability figure convention
For each type of availability requirement, specified figures shall be defined as “mean” or “best estimate” probability figures (point estimation). Unit failure rates are generally computed in this way (or sometimes at 60 % confidence level).
Availability during mission lifetime for a specified service
Overview
Availability during mission lifetime for a specified service is currently used for missions where a “steady-state” nominal service is planned, and for which a percentage of the mission time can be specified as an availability performance measure.
The availability during mission lifetime applies to maintainable, on-ground or in-orbit (e.g. Space Station), and non-maintainable systems (e.g. satellites).
Generic potential contributors to outage periods can be, for instance, maintenance activities (corrective, and preventive as far as nominal service is impacted), periodic manoeuvres, reconfiguration delays for redundant payload, recoveries from safe mode, upsets and eclipses.
In some applications, the mission lifetime can be subdivided into several periods for which the availability requirement applies.
For example, “The system shall be operational during 11 months per year during the mission lifetime”.
Requirements
If the operative scenario duration is much longer than the system or equipment mean down time (more than 5 MDT), so that the instantaneous or mean availability can reach an asymptotic (or steady-state) behaviour, then the requirement shall be formulated in terms of steady-state availability, which gives a simplified (and generally conservative) approach.
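The 5 MDT criterion can be illustrated with a minimal sketch. It assumes a single repairable item with exponentially distributed up and down times that starts in the up state; this model and the numerical values are illustrative assumptions, not part of this Standard. Under these assumptions the gap between the instantaneous and the steady-state availability decays exponentially and is already small after five mean down times.

```python
import math


def instantaneous_availability(t_h, mut_h, mdt_h):
    """A(t) for one repairable item that is up at t = 0.

    Assumes constant failure rate 1/mut_h and constant restoration
    rate 1/mdt_h (illustrative assumption, exponential laws).
    """
    lam, mu = 1.0 / mut_h, 1.0 / mdt_h
    steady = mu / (lam + mu)  # equals MUT / (MUT + MDT)
    transient = (lam / (lam + mu)) * math.exp(-(lam + mu) * t_h)
    return steady + transient


def steady_state_availability(mut_h, mdt_h):
    """Limit of A(t) as t approaches infinity."""
    return mut_h / (mut_h + mdt_h)


# Hypothetical values: MUT = 2000 h, MDT = 24 h.
mut_h, mdt_h = 2000.0, 24.0
for k in (1, 3, 5):
    gap = instantaneous_availability(k * mdt_h, mut_h, mdt_h) \
        - steady_state_availability(mut_h, mdt_h)
    print(f"t = {k} MDT: A(t) - A_steady = {gap:.2e}")
```

Because A(t) stays above its limit in this model, using the steady-state value is conservative, which is consistent with the remark in the requirement.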
The availability during mission lifetime shall be computed as the ratio of time during which service is fulfilled over the total mission lifetime.
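The ratio above can be sketched as follows. The function name, the representation of outages as (start, end) intervals in hours and the sample values are assumptions made for illustration only.

```python
def mission_availability(mission_hours, outages):
    """Ratio of the time during which service is fulfilled to the mission lifetime.

    outages: list of (start_hour, end_hour) service interruptions, assumed
    to be non-overlapping and to lie inside [0, mission_hours].
    """
    if mission_hours <= 0:
        raise ValueError("mission_hours must be positive")
    downtime = 0.0
    for start, end in outages:
        if not 0 <= start <= end <= mission_hours:
            raise ValueError(f"invalid outage interval: {(start, end)}")
        downtime += end - start
    return (mission_hours - downtime) / mission_hours


# Hypothetical 10-year mission with three outages (hours from mission start).
mission_hours = 10 * 8760
outages = [(1000, 1048), (30000, 30012), (61000, 61120)]
print(f"availability = {mission_availability(mission_hours, outages):.5f}")
```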
For non-maintainable systems, the availability during mission lifetime requirements shall be established considering that the mission is still operational at end of life.
For example, no single point failure is considered as an unavailability contributor.
The availability during mission lifetime can take into account radiation effects, such as upsets for logic parts, single event transients (SET) for opto and linear parts, and latch-up.
Functional effects on equipment or subsystems due to radiation single events shall be evaluated to give quantified inputs to availability analyses.
The ECSS-Q-ST-60 branch standard gives a methodology to evaluate the behaviour of electronic parts within their functional conditions.
Availability at a specific time (or time interval) for a specified service
Overview
Availability at a specific time applies mainly for systems where specific critical operations are scheduled in the mission timeline. Typical applications are a launcher control bench availability at a specific time or a scientific satellite with a planned comet rendezvous mission.
Requirements
The availability at a specific time requirement shall address the probability that this “quasi-instantaneous” operation is successfully handled.
For non-maintainable systems, the availability at a specific time requirement shall specify both the availability and the reliability characteristics in a single requirement.
Availability at a specific time shall be computed considering the mission loss probability.
Percentage or number of successfully delivered products
Overview
For some applications, the user-oriented approach characterizes the system in a “black-box” manner and specifies availability according to the number of, for instance, delivered products, services, or mission data with respect to user demands or the nominal scenario.
Requirements
If the availability is specified by a percentage or number of successfully delivered products, the availability requirement shall be expressed as follows:
- A ratio of successfully delivered products over number of requested products.
For instance w.r.t. the applicable criteria for performance, delay, and coverage.
- Cumulated service hours during the mission.
For example, expected number of TWT operating hours from a 12-out-of-16 redundant channel configuration over 10 years.
- Acquired database volume or percentage.
For example, geographical coverage for an Earth observation mission.
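The first form of requirement, a ratio of successfully delivered products over requested products, can be sketched as below. The record fields (`delay_s`, `coverage`) and the acceptance thresholds are hypothetical; the applicable success criteria are those agreed for the project.

```python
from dataclasses import dataclass


@dataclass
class ProductRequest:
    delivered: bool   # product reached the user
    delay_s: float    # delivery delay in seconds (hypothetical criterion)
    coverage: float   # fraction of the requested coverage, 0..1 (hypothetical)


def delivery_ratio(requests, max_delay_s, min_coverage):
    """Successfully delivered products over number of requested products."""
    if not requests:
        raise ValueError("no requests")
    successful = sum(
        1
        for r in requests
        if r.delivered and r.delay_s <= max_delay_s and r.coverage >= min_coverage
    )
    return successful / len(requests)


requests = [
    ProductRequest(True, 120.0, 1.00),
    ProductRequest(True, 900.0, 1.00),   # too late
    ProductRequest(True, 60.0, 0.95),
    ProductRequest(False, 0.0, 0.0),     # never delivered
]
print(delivery_ratio(requests, max_delay_s=600.0, min_coverage=0.9))
```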
Outage probability distribution
In the specific case that the availability is specified by an outage distribution and duration, and if a maximum duration is specified, a probability of exceeding this duration shall be associated.
- 1 This can apply at subsystem level when a short service interruption is masked or filtered by the upper level function.
- 2 For example, typically a GPS receiver temporary outage is tolerated by a navigation model. For this type of application, numerous short outages would be preferable to a few long ones.
If several classes of outage are identified, an availability specified through an outage probability distribution shall be allocated for each class (associated duration and probabilities).
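A requirement of this kind can be checked against observed or simulated outage durations. The sketch below is an empirical estimate only; the function names, the class bounds and the sample durations are illustrative assumptions.

```python
import bisect


def exceedance_probability(durations_h, max_duration_h):
    """Empirical probability that an outage lasts longer than max_duration_h."""
    if not durations_h:
        raise ValueError("no outage durations")
    return sum(1 for d in durations_h if d > max_duration_h) / len(durations_h)


def outage_class_probabilities(durations_h, upper_bounds_h):
    """Share of outages falling in each duration class.

    upper_bounds_h: increasing upper bounds of the classes. Class i holds
    durations d with bound[i-1] < d <= bound[i]; a last, open-ended class
    collects everything above the final bound.
    """
    if not durations_h:
        raise ValueError("no outage durations")
    counts = [0] * (len(upper_bounds_h) + 1)
    for d in durations_h:
        counts[bisect.bisect_left(upper_bounds_h, d)] += 1
    return [c / len(durations_h) for c in counts]


# Hypothetical outage durations in hours.
durations_h = [0.5, 1.0, 2.0, 10.0, 12.0]
print(exceedance_probability(durations_h, max_duration_h=10.0))
print(outage_class_probabilities(durations_h, upper_bounds_h=[1.0, 10.0]))
```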
Metrics commonly used
The availability requirements shall be quantified using one or several of the following metrics:
- inherent instantaneous availability;
- operational instantaneous availability;
- inherent mean availability;
- operational mean availability;
- inherent steady-state availability;
- operational steady-state availability;
- outage duration and occurrence;
- MUT and MDT (or mean time to restore);
- MTBF or MTBO, and MTTR;
- MTTF or MTTO;
- amount of successfully delivered products.
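As an illustration of how MUT, MDT and the mean availability relate, the following sketch derives them from a log of alternating up and down intervals. The log format is an assumption made for this example.

```python
def up_down_statistics(intervals):
    """Return (MUT, MDT, mean availability) from an operations log.

    intervals: list of (state, duration_h) tuples, where state is "up" or
    "down" (hypothetical log format).
    """
    ups = [d for state, d in intervals if state == "up"]
    downs = [d for state, d in intervals if state == "down"]
    if not ups or not downs:
        raise ValueError("need at least one up and one down interval")
    mut = sum(ups) / len(ups)
    mdt = sum(downs) / len(downs)
    mean_availability = sum(ups) / (sum(ups) + sum(downs))
    return mut, mdt, mean_availability


log = [("up", 90.0), ("down", 10.0), ("up", 110.0), ("down", 30.0)]
mut, mdt, availability = up_down_statistics(log)
print(f"MUT = {mut} h, MDT = {mdt} h, mean availability = {availability:.4f}")
```

When the log contains the same number of up and down intervals, the mean availability equals MUT / (MUT + MDT).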
Metrics mapping
General
Following the definition of the system level availability requirements according to clause 5.2, the availability metrics and supporting metrics shall be selected according to Table 5-1.
Metrics mapping at system or subsystem level
The metrics selection performed at system level depends on all of the mission characteristics. In particular, the choice between an instantaneous availability, mean availability or steady-state availability shall be based on the mission time schedule.
The choice between Inherent and Operational availability shall be based on the possibility to access information from the logistic support analysis necessary to assess the Operational availability.
If logistic and administrative delays necessary to assess the Operational availability cannot be obtained, the achieved availability may be used as the metric to take preventive maintenance into account in the assessment.
The choice between MUT with MDT and outage distribution shall be based on the number or duration of mission specific events, or only on the mean values for up time and down time.
Metrics mapping at equipment level
The choice between MTBF with MTTR and outage distribution shall be based on the number or duration of mission specific events, or only on the mean values for up time and down time.
For availability considerations, the equipment level refers to the lowest level of replaceable unit (LRU level).
Table 5-1: Availability and supporting metrics applicable at system and subsystem level
| Level | Availability requirement | Inherent instantaneous availability | Operational instantaneous availability | Inherent mean availability | Operational mean availability | Inherent steady-state availability | Operational steady-state availability | Outage duration and occurrence | MUT and MDT (or mean time to restore) | MTBF/MTBO and MTTR | MTTF/MTTO | Amount of successfully delivered product |
| System/Subsystem level | Availability during mission lifetime | | | | | | | | | | | |
| System/Subsystem level | Availability at a specific time interval | | | | | | | | | | | |
| System/Subsystem level | Outage probability distribution | | | | | | | | | | | |
| System/Subsystem level | Percentage of successfully delivered products | | | | | | | | | | | |
| Equipment level | Availability during mission lifetime | | | | | | | | | | | |
| Equipment level | Availability at a specific time interval | | | | | | | | | | | |
| Equipment level | Outage probability distribution | | | | | | | | | | | |
| Equipment level | Percentage of successfully delivered products | Not applicable at equipment level |
Availability assessment process
Overview of the assessment process
The availability assessment process is shown in Figure 6-1. The process steps identified in the different sections of the figure are addressed in detail in clauses 6.2 through 6.4, and the availability assessment methods are described in Annex A.
Figure 6-1: Availability assessment process
Availability allocation
The availability allocation shall be based on the following:
- the effect of subsystem failures on the mission, derived from the system analysis,
- previous experience from similar programmes,
- subsystem complexity or cost,
- subsystem technology maturity, and
- previously designed and developed subsystem.
The criteria order of priority is application dependent.
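One simple way to turn such criteria into numbers is to apportion the system unavailability budget among subsystems in series according to weights. This is a sketch of one possible scheme, not the allocation method prescribed by this Standard; the weights (here standing for complexity or technology maturity) and the subsystem names are hypothetical.

```python
import math


def allocate_availability(system_target, weights):
    """Split the system unavailability budget among series subsystems.

    system_target: required system availability, strictly between 0 and 1.
    weights: dict of subsystem name to relative share of the unavailability
        budget (a higher weight allows more unavailability).
    Returns a dict of subsystem name to allocated availability.
    """
    if not 0.0 < system_target < 1.0:
        raise ValueError("system_target must be in (0, 1)")
    total = sum(weights.values())
    if total <= 0 or any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative with a positive sum")
    budget = 1.0 - system_target
    return {name: 1.0 - budget * w / total for name, w in weights.items()}


# Hypothetical weights for three subsystems in series.
allocation = allocate_availability(0.99, {"payload": 2, "platform": 2, "ground": 1})
for name, value in allocation.items():
    print(f"{name}: {value:.4f}")
print(f"series product: {math.prod(allocation.values()):.6f}")
```

Since the product of (1 - u_i) is never smaller than 1 minus the sum of the u_i, the series product of the allocated values meets or slightly exceeds the system target.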
The availability requirement allocation process shall be addressed early in the design phases (according to clause 7) in order to realistically evaluate the criticality of each system section and therefore the most appropriate baseline.
Iterative availability assessment
A preliminary availability evaluation based on previous experience or expert judgement shall be performed in order to assess the risk of not meeting the requirements.
Such a preliminary availability evaluation is performed during the allocation process if a realistic allocation cannot be achieved.
The assessment process shall be conducted as follows:
- Identification of the most appropriate method for availability assessment (see Annex A).
- Collection and verification of data coming from the lower level analyses.
- System availability assessment (including compliance verification) and identification of the project criticalities.
- Architecture, operations or logistics modifications or more accurate analysis to reach the availability objective.
- 1 This can imply the subsystem or equipment level contribution.
- 2 Example of a more accurate analysis is a refinement of the working hypothesis on the standby failure rates, more realistic modelling of the functional redundancies.
- Decision making process to eliminate (or reduce the impact of) the criticalities.
- Assessment process reiteration in each project phase according to the system design evolution.
An appropriate method (analytical, Markovian or Monte Carlo simulation), recognized as suitable for the assessment, shall be used, and the choice shall be explained and justified.
Sources of numerical data shall be provided.
For example, internal database from supplier data, field return experience, or calculation from standard handbooks, such as MIL-HDBK-217 or UTE C 80-810.
Each equipment item’s availability shall be estimated, taking into account random and deterministic events.
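A minimal sketch of combining random and deterministic contributors is given below. It assumes that the availability with respect to random failures is already known and that planned outages are independent of random failures, so that the two contributions multiply; both the independence assumption and the sample figures are illustrative.

```python
def combined_availability(random_availability, planned_outages_h, period_h):
    """Combine random and deterministic unavailability contributors.

    random_availability: availability with respect to random events only.
    planned_outages_h: durations in hours of deterministic outages in the
        reference period (e.g. manoeuvres, eclipses, preventive maintenance).
    period_h: length of the reference period in hours.
    """
    if not 0.0 <= random_availability <= 1.0:
        raise ValueError("random_availability must be in [0, 1]")
    planned = sum(planned_outages_h)
    if period_h <= 0 or not 0.0 <= planned <= period_h:
        raise ValueError("planned outages must fit inside the period")
    deterministic_availability = 1.0 - planned / period_h
    # Independence assumption: the two availabilities multiply.
    return random_availability * deterministic_availability


# Hypothetical equipment: 0.99 against random failures, 20 h planned per 1000 h.
print(f"{combined_availability(0.99, [10.0, 10.0], 1000.0):.4f}")
```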
The dynamic behaviour models can typically be sketched as shown in Figure 6-2. More complex flow charts can be developed depending on the system architecture and renewal process characteristics.
The results of availability analyses shall be reiterated in a timely manner through the design, integration processes and operation engineering to reflect the actual system baseline.
For flight equipment, the availability analysis shall take into account radiation effects.
For example, upsets for logic parts, SET for opto and linear parts, and latch-up.
Functional effects on flight equipment due to radiation single events shall be evaluated to provide quantified inputs for availability analysis.
The ECSS-Q-ST-60 branch standard describes a methodology to evaluate the behaviour of electronic parts within their functional conditions.
Figure 6-2: Example of a dynamic behaviour model
Availability report content
The availability analysis performed in each project phase shall contribute to the preparation of the following:
- specifications,
- trade-off reports, and
- availability assessment reports.
With regard to the specifications, the requirements defined at lower level as a result of the allocation process shall be reported in a dedicated section.
The specifications section shall also include all the additional information (e.g. logistics constraints, operations provisions, and reference mission scenario) useful for the correct implementation of the requirements.
The availability evaluations and considerations shall be clearly described with the relevant data and assumptions.
The availability assessment report shall provide all the information needed to understand correctly the evaluations performed and to allow appropriate integration of the results obtained with the higher level analysis.
The availability assessment report shall cover the following aspects:
- A self-standing description of the system or equipment baseline, logistics support and operations.
- The content, derived from the relevant reports, useful for acquiring all the elements taken into account in the availability model.
- The availability requirements description and interpretation (to enable the verification of the correct requirement implementation).
- The availability model description (including details of the selected mathematical approach and relevant assumptions or hypotheses).
- Inputs (e.g. reliability data, logistic times, and working hypotheses).
- The results obtained.
- The conclusions and recommendations.
The availability assessment reports shall be delivered at project reviews as per the business agreement’s SOW.
Implementation of availability analysis
Overview
Availability is regularly integrated into the design process. The availability characteristics can be traded with other system attributes such as cost and performance during the optimization of the design.
Availability teams are regularly integrated into the development teams during the design process. Availability analysis should be performed in close interaction with the following functions:
- integrated logistics support;
- operations;
- engineering.
Availability activities and programme phases
Feasibility phase (Phase A)
During Phase A, the availability analysis shall cover the following aspects:
- Identification of the methodology for the most realistic evaluation of the availability figures.
The methodology can be improved or even changed in the following phases.
- Support to the preliminary design definition in terms of trade-off studies, rough availability estimations, and identification of critical areas.
- Evaluation of the availability performance of the selected reference system or equipment baseline.
- Allocation (where necessary) of the applicable requirements at lower level.
- Planning of the availability tasks for the design definition phase (Phase B or Phase C).
Preliminary definition phase (Phase B)
During Phase B, the availability analysis shall cover the following aspects:
- Finalization of the availability methodology.
- A review of the lower level analyses.
- Support to local trade-off studies and design definition.
- Contribution to maintenance strategy definition.
- Definition of input data for the availability model.
E.g. manufacturer data, lower level outputs, data sources, and logistics information.
- Evaluation of the availability performance of the selected reference system or equipment baseline.
- Revision of the allocation process (where necessary).
- Support to preparation of availability specifications.
- Identification of the critical areas and support to the decision making process.
- Planning of the availability tasks for the detailed design definition phase and development and preparation of the relevant section in the PA plan.
Detailed definition and production phases (Phase C/D)
During Phase C/D, the availability analysis shall cover the following aspects:
- A review of the lower level analyses.
- Consolidation of the input data (input data consistency check).
- Support to the design, logistics and operations activities.
- Contribution to design reviews.
- Evaluation of the availability performance of the system or equipment baseline.
- Identification of the critical parameters or points to be monitored or controlled.
- Support to quality assurance activity during manufacture, integration and test, nonconformance review board (NRB) and failure review board.
- Support flight readiness reviews.
Utilization phase (Phase E)
During Phase E, the availability analysis shall cover the following aspects:
- Support to ground and flight operations.
- Evaluation of the design and operational changes and their impacts on availability.
- Collection of availability data during operation to assess the operational availability and issue of the operational availability report (when required).
Annex A (informative) Suitable methods for availability assessment
Overview
This annex provides a short description of the main methods available to assess availability performance.
The application of probability theory to the availability problems has led to the development of different methodologies that allow all practical situations to be managed with the accuracy required or specified by the customer. The selection of a particular mathematical approach depends on several considerations, such as:
- a probability density function associated with the parameters involved;
- complexity of the system design and associated operations and logistics support;
- time constraints for project development;
- preventive maintenance planned during the system’s operating life;
- spares policy.
The main methods are listed in this annex; for further details, refer to the technical literature on reliability and availability engineering.
Analytical method
The calculations use the following mathematical modelling:
This generic formula can be adapted to the application (e.g. operational or intrinsic availability, at system as well as equipment level).
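A minimal sketch, assuming the generic formula is the usual ratio of mean up time to the sum of mean up time and mean down time, A = MUT / (MUT + MDT). The adaptation to intrinsic or operational availability is then a matter of which contributors are counted in the down time; the decomposition of MDT into MTTR and logistic delay (LD) used here, and the numerical values, are illustrative assumptions.

```python
def availability(mut_h, mdt_h):
    """Generic ratio A = MUT / (MUT + MDT) (assumed form of the formula)."""
    if mut_h <= 0 or mdt_h < 0:
        raise ValueError("mut_h must be positive and mdt_h non-negative")
    return mut_h / (mut_h + mdt_h)


# Hypothetical equipment figures, in hours.
mut_h = 5000.0
mttr_h = 8.0
logistic_delay_h = 40.0

# Intrinsic: only the repair time counts as down time (ideal support).
intrinsic = availability(mut_h, mttr_h)
# Operational: logistic delay is added to the down time.
operational = availability(mut_h, mttr_h + logistic_delay_h)

print(f"intrinsic   = {intrinsic:.5f}")
print(f"operational = {operational:.5f}")
```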
For components or functions that are physically independent, the resulting availability is evaluated using the basic formulae shown in Figure A-1, depending on the redundancy scheme.
Figure A-1: Basic availability formulae
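The figure itself is not reproduced in this text. The sketch below shows the standard textbook combinations for independent elements (series, active parallel, and k-out-of-n with identical units), which are assumed here to correspond to the redundancy schemes of Figure A-1.

```python
from math import comb, prod


def series(availabilities):
    """All elements are needed: the availabilities multiply."""
    return prod(availabilities)


def parallel(availabilities):
    """At least one element is needed: the unavailabilities multiply."""
    return 1.0 - prod(1.0 - a for a in availabilities)


def k_out_of_n(k, n, a):
    """At least k of n identical, independent elements of availability a."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return sum(comb(n, i) * a**i * (1.0 - a) ** (n - i) for i in range(k, n + 1))


print(f"series   : {series([0.9, 0.9]):.4f}")
print(f"parallel : {parallel([0.9, 0.9]):.4f}")
# Hypothetical channel availability for a 12-out-of-16 configuration.
print(f"12-of-16 : {k_out_of_n(12, 16, 0.9):.4f}")
```

These formulae hold only under the independence assumption stated above; shared spares, common repair crews or common-cause failures call for the Markov or simulation approaches.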
Markov process
This approach, shown in Figure A-2, is based on the exponential law for the time to failure and the time to repair. Markov process theory is important because:
- it provides a good representation of system behaviour for communication with the engineering teams, and
- it allows the estimation of good approximations for the asymptotic (or steady-state) availability of some space applications, and has, for example, been efficiently applied to space ground segments.
However, system complexity can generate a large number of states, which affects the calculation (time and accuracy). A realistic representation of logistic times (generally associated with normal or log-normal distributions) is also not possible. The Markov graph in Figure A-2 is for a simple parallel model, in which states 1 and 2 represent a functional system with or without redundancy available.
Figure A-2: Example of Markov graph
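A simple parallel model of this kind can be solved in closed form. The sketch below assumes two identical units in active redundancy, a single repair crew, constant failure rate `lam` and constant repair rate `mu`; the third (failed) state, the transition rates and the numerical values are assumptions for illustration and are not taken from the figure.

```python
def parallel_markov_availability(lam, mu):
    """Steady-state availability of a two-unit parallel system.

    States (assumed): 1 = both units up, 2 = one unit up (no redundancy
    left), 3 = both units down (system unavailable).
    Transitions: 1 -> 2 at 2*lam, 2 -> 3 at lam, 2 -> 1 at mu, 3 -> 2 at mu.
    """
    if lam <= 0 or mu <= 0:
        raise ValueError("rates must be positive")
    # Balance equations of this birth-death chain:
    #   2*lam*p1 = mu*p2   and   lam*p2 = mu*p3
    w1 = 1.0
    w2 = w1 * 2.0 * lam / mu
    w3 = w2 * lam / mu
    total = w1 + w2 + w3
    # The system is functional in states 1 and 2.
    return (w1 + w2) / total


lam, mu = 1.0e-3, 1.0e-1   # hypothetical rates per hour
print(f"single unit    : {mu / (lam + mu):.6f}")
print(f"parallel system: {parallel_markov_availability(lam, mu):.6f}")
```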
Monte Carlo simulation
This numerical technique allows the evaluation of availability taking into account, in a realistic way, all aspects associated with the design, logistics and operations.
In many applications, Petri nets are used to model the system operating scenario, as shown in Figure A-3. The main advantages of Monte Carlo simulation are the ability to handle complex system scenarios with deterministic or probabilistic delays, and one-shot reliability. However, this method can involve:
- heavy effort for system modelling (not recommended for short-term programmes), and
- long calculation times (not acceptable during the trade-off or feasibility study).
Figure A-3: Example of Petri net modelling
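The principle can be illustrated without a Petri net tool by a minimal sketch for a single repairable item. It assumes exponentially distributed times to failure and log-normally distributed restoration times (a case that the Markov approach cannot represent); the function name, parameters and values are hypothetical.

```python
import random


def simulate_mean_availability(mission_h, mttf_h, repair_mu, repair_sigma,
                               runs=2000, seed=1):
    """Estimate the mean availability over a mission by simulation.

    mttf_h: mean of the exponential time to failure, in hours.
    repair_mu, repair_sigma: parameters of the log-normal restoration
        time (mean and standard deviation of the logarithm, in hours).
    """
    rng = random.Random(seed)  # fixed seed for reproducible estimates
    total_up = 0.0
    for _ in range(runs):
        t, up = 0.0, 0.0
        while t < mission_h:
            time_to_failure = rng.expovariate(1.0 / mttf_h)
            up += min(time_to_failure, mission_h - t)  # clip at mission end
            t += time_to_failure
            if t >= mission_h:
                break
            t += rng.lognormvariate(repair_mu, repair_sigma)  # down time
        total_up += up
    return total_up / (runs * mission_h)


# Hypothetical case: MTTF = 1000 h, median restoration time = 20 h.
estimate = simulate_mean_availability(
    mission_h=10_000.0, mttf_h=1000.0, repair_mu=2.9957, repair_sigma=0.5
)
print(f"estimated mean availability = {estimate:.4f}")
```

The number of runs trades accuracy against calculation time, which reflects the drawback noted above for trade-off or feasibility studies.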
Annex B (informative) Typical work package description for availability activities
The system or subsystem level RAM group can advantageously develop the following activities according to the business agreement’s SOW:
- Review the availability requirements and verify their acceptability with preliminary evaluations based on previous experience or approximate models. This step is important for avoiding the implementation of unachievable requirements considering, among others, the allowed logistics support, operations provisions, and power and mass budget.
- Identify the most appropriate availability model taking into account the mission scenario, project complexity, and time and cost constraints. If the selected methodology is extended to a lower level, dedicated procedures shall be used.
- Prepare the lower level specification to translate the system availability requirements.
- Define the system availability model.
- Review the lower level availability reports.
- Verify and consolidate the inputs coming from the other design areas (e.g. engineering, logistics, and operations).
- Evaluate the system availability.
- Perform trade-off analysis.
- Provide support to project management to finalize the system operational cost.
- Report the progress of availability activities.
- Provide support to design reviews.
- Prepare audits to verify the subcontractors’ knowledge and organization relevant to the availability discipline.
- Support the logistics and operations department for specific probabilistic or qualitative assessments useful in the finalization of the availability model.
- Support during the system exploitation phase for:
  - data collection,
  - decision making process, and
  - optimization of system operation.
Bibliography
| ECSS-S-ST-00 | ECSS system – Description, implementation and general requirements |
| ECSS-Q-ST-30 | Space product assurance – Dependability |
| ECSS-M-ST-10 | Space project management – Project planning and implementation |
| ECSS-M-ST-80 | Space project management – Risk management |
| MIL-HDBK-217 | Military handbook – Reliability prediction of electronic equipment |
| UTE C 80-810 | Modèle universel pour le calcul de la fiabilité prévisionnelle des composants, cartes et équipements électroniques, CNET |