Space product assurance

Availability analysis

Foreword

This Standard is one of the series of ECSS Standards intended to be applied together for the management, engineering and product assurance in space projects and applications. ECSS is a cooperative effort of the European Space Agency, national space agencies and European industry associations for the purpose of developing and maintaining common standards. Requirements in this Standard are defined in terms of what shall be accomplished, rather than in terms of how to organize and perform the necessary work. This allows existing organizational structures and methods to be applied where they are effective, and for the structures and methods to evolve as necessary without rewriting the standards.

This Standard has been prepared by the ECSS Q-ST-30-09C Working Group, reviewed by the ECSS Executive Secretariat and approved by the ECSS Technical Authority.

Disclaimer

ECSS does not provide any warranty whatsoever, whether expressed, implied, or statutory, including, but not limited to, any warranty of merchantability or fitness for a particular purpose or any warranty that the contents of the item are error-free. In no respect shall ECSS incur any liability for any damages, including, but not limited to, direct, indirect, special, or consequential damages arising out of, resulting from, or in any way connected to the use of this Standard, whether or not based upon warranty, business agreement, tort, or otherwise; whether or not injury was sustained by persons or property or otherwise; and whether or not loss was sustained from, or arose out of, the results of, the item, or any services that may be provided by ECSS.

Published by:     ESA Requirements and Standards Division
    ESTEC,
    2200 AG Noordwijk
    The Netherlands
Copyright:     2008 © by the European Space Agency for the members of ECSS

Change log

ECSS-Q-30-09A     7 December 2005     First issue

ECSS-Q-30-09B     Never issued

ECSS-Q-ST-30-09C     31 July 2008     Second issue. Minor editorial update to conform to ECSS drafting rules and to be consistent with the renumbering of ECSS standards.


Scope

This Standard is part of a series of ECSS Standards belonging to ECSS-Q-ST-30, Space product assurance – Dependability. The present standard defines the requirements on availability activities and provides, where necessary, guidelines to support, plan and implement the activities.

It defines the requirement typology that is followed, with regard to the availability of space systems or subsystems in order to meet the mission performance and needs according to the dependability and safety principles and objectives.

This Standard also describes the process that is followed and the most significant methodologies for the availability analysis, covering such aspects as:

  • evaluation of the space element or system availability figure,
  • allocation of the requirement at lower level, and
  • outputs to be provided.

This Standard applies to all elements of a space project (flight and ground segments), where Availability analyses are part of the dependability programme, providing inputs for the system concept definition and design development.

The on-ground activities and the operational phases are considered, for availability purposes, in order to:

  • acquire additional information essential for a better system model finalization and evaluation, and
  • monitor the system behaviour to optimize its operational performance and improve the availability model for future applications.

This Standard may be tailored for the specific characteristics and constraints of a space project in conformance with ECSS-S-ST-00.

Normative references

The following normative documents contain provisions which, through reference in this text, constitute provisions of this ECSS Standard. For dated references, subsequent amendments to, or revisions of any of these publications do not apply. However, parties to agreements based on this ECSS Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references the latest edition of the publication referred to applies.

ECSS-S-ST-00-01     ECSS system – Glossary of terms


Terms, definitions and abbreviated terms

Terms from other standards

For the purpose of this Standard, the terms and definitions from ECSS-S-ST-00-01 apply.

Terms specific to the present standard

achieved availability
probability that a system, subsystem or equipment, when used under stated conditions in an ideal support environment operates satisfactorily at a given time

The downtime is associated only to the active preventive and corrective maintenance.

active redundancy
every entity is operating and the system can continue to operate without downtime or defects despite the loss of one or more entities

corrective maintenance
maintenance performed to restore system hardware integrity following anomalies or equipment problems encountered during system operations

flight segment
product or a set of products intended to be operated in space

ground segment
all ground infrastructure elements that are used to support the preparation activities leading up to mission operations, the conduct of mission operations and all post-operational activities

hot redundancy
redundancy entity is “ON”, but not necessarily in the right configuration to accomplish the function

instantaneous availability
<intrinsic or inherent> probability that an item is in a state to perform a required function under given conditions at a given instant in time, assuming that the required external resources are provided

Preventive maintenance is generally not taken into account for intrinsic availability.

instantaneous availability
<operational> probability that an item is in a state to perform a required function under given conditions at a given instant of time, taking into account the maintenance strategy (spares policy and related logistic delays and constraints)

lead time (supplier delay)
mean time for supplier to provide spares (including shipping time)

logistic delay
mean time for human and material maintenance means to be available (call-out time)

mean availability
<intrinsic or inherent> percentage of time that a system, subsystem or equipment, used under stated conditions, without any scheduled or preventive action and with ideal logistical support, operates satisfactorily for a defined time period

mean availability
<operational> percentage of a defined time period in which a system, subsystem or equipment, used under stated conditions in an actual support environment, operates satisfactorily

The down time relates to corrective maintenance, preventive maintenance, and logistic and administrative delays.

mean down time
mean time between service interruption and service resumption

See Figure 3-1.

Figure 3-1: Relations between the various values that characterize the reliability, maintainability and availability of equipment

mean time between failures
mean time between two consecutive failures

mean time between outages
mean time of operation of an entity between two consecutive non-operational phases caused by corrective or preventive maintenance activities

mean time to failure
mean time of working of an entity before its first failure

Also known as “mean time to first failure” (MTTFF).

mean time to outage
mean time of working of an entity before its first outage

mean time to repair
mean duration to repair equipment with human and material maintenance means being available

mean up time
mean time of working of an entity after corrective maintenance (covering repair and replacement)

outage
state of an item of being unable to perform its required function

[IEC Multilingual Dictionary:2001 edition]

NOTE 1    Causes of outages can be failures, upsets or planned and unplanned events.
NOTE 2    The failures can be due to cataleptic intrinsic events or external events.

passive redundancy
redundancy not activated before necessary

Also known as “standby redundancy” or “cold redundancy”.

preventive maintenance
scheduled or on-condition maintenance actions performed on equipment to reduce its probability of failure or degradation

Preventive maintenance is performed to keep the system at designed reliability and safety levels before failure occurrence.

steady-state availability (asymptotic availability)
limit, if any, of the instantaneous availability as time approaches infinity
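
The relations between the mean values defined above can be illustrated with a short sketch. It assumes a hypothetical operating log made of observed up and down durations (in hours) and the common convention that one failure cycle is one up period plus one down period (MTBF = MUT + MDT). The function name and sample data are illustrative only and are not part of this Standard.

```python
from statistics import mean

def summarize_log(up_hours, down_hours):
    """Return MUT, MDT, MTBF and mean availability from observed durations.

    up_hours   -- durations of the observed up periods (hours)
    down_hours -- durations of the observed down periods (hours)
    """
    if not up_hours or not down_hours:
        raise ValueError("at least one up period and one down period are needed")
    mut = mean(up_hours)
    mdt = mean(down_hours)
    mtbf = mut + mdt  # assumed convention: one failure cycle = up time + down time
    availability = sum(up_hours) / (sum(up_hours) + sum(down_hours))
    return {"MUT": mut, "MDT": mdt, "MTBF": mtbf, "availability": availability}

# Hypothetical log: three up periods and three down periods, in hours.
summary = summarize_log([900.0, 1100.0, 1000.0], [8.0, 12.0, 10.0])
print(summary)
```

With these sample values the mean availability equals MUT / (MUT + MDT), because the log contains the same number of up and down periods.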

Abbreviated terms

For the purpose of this Standard, the abbreviated terms from ECSS-S-ST-00-01 and the following apply:

Abbreviation     Meaning
FMECA     failure modes, effects and criticality analysis
GPS     global positioning system
LD     logistic delay
MDT     mean down time
MTBF     mean time between failures
MTBO     mean time between outages
MTTF     mean time to failure
MTTFF     mean time to first failure
MTTO     mean time to outage
MTTR     mean time to repair
MUT     mean up time
NRB     nonconformance review board
PDF     probability density function
RAM     reliability, availability and maintainability
SOW     statement of work
TWT     travelling wave tube
w.r.t.     with respect to



Objectives of availability analysis

The availability analysis is developed in order to:

  • verify the conformance of the selected system design with the applicable availability requirements, and
  • provide inputs to estimate the life cycle cost of the system.

The above design activity leads to the optimization of the system concept definition with respect to design baseline, operations and logistics provisions.

The availability analysis identifies the unavailability contributors in order to quantify their impact in supporting the:

  • decision making process, and
  • risk evaluation, reduction and control (see ECSS-M-ST-80).

The availability activity is fully integrated into the development programme to ensure the correct support to the other disciplines (e.g. engineering, operations and logistics).

Specifying availability and the use of metrics

General

Introduction

The mission success criteria, from a probabilistic point of view, can be established in different ways. As a consequence, the selection of the most adequate dependability requirement depends on all of the operational constraints and mission objectives.

Availability requirements

Availability requirements shall respect the mandatory characteristics defined by the system engineering process.

E.g. traceable, identified, unique, or unambiguous.

For each availability requirement, a verification method shall exist.
Each availability requirement shall be a quantitative requirement.
The process leading to the definition of the availability requirement shall be user oriented (availability of mission service) and not design focused.
The process leading to the definition of the availability requirement shall include the following aspects necessary to characterize the project under development:

  • Functional and performance objectives.

For example, what is the “threshold” between nominal behaviour and failure mode? What are the contributors to mission success under system visibility and responsibility?

  • “Environmental” conditions.

For example, for which environment, interfaces, provisions, shall the above objectives be met?

  • Operational time frame.

For example, for which period, at what date.

  • Unavailability contributors to be taken into account in the analysis on the basis of the supplier’s visibility and responsibility for the logistic scenario or support.

For example, detection, logistics, and administrative delays.

Availability requirements shall be specified according to one or several of the following classes of availability specifications detailed in clause 5.2.

Different ways of specifying availability

Probability figure convention

For each type of availability requirement, specified figures shall be defined as “mean” or “best estimate” probability figures (point estimation). Unit failure rates are generally computed in this way (or sometimes at 60 % confidence level).

Availability during mission lifetime for a specified service

Overview

Availability during mission lifetime for a specified service is currently used for missions where a “steady-state” nominal service is planned, and for which a percentage of the mission time can be specified as an availability performance measure.

The availability during mission lifetime applies to maintainable, on-ground or in-orbit (e.g. Space Station), and non-maintainable systems (e.g. satellites).

Generic potential contributors to outage periods can be, for instance, maintenance activities (corrective, and preventive insofar as nominal service is impacted), periodic manoeuvres, reconfiguration delays for redundant payload, recoveries from safe mode, upsets, and eclipses.

In some applications, the mission lifetime can be subdivided into several periods for which the availability requirement applies.

For example, “The system shall be operational during 11 months per year during the mission lifetime”.

Requirements

If the operative scenario duration is longer than the system or equipment mean down time (more than 5 MDT), so that the instantaneous or mean availability can reach an asymptotic (or steady-state) behaviour, then the requirement shall be formulated in terms of steady-state availability, ensuring a simplified (and generally conservative) approach.
The availability during mission lifetime shall be computed as the ratio of time during which service is fulfilled over the total mission lifetime.
For non-maintainable systems, the availability during mission lifetime requirements shall be established considering that the mission is still operational at end of life.

For example, no single point failure is considered as an unavailability contributor.

The availability during mission lifetime can take into account radiation effects, such as upset for logic parts, SET for opto and linear parts, and latch up.
Functional effects on equipment or subsystems due to radiation single events shall be evaluated to give quantified inputs to availability analyses.

ECSS-Q-ST-60 branch standard gives methodology to evaluate behaviour of electronic parts within their functional conditions.
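
The ratio and the steady-state criterion stated in the requirements above can be sketched as follows. This is a minimal illustration under stated assumptions: the outage list, the mission duration and the function names are hypothetical, and the factor of 5 is the "more than 5 MDT" criterion quoted above.

```python
def mission_availability(outage_hours, mission_hours):
    """Ratio of the time during which service is fulfilled to the mission lifetime."""
    downtime = sum(outage_hours)
    if mission_hours <= 0 or downtime > mission_hours:
        raise ValueError("inconsistent mission duration or outage list")
    return (mission_hours - downtime) / mission_hours

def steady_state_applicable(scenario_hours, mdt_hours, factor=5.0):
    """True when the scenario lasts longer than `factor` times the mean down time."""
    return scenario_hours > factor * mdt_hours

# Hypothetical 10-year mission with one month of outage per year,
# i.e. "operational during 11 months per year".
HOURS_PER_YEAR = 8760.0
outages = [HOURS_PER_YEAR / 12.0] * 10
a_mission = mission_availability(outages, 10 * HOURS_PER_YEAR)
print(round(a_mission, 4))
print(steady_state_applicable(10 * HOURS_PER_YEAR, 24.0))
```

The first value printed is the fraction 11/12 rounded to four decimals; the second confirms that a 10-year scenario is far longer than five times a 24-hour mean down time.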

Availability at a specific time (or time interval) for a specified service

Overview

Availability at a specific time applies mainly for systems where specific critical operations are scheduled in the mission timeline. Typical applications are a launcher control bench availability at a specific time or a scientific satellite with a planned comet rendezvous mission.

Requirements

The availability at a specific time requirement shall address the probability that this “quasi-instantaneous” operation is successfully handled.
For non-maintainable systems, the availability at a specific time requirement shall specify both the availability and the reliability characteristics in a single requirement.
Availability at a specific time shall be computed considering the mission loss probability.
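
As an illustration of why a single figure can cover both characteristics for a non-maintainable system, consider one item with a constant failure rate \lambda and a constant repair rate \mu. The exponential model and the symbols are assumptions made for this sketch and are not prescribed by this Standard. The instantaneous availability of an item that is up at t = 0 is then:

```latex
A(t) = \frac{\mu}{\lambda + \mu} + \frac{\lambda}{\lambda + \mu}\, e^{-(\lambda + \mu)\, t},
\qquad A(0) = 1 .
% Non-maintainable item: no repair, \mu = 0
A(t)\big|_{\mu = 0} = e^{-\lambda t} = R(t) .
```

With no repair, the availability at time t coincides with the reliability R(t), that is, one minus the mission loss probability up to t. With repair, A(t) tends to the steady-state value \mu / (\lambda + \mu).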

Percentage or number of successfully delivered products

Overview

For some applications, the user-oriented approach characterizes the system in a “black-box” manner and specifies availability according to the number of, for instance, delivered products, services, or mission data with respect to user demands or a nominal scenario.

Requirements

If the availability is specified by a percentage or number of successfully delivered products, the availability requirement shall be expressed as follows:

  • A ratio of successfully delivered products over number of requested products.

For instance w.r.t. the applicable criteria for performance, delay, and coverage.

  • Cumulated service hours during the mission.

For example, expected number of TWT operating hours from a 12-out-of-16 redundant channel configuration over 10 years.

  • Acquired database volume or percentage.

For example, geographical coverage for an Earth observation mission.
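
The first two forms of requirement listed above can be sketched numerically. The example below is an assumption-laden illustration, not a method defined by this Standard: it supposes identical, independent, non-repairable units with a constant failure rate, all powered from the start, of which at most a fixed number deliver service at any time. The failure rate, function names and product counts are hypothetical.

```python
from math import comb, exp

def expected_channel_hours(n_units, n_channels, failure_rate, mission_hours, steps=2000):
    """Expected cumulated channel operating hours over the mission.

    Assumes n_units identical, independent, non-repairable units with a
    constant failure rate (per hour), all powered from the start, of which
    at most n_channels deliver service at any time.
    """
    def expected_active(t):
        r = exp(-failure_rate * t)  # survival probability of one unit at time t
        return sum(
            min(k, n_channels) * comb(n_units, k) * r**k * (1.0 - r) ** (n_units - k)
            for k in range(n_units + 1)
        )
    # Trapezoidal integration of the expected number of active channels.
    dt = mission_hours / steps
    total = 0.5 * (expected_active(0.0) + expected_active(mission_hours))
    total += sum(expected_active(i * dt) for i in range(1, steps))
    return total * dt

def delivery_ratio(delivered, requested):
    """Ratio of successfully delivered products to requested products."""
    if requested <= 0:
        raise ValueError("requested must be positive")
    return delivered / requested

mission = 10 * 8760.0  # 10 years, in hours
hours = expected_channel_hours(16, 12, 2.0e-6, mission)  # failure rate is hypothetical
print(hours, 12 * mission)
print(delivery_ratio(970, 1000))
```

The first line printed compares the expected cumulated channel-hours with the ideal value of 12 channels operating for the whole mission. A cold-redundant configuration would need a different model.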

Outage probability distribution

In the specific case that the availability is specified by an outage distribution and duration, and if a maximum duration is specified, a probability of exceeding this duration shall be associated.

NOTE 1    This can apply at subsystem level when a short service interruption is masked or filtered by the upper level function.
NOTE 2    For example, a temporary GPS receiver outage is typically tolerated by a navigation model. For this type of application, numerous short outages would be preferable to a few long ones.

If several classes of outage are identified, an availability specified through an outage probability distribution shall be allocated for each class (associated duration and probabilities).
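
A probability of exceeding a maximum duration can be estimated per outage class from observed or simulated outage durations. The sketch below uses a simple empirical estimate; the class names, limits and durations are hypothetical.

```python
def exceedance_probability(durations, max_duration):
    """Empirical probability that an outage lasts longer than max_duration."""
    if not durations:
        raise ValueError("no outage durations supplied")
    return sum(1 for d in durations if d > max_duration) / len(durations)

# Hypothetical outage classes: (maximum duration in minutes, observed durations).
outage_classes = {
    "short": (1.0, [0.2, 0.4, 0.5, 0.9, 1.5]),
    "long": (60.0, [12.0, 30.0, 45.0, 90.0]),
}
results = {
    name: exceedance_probability(observed, limit)
    for name, (limit, observed) in outage_classes.items()
}
print(results)
```

Each class then carries its own pair of maximum duration and exceedance probability, which can be compared with the allocated values.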

Metrics commonly used

The availability requirements shall be quantified using one or several of the following metrics:

  • inherent instantaneous availability;
  • operational instantaneous availability;
  • inherent mean availability;
  • operational mean availability;
  • inherent steady-state availability;
  • operational steady-state availability;
  • outage duration and occurrence;
  • MUT and MDT (or mean time to restore);
  • MTBF or MTBO, and MTTR;
  • MTTF or MTTO;
  • amount of successfully delivered products.

Metrics mapping

General

Following the definition of the system level availability requirements according to clause 5.2, the availability metrics and supporting metrics shall be selected according to Table 5-1.

Metrics mapping at system or subsystem level

The metrics selection performed at system level depends on all of the mission characteristics. In particular, the choice between instantaneous availability, mean availability and steady-state availability shall be based on the mission time schedule.
The choice between Inherent and Operational availability shall be based on the possibility to access information from the logistic support analysis necessary to assess the Operational availability.

If logistic and administrative delays necessary to assess the Operational availability cannot be obtained, the achieved availability may be used as the metric to take preventive maintenance into account in the assessment.

The choice between MUT with MDT and outage distribution shall be based on the number or duration of mission specific events, or only on the mean values for up time and down time.

Metrics mapping at equipment level

The choice between MTBF with MTTR and outage distribution shall be based on the number or duration of mission specific events, or only on the mean values for up time and down time.

For availability considerations, the equipment level refers to the lowest level of replaceable unit (LRU level).

Table 5-1: Availability and supporting metrics applicable at system and subsystem level

Metrics (columns): inherent instantaneous availability; operational instantaneous availability; inherent mean availability; operational mean availability; inherent steady-state availability; operational steady-state availability; outage duration and occurrence; MUT and MDT (or mean time to restore); MTBF/MTBO and MTTR; MTTF/MTTO; amount of successfully delivered products.

System/subsystem level (rows): availability during mission lifetime; availability at a specific time interval; outage probability distribution; percentage of successfully delivered products.

Equipment level (rows): availability during mission lifetime; availability at a specific time interval; outage probability distribution; percentage of successfully delivered products (not applicable at equipment level).



Availability assessment process

Overview of the assessment process

The availability assessment process is shown in Figure 6-1. The process steps identified in the different sections of the figure are addressed in detail in clauses 6.2 through 6.4, and the availability assessment methods in Annex A.

Figure 6-1: Availability assessment process

Availability allocation

The availability allocation shall be based on the following:

  • subsystem failure’s effect on the mission derived from the system analysis,
  • previous experience from similar programmes,
  • subsystem complexity or cost,
  • subsystem technology maturity, and
  • previously designed and developed subsystem.

The criteria order of priority is application dependent.

The availability requirement allocation process shall be addressed early in the design phases (according to clause 7) in order to realistically evaluate the criticality of each system section and therefore the most appropriate baseline.
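
One possible way to turn such criteria into numbers is a weighted apportionment. The sketch below is an assumption, not a method prescribed by this Standard: it supposes independent subsystems in series, so that the system availability is the product of the subsystem availabilities, and it splits the target using hypothetical weights that could reflect complexity, maturity or previous experience.

```python
from math import prod

def allocate_availability(system_target, weights):
    """Split a series-system availability target among subsystems.

    weights maps each subsystem to a positive weight; a larger weight gives
    that subsystem a larger share of the allowed unavailability.
    """
    total = sum(weights.values())
    if not 0.0 < system_target <= 1.0 or total <= 0.0:
        raise ValueError("invalid target or weights")
    return {name: system_target ** (w / total) for name, w in weights.items()}

# Hypothetical weights reflecting, for example, complexity or maturity.
allocation = allocate_availability(0.98, {"payload": 3.0, "platform": 2.0, "ground": 1.0})
print(allocation)
print(prod(allocation.values()))  # the product recovers the system target
```

The subsystem with the largest weight receives the least demanding allocation, and the product of all allocations returns the system-level target.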

Iterative availability assessment

A preliminary availability evaluation based on previous experience or expert judgement shall be performed in order to assess the risk of not meeting the requirements.

Such a preliminary availability evaluation is performed during the allocation process if a realistic allocation cannot be achieved.

The assessment process shall be conducted as follows:

  • Identification of the most appropriate method for availability assessment (see Annex A).
  • Collection and verification of data coming from the lower level analyses.
  • System availability assessment (including compliance verification) and identification of the project criticalities.
  • Architecture, operations or logistics modifications or more accurate analysis to reach the availability objective.
    NOTE 1    This can imply the subsystem or equipment level contribution.
    NOTE 2    An example of a more accurate analysis is a refinement of the working hypothesis on the stand-by failure rates, or more realistic modelling of the functional redundancies.
  • Decision making process to eliminate (or reduce the impact of) the criticalities.
  • Assessment process reiteration in each project phase according to the system design evolution.

An appropriate method (analytical, Markovian or Monte-Carlo simulation), recognized as suitable for the assessment, shall be used, and the choice shall be explained and justified.
Sources of numerical data shall be provided.

For example, internal database from supplier data, field return experience, or calculation from standard handbooks, such as MIL-HDBK-217 or UTE C 80-810.

Each equipment item’s availability shall be estimated, taking into account random and deterministic events.

The dynamic behaviour models can be typically sketched as shown in Figure 6-2. More complex flow charts can be developed depending on the system architecture and renewal process characteristics.

The results of availability analyses shall be reiterated in a timely manner through the design, integration processes and operation engineering to reflect the actual system baseline.
For flight equipment, the availability analysis shall take into account radiation effects.

For example, upset for logic parts, SET for opto and linear parts, and latch up.

Functional effects on flight equipment due to radiation single events shall be evaluated to provide quantified inputs for availability analysis.

The ECSS-Q-ST-60 branch standard describes a methodology to evaluate behaviour of electronic parts within their functional conditions.

Figure 6-2: Example of a dynamic behaviour model

Availability report content

The availability analysis performed in each project phase shall contribute to the preparation of the following:

  • specifications,
  • trade-off reports, and
  • availability assessment reports.

With regard to the specifications, the requirements defined at lower level as a result of the allocation process shall be reported in a dedicated section.
The specifications section shall also include all the additional information (e.g. logistics constraints, operations provisions, and reference mission scenario) useful for the correct implementation of the requirements.
The availability evaluations and considerations shall be clearly described with the relevant data and assumptions.
The availability assessment report shall provide all the information needed to understand correctly the evaluations performed and to allow appropriate integration of the results obtained with the higher level analysis.
The availability assessment report shall cover the following aspects:

  • A self-standing description of the system or equipment baseline, logistics support and operations.
  • The content, derived from the relevant reports, useful for acquiring all the elements taken into account in the availability model.
  • The availability requirements description and interpretation (to enable the verification of the correct requirement implementation).
  • The availability model description (including details of the selected mathematical approach and relevant assumptions or hypotheses).
  • Inputs (e.g. reliability data, logistic times, and working hypotheses).
  • The results obtained.
  • The conclusions and recommendations.

The availability assessment reports shall be delivered at project review as per the business agreement’s SOW.

Implementation of availability analysis

Overview

Availability is regularly integrated into the design process. The availability characteristics can be traded with other system attributes such as cost and performance during the optimization of the design.

Availability teams are regularly integrated into the development teams during the design process. Availability analysis should be performed in close interaction with the following functions:

  • integrated logistics support;
  • operations;
  • engineering.

Availability activities and programme phases

Feasibility phase (Phase A)

During Phase A, the availability analysis shall cover the following aspects:

  • Identification of the methodology for the most realistic evaluation of the availability figures.

The methodology can be improved or even changed in the following phases.

  • Support to the preliminary design definition in terms of trade-off studies, rough availability estimations, identification of critical areas.
  • Evaluation of the availability performance of the selected reference system or equipment baseline.
  • Allocation (where necessary) of the applicable requirements at lower level.
  • Planning of the availability tasks for the design definition phase (Phase B or Phase C).

Preliminary definition phase (Phase B)

During Phase B, the availability analysis shall cover the following aspects:

  • Finalization of the availability methodology.
  • A review of the lower level analyses.
  • Support to local trade-off studies and design definition.
  • Contribution to maintenance strategy definition.
  • Definition of input data for the availability model.

E.g. manufacturer data, lower level outputs, data sources, and logistics information.

  • Evaluation of the availability performance of the selected reference system or equipment baseline.
  • Revision of the allocation process (where necessary).
  • Support to preparation of availability specifications.
  • Identification of the critical areas and support to the decision making process.
  • Planning of the availability tasks for the detailed design definition phase and development and preparation of the relevant section in the PA plan.

Detailed definition and production phases (Phase C/D)

During Phase C/D, the availability analysis shall cover the following aspects:

  • A review of the lower level analyses.
  • Consolidation of the input data (input data consistency check).
  • Support to the design, logistics and operations activities.
  • Contribution to design reviews.
  • Evaluation of the availability performance of the system or equipment baseline.
  • Identification of the critical parameters or points to be monitored or controlled.
  • Support to quality assurance activity during manufacture, integration and test, nonconformance review board (NRB) and failure review board.
  • Support to flight readiness reviews.

Utilization phase (Phase E)

During Phase E, the availability analysis shall cover the following aspects:

  • Support to ground and flight operations.
  • Evaluation of the design and operational changes and their impacts on availability.
  • Collection of availability data during operation to assess the operational availability and issue of the operational availability report (when required).

ANNEX A (informative) Suitable methods for availability assessment

Overview

This annex provides a short description of the main methods available to assess availability performance.

The application of probability theory to the availability problems has led to the development of different methodologies that allow all practical situations to be managed with the accuracy required or specified by the customer. The selection of a particular mathematical approach depends on several considerations, such as:

  • a probability density function associated with the parameters involved;
  • complexity of the system design and associated operations and logistics support;
  • time constraints for project development;
  • preventive maintenance planned during the system’s operating life;
  • spares policy.

The main methods are listed in this annex; for further details, refer to the technical literature on reliability and availability engineering.

Analytical method

The calculations use the following mathematical modelling:

[Image: generic availability formula]

This generic formula can be adapted to the application (e.g. operational or intrinsic availability, at system as well as equipment level).
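
The formula referred to above is an image that is not reproduced in this text. As an illustrative assumption rather than the Standard's own expression, a widely used generic form that is consistent with the definitions of MUT and MDT given earlier is:

```latex
A = \frac{\mathrm{MUT}}{\mathrm{MUT} + \mathrm{MDT}}
```

Under this form, the adaptation to the application lies in what is counted in MDT: active repair time only for an intrinsic figure, or repair time plus logistic and administrative delays for an operational figure.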

For components or functions that are physically independent, the resulting availability is evaluated using the basic formulae shown in Figure A-1, depending on the redundancy scheme.

Figure A-1: Basic availability formulae
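
Figure A-1 is not reproduced in this text. The sketch below shows the textbook combinations for independent elements (series, active parallel redundancy, and k-out-of-n with identical elements), which are assumed here to be the kind of formulae the figure refers to. Function names and values are illustrative.

```python
from math import comb, prod

def series(availabilities):
    """All elements are needed: multiply the availabilities."""
    return prod(availabilities)

def parallel(availabilities):
    """Active redundancy, one element is enough: 1 minus product of unavailabilities."""
    return 1.0 - prod(1.0 - a for a in availabilities)

def k_out_of_n(k, n, a):
    """At least k of n identical, independent elements of availability a are up."""
    return sum(comb(n, i) * a**i * (1.0 - a) ** (n - i) for i in range(k, n + 1))

print(series([0.99, 0.98]))
print(parallel([0.9, 0.9]))
print(k_out_of_n(2, 3, 0.9))
```

These expressions hold only when the elements fail and are restored independently; shared repair resources or common-cause events call for the Markov or simulation approaches described next.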

Markov process

This approach, illustrated in Figure A-2, is based on the exponential law for the time to failure and the time to repair. Markov process theory is important because:

  • it provides a good representation of system behaviour for communication with the engineering teams, and
  • it allows the estimation of good approximations for the asymptotic (or steady-state) availability of some space applications, and has, for example, been efficiently applied to space ground segments.

However, the system complexity can generate a high number of states, which affects the calculation (time and accuracy). Realistic representation of logistic times (generally associated with normal or log-normal distributions) is also not possible. The Markov graph in Figure A-2 is for a simple parallel model, with states 1 and 2 representing a functional system with or without redundancy available.

Figure A-2: Example of Markov graph
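
A simple parallel model of this kind can be solved in closed form. The sketch below is a minimal illustration under stated assumptions: two identical units in active redundancy, one unit sufficient for the function, a single repair team, and constant failure and repair rates. The state numbering (number of failed units) and the rate values are hypothetical and need not match Figure A-2.

```python
def birth_death_steady_state(failure_rates, repair_rates):
    """Steady-state probabilities of a birth-death Markov chain.

    failure_rates[i] -- transition rate from state i to state i + 1
    repair_rates[i]  -- transition rate from state i + 1 back to state i
    """
    if len(failure_rates) != len(repair_rates):
        raise ValueError("rate lists must have the same length")
    weights = [1.0]
    for lam_i, mu_i in zip(failure_rates, repair_rates):
        weights.append(weights[-1] * lam_i / mu_i)  # balance between neighbouring states
    total = sum(weights)
    return [w / total for w in weights]

lam = 1.0e-4  # hypothetical failure rate per hour
mu = 1.0e-2   # hypothetical repair rate per hour
# States: 0, 1 or 2 failed units; two active units, a single repair team.
pi = birth_death_steady_state([2 * lam, lam], [mu, mu])
availability = pi[0] + pi[1]  # the function is lost only when both units are down
print(availability)
```

The steady-state availability is the sum of the probabilities of the functional states. Larger models follow the same pattern but generally need a numerical solution of the balance equations.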

Monte-Carlo simulation

This numerical technique allows the evaluation of availability taking into account, in a realistic way, all aspects associated with the design, logistics and operations.

In many applications, Petri nets are used to model the system operating scenario, as shown in Figure A-3. The main advantages of Monte-Carlo simulation are the ability to handle complex system scenarios with deterministic or probabilistic delays, and one-shot reliability. However, this method can involve:

  • heavy effort for system modelling (not recommended for short-term programmes), and
  • long calculation times (not acceptable during the trade-off or feasibility study).

Figure A-3: Example of Petri net modelling
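
The principle can be shown on a much simpler case than a Petri net model. The sketch below simulates a single repairable unit with exponential times to failure and log-normal down times, a combination that a Markov model cannot represent directly. All parameter values and names are hypothetical, and the example is not a substitute for a full system model.

```python
import random
from math import exp, log

def simulate_availability(mttf, repair_mu, repair_sigma, horizon, runs, seed=0):
    """Estimate the mean availability of one repairable unit by simulation.

    Times to failure are exponential with mean mttf; down times are
    log-normal with parameters repair_mu and repair_sigma (log-hours).
    """
    rng = random.Random(seed)
    up_total = 0.0
    for _ in range(runs):
        t = 0.0
        while t < horizon:
            up = rng.expovariate(1.0 / mttf)
            up_total += min(up, horizon - t)  # count only up time inside the horizon
            t += up
            if t >= horizon:
                break
            t += rng.lognormvariate(repair_mu, repair_sigma)
    return up_total / (runs * horizon)

mttf = 1000.0                       # hypothetical mean time to failure, hours
mu, sigma = log(10.0), 0.5          # hypothetical log-normal down-time parameters
estimate = simulate_availability(mttf, mu, sigma, horizon=50_000.0, runs=200)
mean_down = exp(mu + sigma**2 / 2)  # mean of the log-normal down time
print(estimate, mttf / (mttf + mean_down))
```

The two printed values compare the simulated mean availability with the analytical steady-state value for the same mean up and down times. The simulated figure carries statistical noise that decreases as the number of runs or the horizon grows.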

ANNEX B (informative) Typical work package description for availability activities

The system or subsystem level RAM group can advantageously develop the following activities according to the business agreement’s SOW:

Review the availability requirements and verify their acceptability with preliminary evaluations based on previous experiences or approximate models. This step is important for avoiding the implementation of unachievable requirements considering, among others, the allowed logistics support, operations provisions, and power and mass budget.
Identify the most appropriate availability model taking into account the mission scenario, project complexity, and time and cost constraints. If the selected methodology is extended to a lower level, dedicated procedures shall be used.
Prepare the lower level specification to translate the system availability requirements.
Define the system availability model.
Review the lower level availability reports.
Verify and consolidate the inputs coming from the other design areas (e.g. engineering, logistics, and operations).
Evaluate the system availability.
Trade-off analysis.
Provide support to project management to finalize the system operational cost.
Availability activities progress reporting.
Provide support to design reviews.
Prepare audits to verify the subcontractors’ knowledge and organization relevant to the availability discipline.
Support the logistics and operations department for specific probabilistic or qualitative assessments useful in the finalization of the availability model.
Support during the system exploitation phase for:

  • data collection,
  • decision making process, and
  • optimization of system operation.

Bibliography

ECSS-S-ST-00     ECSS system – Description, implementation and general requirements
ECSS-Q-ST-30     Space product assurance – Dependability
ECSS-M-ST-10     Space project management – Project planning and implementation
ECSS-M-ST-80     Space project management – Risk management
MIL-HDBK-217     Military handbook – Reliability prediction of electronic equipment
UTE C 80-810     Modèle universel pour le calcul de la fiabilité prévisionnelle des composants, cartes et équipements électroniques, CNET