Why do functional failures matter?

July 26, 2021

Some failures can be lived with, but others are harmful and need to be prevented or avoided. It may seem obvious, but what systems should a maintenance organisation have to classify their assets, and why do functional failures matter?

Many organisations have a catalogue of their physical assets, each of which should be uniquely identified with a description of its location, or its home facility if it is mobile. This data is likely to be held in a Computerised Maintenance Management System (CMMS). Holding a registry of assets is a core CMMS function according to IBM[1]. In addition, the CMMS enables the allocation of maintenance tasks to assets and asset components, as well as the ordering of spares for scheduled replacements and corrective tasks.

There are other requirements for classifying assets in the register, which the CMMS might not support, that allow us to better address the question posed in this blog. The first of these is classifying assets and their components for criticality.

The first instinct of many operating and maintenance staff when creating a criticality register is to list their equipment by purchase cost. This is not wrong, but it is essential to remember that the intrinsic cost of the equipment may be overshadowed by the impact, consequences and costs of any part failing. Relatively cheap parts can have consequential costs after failure that exceed the purchase cost of most parts.

Determining Failure Criticality

We can determine the criticality of a failure along a number of dimensions that are important to the asset stakeholders. We can use the following breakdown to assess whether the failure of our assets has any of these implications:

1. Safety, Health, Environmental or Legislative compliance (SHEL)

  • Loss of life
  • Serious and minor injuries
  • Polluting the environment
  • Breaking the law, or agreed standards
  • Maintenance work that risks exposure to hazardous materials or increases the risk of injury

2. Operational impacts

  • Loss of production
  • Lower quality of production output (the goods produced or the service delivered)
  • Disruption to committed schedules

3. Economic impacts

  • Significant economic resources needed for recovery or replacement after the loss
  • Degraded efficiency of operating assets, so that less yield is derived from them
  • The costs involved with sinking capital into procuring redundant or standby machinery, or conducting preventative or corrective maintenance on standby machinery

4. Low significance impacts

SHEL impacts naturally bring operational and economic impacts with them, and operational impacts bring economic impacts, so this list runs from the most critical category to the least critical.
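The ordering described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Criticality` and `Asset` names, the asset tags and the purchase costs are all hypothetical, and a real criticality register would hold far more detail. The sketch assigns each asset the most severe impact category identified for its failure and ranks the register by that category rather than by purchase cost.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Criticality(IntEnum):
    """Lower value = more critical, matching the order of the four categories."""
    SHEL = 1
    OPERATIONAL = 2
    ECONOMIC = 3
    LOW_SIGNIFICANCE = 4


@dataclass
class Asset:
    tag: str               # hypothetical unique asset identifier
    purchase_cost: float   # intrinsic cost, deliberately NOT used for ranking
    impacts: set = field(default_factory=set)  # categories identified for a failure

    @property
    def criticality(self) -> Criticality:
        # The most severe identified impact wins; an asset with no
        # identified impact falls into the low significance category.
        return min(self.impacts, default=Criticality.LOW_SIGNIFICANCE)


def rank_register(assets):
    """Order assets from most to least critical, ignoring purchase cost."""
    return sorted(assets, key=lambda a: (a.criticality, a.tag))


# Illustrative data only: a cheap seal outranks an expensive conveyor
# because its failure has a SHEL impact.
register = [
    Asset("PUMP-001", 40_000.0, {Criticality.ECONOMIC}),
    Asset("SEAL-017", 150.0, {Criticality.SHEL, Criticality.OPERATIONAL}),
    Asset("LAMP-203", 20.0),
    Asset("CONV-009", 250_000.0, {Criticality.OPERATIONAL, Criticality.ECONOMIC}),
]
ranked = rank_register(register)

for asset in ranked:
    print(f"{asset.tag:10s} {asset.criticality.name}")
```

Keeping `LOW_SIGNIFICANCE` as an explicit category, rather than leaving such assets out, means every asset in the register carries a recorded assessment.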

But why do repeated functional failures that have SHEL impacts matter?

SHEL impacts pose an existential threat to the whole organisation if they are not prevented. The loss of reputation, fines and compensation may severely impact an organisation for years. BP is an example.

We should take two perspectives on safety…

The first is the intrinsic safety of the asset. This is mainly addressed by the asset provider, the designer and manufacturer who will have expended a great deal of time and money in ensuring intrinsic safety is of a high standard. An asset will have safety features to deal with risks, such as aspects of design, protection and interlocks, reliable controls, and other devices including alarms and warnings that assist operators. If assets store energy or contain hazardous substances then containment is a major consideration.

The designer of assets will have knowledge of the main uses, operating contexts and environments their assets will be used in. But, if an operator chooses an unusual context or environment, they will need to be vigilant and analyse safety risks. If a machine is taken and integrated into a larger plant, then intrinsic safety needs to be further extended to the integrated plant as a whole entity. In oil and gas, the system integrators tend to be the large operating companies. These companies need to be experts in the design safety processes to ensure the intrinsic safety of the integrated plant is robust.

The second perspective is how an asset is operated and how it is maintained. For example, if a mobile asset breaks down and cannot be moved, the effort to recover it may carry heightened safety risks for the maintenance crew because of the hazardous environment. A large mining truck breaking down in a mine pit is an example. These types of operational risks are likely to be managed in a system separate from the criticality register. The analysis of SHEL needs data repositories beyond the asset register and its labelling to hold the data produced by techniques such as Failure Modes and Effects Analysis (FMEA), HAZOP studies and fault tree analysis. These techniques are also not supported by CMMS software.

The Bow Tie Method

More recent models for addressing safety have emerged from the study of many catastrophic events. These events were caused by a number of smaller contributing events aligning, each of which by itself might only have had a minor impact. In the Swiss cheese model, barriers form layers of protection to guard against contributing causes, but each barrier may have holes. Catastrophe occurs where the holes align and all the barriers are defeated. One recent tool that enables deep analysis of the protective layers for significant asset risks is the Bow Tie method. This has an intuitive and compelling visual representation.
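As a rough numerical illustration of the layered-barrier idea, the sketch below estimates the chance that every barrier is defeated at once. It rests on a strong simplifying assumption that is not part of the model as described here: that the barriers fail independently, so the individual failure probabilities can be multiplied. The barrier names and probabilities are invented for illustration, and real barriers often share common causes of failure, which makes the true figure higher.

```python
from math import prod


def probability_all_barriers_fail(barrier_failure_probs):
    """Chance that every barrier is defeated, ASSUMING independent barriers.

    Each value is the assumed probability that one barrier fails on demand
    (the size of the "hole" in that slice of cheese).
    """
    probs = list(barrier_failure_probs)
    for p in probs:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    # With no barriers at all the product is 1.0: nothing stops the event.
    return prod(probs)


# Hypothetical barriers and failure-on-demand probabilities.
barriers = {
    "design interlock": 0.01,
    "alarm and operator response": 0.1,
    "relief device": 0.05,
}

p_catastrophe = probability_all_barriers_fail(barriers.values())
print(f"Chance all barriers are defeated: {p_catastrophe:.6f}")

# Removing one barrier shows how much protection that layer provided.
without_alarm = [p for name, p in barriers.items()
                 if name != "alarm and operator response"]
print(f"Without the alarm barrier: {probability_all_barriers_fail(without_alarm):.6f}")
```

Comparing the result with and without a given barrier is one simple way to see how much each layer contributes, although a Bow Tie analysis captures considerably more than this single number.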

Credit: Charles Dibsdale

Why do operational and economic impacts matter?

Customer dissatisfaction, loss of reputation and losing competitiveness over a longer period of time may put an organisation out of business. Economic impacts may also have similar effects including reduced bottom-line results that may impact market and shareholder value.

Why include the least critical ‘low significance’?

This is important because it shows the effort and care the organisation has taken to analyse all of its assets and dig out every safety-critical and operationally critical item.

Other considerations

There are other considerations that can be included in criticality assessments. The organisation might include whether assets, equipment or parts are:

  • Difficult to procure, are rare or only have a single source of supply
  • Subject to obsolescence which risks difficult procurement in the future. Some organisations stockpile parts subject to obsolescence to mitigate this risk
  • Difficult to maintain, considering how much outage time is necessary for recovery, or whether specialist skills and qualified people, who may be in short supply, are needed
  • Covered by an FMEA, in which we may record the local and next higher-level effects of failure, from the component right up to the business-level effects. Some of the higher-level effects complement the criticality register

This list is not exhaustive. Do you have other interesting factors you include in asset criticality? What information systems do you use to store this data? How can criticality data be linked to maintenance (in the CMMS) through these systems? Please do leave your thoughts in the comments.
