Resilience versus Reliability: Examining the Right Measures for Electric Power

by Brian Sims

On Friday 9 August, the UK felt the impacts of what was described by the National Grid as an “incredibly rare” event. The impacts were immediate: homes were plunged into darkness, traffic lights failed to work, trains were cancelled and passengers stranded, hospitals suspended non-essential procedures and many businesses suffered financial losses, writes Dr Sandra Bell.

The Government immediately launched an investigation by the Energy Emergencies Executive Committee and the National Grid was asked by business secretary Andrea Leadsom to “urgently review and report to Ofgem”.

The National Grid published a preliminary report a mere nine days after the event, reiterating that what happened appeared to represent an “extremely rare and unexpected event”. The document states that immediately following a lightning strike on the Eaton Socon-Wymondley main transmission circuit, the Hornsea off-shore wind farm and the Little Barford power station both (almost simultaneously) reduced their energy supply to the grid.

The customer impacts of this were instantaneous and significant and included 1.1 million customers being without power for between 15 and 50 minutes, severe rail transport disruption caused by a certain class of train operating in the South East area being unable to remain operational without engineer intervention and, extremely worryingly, the disruption of critical facilities including those at Ipswich Hospital and Newcastle Airport.

Risk landscape has changed

There’s little doubt that the technical detail of what caused the gas-fired power station at Little Barford and the Hornsea off-shore wind farm to go offline will be collected and analysed in forensic detail together with a scrutiny of the automatic safety systems that shut off the power to some places to protect the integrity of the grid. Likewise, the communications procedures will be probed along with the ability of services such as rail transport and hospitals to survive a 42-minute power outage.

However, perhaps the answer to what went wrong lies not in the technical detail, but rather in the fact that the risk landscape has changed and that the metrics collectively used to drive investment and planning for infrastructure disruptions are now reaching their limit of usefulness.

This is certainly the view of the National Grid whose analysis of the chain of events concluded: “The voltage performance of the National Electricity Transmission System was within SQSS and Grid Code requirements”. Its preliminary report states that, while “associated” with a single lightning strike, the three events (ie the transmission circuit trip and the Hornsea and Little Barford supply reductions) were independent and therefore the situation of all three happening at the same time was “exceptional” and beyond regulatory planning requirements.

Most electric power utilities, which have long been seen as leaders in the critical infrastructure community for contingency planning, are still regulated through “reliability” metrics such as number of customers interrupted, customer minutes lost and mean daily fault rates. Such metrics are good for normal operating conditions, but they undervalue the impact of large-scale events and price lost load at a flat rate.

Yet the value of lost load compounds the longer it’s lost. For example, most customers will value the cost of an outage differently in the first few minutes, when it’s merely inconvenient, than they will after days or even weeks of disruption, when modern life becomes simply impossible.
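
The difference between flat-rate and duration-sensitive pricing of lost load can be illustrated with a minimal sketch. The customer count and the 15 and 50-minute durations come from the 9 August figures quoted above. The per-minute rate and the rule that the value doubles every hour are purely hypothetical assumptions chosen for illustration, not figures used by the National Grid or any regulator.

```python
# Illustrative sketch only. The rate and the doubling rule are hypothetical
# assumptions, not values taken from the National Grid, Ofgem or this article.

def customer_minutes_lost(customers: int, minutes: int) -> int:
    """Reliability metric: customers interrupted multiplied by outage minutes."""
    return customers * minutes


def flat_cost(customers: int, minutes: int, rate_per_minute: float) -> float:
    """Flat-rate pricing: every lost minute is worth the same amount."""
    return customers * minutes * rate_per_minute


def escalating_cost(customers: int, minutes: int, rate_per_minute: float,
                    doubling_minutes: int) -> float:
    """Hypothetical pricing in which the per-minute value doubles
    after every `doubling_minutes` of continuous outage."""
    per_customer = 0.0
    for minute in range(minutes):
        per_customer += rate_per_minute * 2 ** (minute // doubling_minutes)
    return customers * per_customer


if __name__ == "__main__":
    customers = 1_100_000  # customers affected on 9 August, per the article
    for minutes in (15, 50):  # outage range quoted in the article
        print(minutes, "min:", customer_minutes_lost(customers, minutes),
              "customer minutes lost")

    # Compare the two pricing rules for one customer (hypothetical rate of 1.0).
    for minutes in (60, 120, 24 * 60):
        print(minutes, "min: flat", flat_cost(1, minutes, 1.0),
              "escalating", escalating_cost(1, minutes, 1.0, 60))
```

Under these assumptions, the two rules agree for the first hour and then diverge sharply, which is the sense in which a flat-rate metric undervalues long outages.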

Likewise, the impact of large-scale events is disproportionately high, driven by abnormal restoration costs and widespread and complex infrastructure damage. Large-scale events are therefore often only included in the narrative of risk registers and the reliability metrics drive a planning and investment focus on smaller, more common events rather than larger, more uncommon and yet more disruptive ones. This is especially true when combined with an accessibility and affordability target.

Terrorism and organised cyber crime

Widespread economic instability, disruptive technologies, hyper-extended supply chains, terrorism and organised cyber crime are now commonplace. Likewise, grid operations have increased in their complexity due to changing power demand, increased reliance on renewable sources and the heightened introduction of smart technologies. Together, these have created a risk landscape that’s no longer relatively stable and interspersed with occasional shocks, but unremittingly characterised by uncertainty, complexity and risks with adversaries.

Low-probability, high-consequence events are now much more common, and energy researchers such as Vugrin, Castillo and Silva-Monroy from Sandia National Laboratories have recognised that historical data used for reliability calculations may not be suitable for characterising future potential outages simply because emerging threats can differ significantly from historical precedents.

What about the trains, hospitals and airports? It’s all very well blaming the National Grid or the regulators for failing to anticipate and plan for such events, but some blame for the impacts must also lie with those who use the power that the National Grid provides. There are very few organisational activities that don’t require power in some form or other and, recently, we’ve seen an increase in the number of businesses effectively managing the risks associated with power outages by using our workplace facilities as part of their business continuity arrangements.

According to our annual ‘Disaster Landscape’ invocation statistics, power outages were the foremost reason companies in the UK and the US relocated their workplace to one of our facilities. That’s a 77% increase from the previous year. For many organisations, simply relocating to an alternative workplace in the event of a power outage resolves the issue and all they experience is a small operational disruption as staff relocate to the recovery facility where they can carry on as normal. However, for large-scale infrastructure, diversification, emergency response procedures and alternative power supplies are a ‘must’ if customer impact is to be minimised.

If it were not for Network Rail’s electrical power resilience, the impact on the transport system could have been far worse. It has been reported that no track supplies were lost and that traction power was maintained to the majority of the railway throughout the incident. It would appear that the majority of rail disruption was caused by a particular type of train reacting unexpectedly to the electrical disturbance. Class 700 and 717 trains shut down due to their internal protection systems being triggered. These trains then required manual intervention to restart.

Undoubtedly, Network Rail has emergency response procedures to cope with one or two of these trains requiring a manual restart, but there were 60 in use at the time of the incident and the resultant impact on the rail network was 591 cancelled or part-cancelled trains, 873 delayed trains and thousands of delayed and stranded passengers.

Internal resilience

The National Grid Electricity System Operator has internal resilience (from generators, batteries and interconnectors, etc) to ensure a stable supply under certain loss circumstances, but when an exceptional event occurs, automatic protection systems will disconnect users to preserve the integrity of the system and ensure power supply to critical infrastructure.

This system is known as the Low Frequency Demand Disconnection (LFDD), and it’s reported to have functioned as expected on 9 August. The Electricity Supply Emergency Code makes provision for critical infrastructure, such as airports and hospitals, to be registered as ‘Protected Sites’ and avoid disconnection. However, it would appear that, due to an administrative oversight, Newcastle Airport was not registered in this scheme.

The exact root cause of Ipswich Hospital’s issues remains a conundrum. The site’s internal protection systems were triggered within the same timeframe of the incident, yet UK Power Networks has stated that the hospital wasn’t part of the LFDD protection zone and that the substations supplying the hospital were unaffected.

However, the cause of the impact on patients has been identified. An initial investigation by the hospital’s management has reported that, when the power protection systems were activated, all eleven of the back-up power generators kicked in immediately. Unfortunately, a faulty battery on one of the generators failed to switch the supply to the back-up and the main outpatients, X-ray and pathology areas of the hospital were left without power.

Complex socio-economic system

The concept of ‘resilience’ in complex socio-economic systems reliant on technology isn’t new, but it’s something that’s hard to regulate as it’s subjective and involves the combined effort of technology, systems, people, processes, leadership and culture.

However, if we’re to avoid more disruptions of the type we saw at the beginning of August this year then we need to change the way in which we incentivise infrastructure investment. Rather than simply promoting grid reliability, which focuses effort on preventing a disruptive event from occurring, we also need to promote energy sector resilience. That means ensuring that power generators, distributors and the organisations that convert power into citizen services, such as those in the transport and health sectors, can continue to provide goods and services to the communities that rely upon them, regardless of the occurrence of disruptive events.

Dr Sandra Bell is Head of Resilience Consulting at Sungard Availability Services