Lesson 05 of 06
High Reliability Healthcare Systems

Designing for Reliability
Processes, Systems, and Tools

High reliability cannot be achieved through culture alone. It requires systems and processes deliberately engineered to perform consistently regardless of who is working, what the conditions are, or how fatigued or pressured people might be.

What you will learn
Apply reliability science principles to the design of healthcare processes
Explain the role of standardization in reducing harmful variation in care
Identify human factors design principles that make the right action easier than the wrong action
Use Failure Mode and Effects Analysis (FMEA) as a proactive reliability tool
Describe how checklists, forcing functions, and structured communication reduce error rates
Lesson Snapshot
Lesson: 05 of 6
Est. Time: ~35 Minutes
Knowledge Checks: 5 Questions

Reliability science in healthcare

Reliability science is the systematic study of how systems can be designed to perform their intended function consistently and without failure. Applied to healthcare, it asks: given that humans are fallible and systems are complex, how do we design processes that produce the right outcome with consistently high probability?

At 80% reliability, a process fails once in every five opportunities. At 90%, once in every ten. At 99%, once in every hundred. At 99.9%, once in every thousand. The ambition of high reliability is to move healthcare processes from the 80–95% range, where most currently operate, toward the upper end of this spectrum for the most critical care processes.
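The arithmetic behind these levels can be checked with a short Python sketch. The function name `failures_one_in` is illustrative only, and exact fractions are used so that a level such as 99.9% does not pick up floating-point rounding.

```python
from fractions import Fraction


def failures_one_in(reliability_percent):
    """Return N such that the process fails about once in every N opportunities."""
    # Convert via str() so 99.9 becomes exactly 999/10 rather than a binary float.
    failure_rate = 1 - Fraction(str(reliability_percent)) / 100
    return 1 / failure_rate


for level in (80, 90, 95, 99, 99.9):
    n = float(failures_one_in(level))
    print(f"{level}% reliable: fails about 1 in {n:g} opportunities")
```

Each added "nine" cuts the failure rate tenfold, which is why the step from 95% to 99.9% is a design problem rather than a matter of trying slightly harder.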

Moving along this spectrum requires layering multiple interventions — each adding an additional defense — so that the failure of one does not immediately produce harm. This is the practical application of the Swiss cheese model: not eliminating holes, but ensuring that when one hole appears, others do not align with it.
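A rough sketch of why layering helps, under the simplifying assumption that each defense fails independently of the others. The three layers and their 10% failure rates are hypothetical numbers chosen for illustration, not data from any study.

```python
import math


def combined_failure_probability(layer_failure_probs):
    """Probability that every layer fails at once, ASSUMING independent layers."""
    return math.prod(layer_failure_probs)


# Hypothetical example: three defenses, each only 90% reliable on its own.
layers = [0.10, 0.10, 0.10]
p_all_fail = combined_failure_probability(layers)
print(f"Chance that all three layers fail together: {p_all_fail:.4f}")
```

Independence is the best case. In practice, fatigue, staffing gaps, or a shared software fault can weaken several layers at once, which is the "holes aligning" that the Swiss cheese model warns about, so the real combined failure rate is usually higher than this product.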

Reliability Levels

Most healthcare processes operate at 80–95% reliability — failing 1 in 20 to 1 in 5 times. The ambition of high reliability is to move critical care processes toward 99–99.9% reliability through deliberate system design, not individual effort.

Standardization: protecting judgment by reducing unnecessary variation

Standardization is one of the most powerful and most misunderstood tools in reliable process design. It is frequently resisted in healthcare with the argument that clinical care cannot be standardized — that each patient is different and clinical judgment must prevail.

This misunderstands what standardization is for. Standardization does not eliminate clinical judgment. It protects it. By removing unnecessary variation from routine processes — how a central line is inserted, how a medication is prepared and labeled, how a surgical site is marked — standardization frees cognitive capacity for the genuinely complex decisions that require individual clinical judgment.

The WHO Surgical Safety Checklist, a standardized 19-item checklist worked through at three pause points in every operation (before anesthesia, before incision, and before the patient leaves the operating room), reduced surgical mortality by more than 40% in its original study across eight hospitals in eight countries. This was not because surgeons did not know the items on the checklist, but because the structured process made it dramatically less likely that any item would be forgotten under pressure.

The hierarchy of reliability interventions

Not all reliability interventions are equally effective. Human factors science describes a hierarchy of interventions, ranked from least to most reliable in terms of their ability to prevent errors independent of human performance.

At the bottom sit education and training, the most commonly used interventions in healthcare and the least reliable, because they depend entirely on individuals remembering and applying what they were taught. Moving up the hierarchy are warnings and reminders: alerts, prompts, and signage that cue the right behavior. Above these sit checklists and protocols, and then double-checks and independent verification. Each is more reliable than training alone, but still dependent on human attention.

Higher still are forcing functions — system designs that make the wrong action physically impossible or automatically prevented. A medication dispensing system that will not release a drug without pharmacist verification. An anesthetic machine that will not allow a hypoxic gas mixture. These interventions work regardless of how tired or pressured the individual is, because they remove the human from the error pathway.
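In software terms, a forcing function is a code path that cannot complete until its precondition is met. The sketch below is a hypothetical illustration: the `DispensingCabinet` class, its methods, and the order and pharmacist IDs are all invented for this example and do not represent any real dispensing system.

```python
class VerificationRequired(Exception):
    """Raised when a release is attempted without completed verification."""


class DispensingCabinet:
    """Hypothetical cabinet that refuses to release unverified orders."""

    def __init__(self):
        self._verifiers = {}  # order_id -> set of pharmacist IDs who verified it

    def verify(self, order_id, pharmacist_id):
        # A set means the same pharmacist verifying twice still counts once.
        self._verifiers.setdefault(order_id, set()).add(pharmacist_id)

    def release(self, order_id, required_verifiers=1):
        verified_by = self._verifiers.get(order_id, set())
        if len(verified_by) < required_verifiers:
            # The wrong action is blocked by design, not by a reminder.
            raise VerificationRequired(
                f"{order_id}: {len(verified_by)} of {required_verifiers} verifications"
            )
        return f"released {order_id}"


cabinet = DispensingCabinet()
try:
    cabinet.release("order-1")
except VerificationRequired as blocked:
    print("Blocked:", blocked)

cabinet.verify("order-1", "pharmacist-a")
print(cabinet.release("order-1"))
```

The design choice to note is that there is no override flag: the only way to reach the release is through verification, so the safeguard does not depend on anyone remembering to use it.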

FMEA (Failure Mode and Effects Analysis) applies this thinking prospectively. Before a new process is implemented, teams ask: in what ways could this process fail? For each failure mode, what is the likelihood, detectability, and potential harm? Those with the highest combined risk score become priorities for redesign, before any patient is harmed.
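One common way to compute the combined risk score is a risk priority number (RPN): severity multiplied by occurrence multiplied by detection, each scored from 1 to 10. The sketch below assumes that convention, including the usual rule that a failure that is harder to detect gets a higher detection score. The failure modes and their scores are hypothetical examples, not findings from a real analysis.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    description: str
    severity: int    # 1 = negligible harm, 10 = catastrophic harm
    occurrence: int  # 1 = rare, 10 = almost certain
    detection: int   # 1 = almost always caught, 10 = almost never caught

    @property
    def rpn(self):
        """Risk priority number under the assumed S x O x D convention."""
        return self.severity * self.occurrence * self.detection


def prioritize(modes):
    """Return failure modes ordered from highest to lowest RPN."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


# Hypothetical scores for a medication preparation process.
modes = [
    FailureMode("Wrong concentration selected", severity=9, occurrence=3, detection=4),
    FailureMode("Syringe left unlabeled", severity=7, occurrence=5, detection=6),
    FailureMode("Dose delayed by 30 minutes", severity=3, occurrence=6, detection=2),
]

for mode in prioritize(modes):
    print(f"RPN {mode.rpn:>3}  {mode.description}")
```

In this made-up example the unlabeled syringe outranks the wrong concentration even though it is less severe, because it happens more often and is harder to catch. Teams typically treat the ranking as a starting point for discussion and still review any high-severity mode regardless of its RPN.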

Hierarchy of Interventions (Least to Most Reliable)

Education and training → Warnings and reminders → Checklists and protocols → Double-checks and verification → Forcing functions and automation. Most healthcare safety programs focus on the least reliable end of this hierarchy.

Key concepts from this lesson

Reliability Science

The systematic design of systems to perform their intended function consistently, with minimal failure across repeated opportunities.

Standardization

The reduction of unnecessary variation in routine processes, freeing clinical judgment for genuinely complex decisions.

Forcing Functions

System designs that make the wrong action physically impossible or automatically prevented, independent of human choice.

FMEA

Failure Mode and Effects Analysis: a proactive tool for identifying and prioritizing potential failure modes before they cause harm.

Reliability Hierarchy

A ranking of interventions from least reliable (education) to most reliable (forcing functions) in preventing errors.

Checklists

Structured tools that ensure critical steps are completed consistently, proven effective in surgery, aviation, and intensive care.

Case Study

The central line bundle that saved lives

Central line-associated bloodstream infections (CLABSIs) were among the most common and deadly healthcare-associated infections in intensive care units — killing thousands of patients annually and costing healthcare systems billions.

Peter Pronovost and colleagues at Johns Hopkins Hospital developed a simple five-item checklist for central line insertion: wash hands; clean the patient's skin with chlorhexidine; use full sterile barrier precautions; avoid the femoral site if possible; and remove unnecessary lines promptly.

None of these items was new. Clinicians already knew all five. The evidence for each had existed for years. The problem was not knowledge; it was consistent execution. Without a structured process, one or more items were skipped in roughly 30–40% of insertions.

When the checklist was implemented across Michigan ICUs — with nurses explicitly empowered to stop a procedure if any item was skipped — CLABSI rates dropped by 66% within three months. Estimated lives saved: over 1,500 in 18 months. The intervention was not new knowledge. It was reliable execution of existing knowledge through a standardized process.
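The stop rule described above is all-or-nothing: a single missing item is enough to pause the procedure. A minimal sketch of that logic follows, using the five items named in this case study. The function names and the shortened item wording are illustrative assumptions, not part of the Michigan protocol.

```python
# The five checklist items from the case study, paraphrased as short labels.
CENTRAL_LINE_CHECKLIST = (
    "wash hands",
    "clean skin with chlorhexidine",
    "use full sterile barrier precautions",
    "avoid the femoral site if possible",
    "remove unnecessary lines promptly",
)


def missing_items(completed):
    """Return checklist items not yet confirmed, in checklist order."""
    done = set(completed)
    return [item for item in CENTRAL_LINE_CHECKLIST if item not in done]


def may_proceed(completed):
    """All-or-nothing rule: proceed only when no item is missing."""
    return not missing_items(completed)


observed = ["wash hands", "clean skin with chlorhexidine"]
if not may_proceed(observed):
    print("Stop the procedure. Missing:", "; ".join(missing_items(observed)))
```

Reporting the specific missing items, rather than a bare pass or fail, gives the person calling the stop something concrete to point to.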

What this illustrates

The Michigan Keystone Project demonstrated that the gap between knowing what safe practice looks like and consistently delivering it can be closed — not by telling clinicians to try harder, but by designing a system that makes consistent execution the path of least resistance.

Reflection Prompt

Where is memory carrying too much in your organization?

Identify one high-risk process in your work setting that currently relies heavily on individual memory, vigilance, or skill to go right every time. What would a checklist, a forcing function, or a structured double-check look like for that process? What would need to change — in culture, workflow, or resources — to make standardization possible? And what is the cost, in human terms, of the variation that currently exists?

IHI Open School — Further Learning

QI 101, 102, and 103 — Introduction to Healthcare Improvement, Using the Model for Improvement, and Testing Changes — provide the improvement methodology that underpins reliable process design. Available at ihi.org.

Knowledge Check — Lesson 05

1. A hospital implements a standardized pre-surgical checklist requiring verification of patient identity, surgical site, and consent before every procedure. A senior surgeon argues that this is unnecessary for experienced teams. What is the strongest counter-argument?

A. Checklists are required by accreditation standards and must be followed regardless of experience
B. Experienced teams make errors for the same reasons as inexperienced teams (cognitive overload, distraction, and time pressure), and checklists mitigate these risks regardless of seniority
C. Checklists eliminate the need for clinical judgment and make procedures safer by removing human decision-making
D. Senior surgeons have a legal obligation to follow standardized protocols under healthcare law

2. A pharmacy redesigns its dispensing system so that high-alert medications can only be withdrawn after a dual verification step is completed electronically by two pharmacists. This is an example of:

A. An administrative control relying on staff vigilance
B. A training-based reliability intervention
C. A forcing function that prevents the error pathway without relying on human choice
D. A warning system that prompts correct behavior through reminders

3. Failure Mode and Effects Analysis (FMEA) is best described as:

A. A retrospective tool for investigating what went wrong after a serious safety event
B. A proactive tool for identifying and prioritizing potential failure modes in a process before harm occurs
C. A staff competency assessment tool used during performance reviews
D. A method for calculating the financial cost of healthcare-associated harm

4. In the hierarchy of reliability interventions, which type is considered LEAST reliable in preventing errors?

A. Automated forcing functions that prevent the wrong action
B. Double-checks and independent verification steps
C. Checklists and structured protocols
D. Education and training programs

5. The WHO Surgical Safety Checklist was found to reduce surgical mortality by more than 40%. The primary reason for this improvement was:

A. It introduced new surgical knowledge to clinicians who were previously unaware of best practices
B. It replaced individual clinical judgment with a standardized algorithm for surgical decision-making
C. It ensured consistent execution of existing best practices that were already known but not reliably applied
D. It shifted responsibility for surgical safety from the surgeon to the nursing team