- Atul Gawande
The Checklist Manifesto Page 7
So what did they do? They did not scrap the building or shrink it to a less ambitious size. Instead, McNamara proposed a novel solution called a "tuned mass damper." They could, he suggested, suspend an immense four-hundred-ton concrete block from huge springs in the building's crown on the fifty-ninth floor, so that when wind pitched the building one way, the block would swing the other way and steady it.
The solution was brilliant and elegant. The engineers did some wind-tunnel testing with a small model of the design, and the results were highly reassuring. Nonetheless, some chance of error and unpredictability always remains in projects of this complexity. So the builders reduced their margin of error the best way they knew how--by taking a final moment to make sure that everyone talked it through as a group. The building owner met with the architect, someone from the city buildings department, the structural engineers, and others. They reviewed the idea and all the calculations behind it. They confirmed that every concern they could think of had been addressed. Then they signed off on the plan, and the skyscraper was built.
It is unnerving to think that we allow buildings this difficult to design and construct to go up in the midst of our major cities, with thousands of people inside and tens of thousands more living and working nearby. Doing so seems risky and unwise. But we allow it based on trust in the ability of the experts to manage the complexities. They in turn know better than to rely on their individual abilities to get everything right. They trust instead in one set of checklists to make sure that simple steps are not missed or skipped and in another set to make sure that everyone talks through and resolves all the hard and unexpected problems.
"The biggest cause of serious error in this business is a failure of communication," O'Sullivan told me.
In the Citicorp building, for example, the calculations behind the designs for stabilizing the building assumed the joints in those giant braces at the base of the building would be welded. Joint welding, however, is labor intensive and therefore expensive. Bethlehem Steel, which took the contract to erect the tower, proposed switching to bolted joints, which are not as strong. They calculated that the bolts would do the job. But, as a New Yorker story later uncovered, their calculations were somehow not reviewed with LeMessurier. That checkpoint was bypassed.
It is not certain that a review would have led him to recognize a problem at the time. But in 1978, a year after the building opened, LeMessurier, prompted by a question from a Princeton engineering student, discovered the change. And he found it had produced a fatal flaw: the building would not be able to withstand seventy-mile-an-hour winds--which, according to weather tables, would occur at least once every fifty-five years in New York City. In that circumstance, the joints would fail and the building would collapse, starting on the thirtieth floor. By now, the tower was fully occupied. LeMessurier broke the news to the owners and to city officials. And that summer, as Hurricane Ella made its way toward the city, an emergency crew worked at night under veil of secrecy to weld two-inch-thick steel plates around the two hundred critical bolts, and the building was secured. The Citicorp tower has stood solidly ever since.
The construction industry's checklist process has clearly not been foolproof at catching problems. Nonetheless, its record of success has been astonishing. In the United States, we have nearly five million commercial buildings, almost one hundred million low-rise homes, and eight million or so high-rise residences. We add somewhere around seventy thousand new commercial buildings and one million new homes each year. But "building failure"--defined as a partial or full collapse of a functioning structure--is exceedingly rare, especially for skyscrapers. According to a 2003 Ohio State University study, the United States experiences an average of just twenty serious "building failures" per year. That's an annual avoidable failure rate of less than 0.00002 percent. And, as Joe Salvia explained to me, although buildings are now more complex and sophisticated than ever in history, with higher standards expected for everything from earthquake proofing to energy efficiency, they take a third less time to build than they did when he started his career.
The checklists work.
4. THE IDEA
There is a particularly tantalizing aspect to the building industry's strategy for getting things right in complex situations: it's that it gives people power. In response to risk, most authorities tend to centralize power and decision making. That's usually what checklists are about--dictating instructions to the workers below to ensure they do things the way we want. Indeed, the first building checklist I saw, the construction schedule on the right-hand wall of O'Sullivan's conference room, was exactly that. It spelled out to the tiniest detail every critical step the tradesmen were expected to follow and when--which is logical if you're confronted with simple and routine problems; you want the forcing function.
But the list on O'Sullivan's other wall revealed an entirely different philosophy about power and what should happen to it when you're confronted with complex, nonroutine problems--such as what to do when a difficult, potentially dangerous, and unanticipated anomaly suddenly appears on the fourteenth floor of a thirty-two-story skyscraper under construction. The philosophy is that you push the power of decision making out to the periphery and away from the center. You give people the room to adapt, based on their experience and expertise. All you ask is that they talk to one another and take responsibility. That is what works.
The strategy is unexpectedly democratic, and it has become standard nowadays, O'Sullivan told me, even in building inspections. The inspectors do not recompute the wind-force calculations or decide whether the joints in a given building should be bolted or welded, he said. Determining whether a structure like Russia Wharf or my hospital's new wing is built to code and fit for occupancy involves more knowledge and complexity than any one inspector could possibly have. So although inspectors do what they can to oversee a building's construction, mostly they make certain the builders have the proper checks in place and then have them sign affidavits attesting that they themselves have ensured that the structure is up to code. Inspectors disperse the power and the responsibility.
"It makes sense," O'Sullivan said. "The inspectors have more troubles with the safety of a two-room addition from a do-it-yourselfer than they do with projects like ours. So that's where they focus their efforts." Also, I suspect, at least some authorities have recognized that when they don't let go of authority they fail. We need look no further than what happened after Hurricane Katrina hit New Orleans.
At 6:00 a.m. on August 29, 2005, Katrina made landfall in Plaquemines Parish, southeast of New Orleans. The initial reports were falsely reassuring. With telephone lines, cell towers, and electrical power down, the usual sources of information were unavailable. By afternoon, the levees protecting the city had been breached. Much of New Orleans was under water. The evidence was on television, but Michael Brown, the director of the Federal Emergency Management Agency, discounted it and told a press conference that the situation was largely under control.
FEMA was relying on information from multiple sources, but only one lone agent was actually present in New Orleans. That agent had managed to get a Coast Guard helicopter ride over the city that first afternoon, and he filed an urgent report the only way he could with most communication lines cut--by e-mail. Flooding was widespread, the e-mail said; he himself had seen bodies floating in the water and hundreds of people stranded on rooftops. Help was needed. But the government's top officials did not use e-mail. And as a Senate hearing uncovered, they were not apprised of the contents of the message until the next day.
By then, 80 percent of the city was flooded. Twenty thousand refugees were stranded at the New Orleans Superdome. Another twenty thousand were at the Ernest N. Morial Convention Center. Over five thousand people were at an overpass on Interstate 10, some of them left by rescue crews and most carrying little more than the clothes on their backs. Hospitals were without power and suffering horrendous conditions. As people became desperate for food and water, looting began. Civil breakdown became a serious concern.
Numerous local officials and impromptu organizers made efforts to contact authorities and let them know what was needed, but they too were unable to reach anyone. When they finally got a live person on the phone, they were told to wait--their requests would have to be sent up the line. The traditional command-and-control system rapidly became overwhelmed. There were too many decisions to be made and too little information about precisely where and what help was needed.
Nevertheless, the authorities refused to abandon the traditional model. For days, while conditions deteriorated hourly, arguments roared over who had the power to provide the resources and make decisions. The federal government wouldn't yield the power to the state government. The state government wouldn't give it to the local government. And no one would give it to people in the private sector.
The result was a combination of anarchy and Orwellian bureaucracy with horrifying consequences. Trucks with water and food were halted or diverted or refused entry by authorities--the supplies were not part of their plan. Bus requisitions were held up for days; the official request did not even reach the U.S. Department of Transportation until two days after tens of thousands had become trapped and in need of evacuation. Meanwhile two hundred local transit buses were sitting idle on higher ground nearby.
The trouble wasn't a lack of sympathy among top officials. It was a lack of understanding that, in the face of an extraordinarily complex problem, power needed to be pushed out of the center as far as possible. Everyone was waiting for the cavalry, but a centrally run, government-controlled solution was not going to be possible.
Asked afterward to explain the disastrous failures, Michael Chertoff, secretary of Homeland Security, said that it had been an "ultra-catastrophe," a "perfect storm" that "exceeded the foresight of the planners, and maybe anybody's foresight." But that's not an explanation. It's simply the definition of a complex situation. And such a situation requires a different kind of solution from the command-and-control paradigm officials relied on.
Of all organizations, it was oddly enough Wal-Mart that best recognized the complex nature of the circumstances, according to a case study from Harvard's Kennedy School of Government. Briefed on what was developing, the giant discount retailer's chief executive officer, Lee Scott, issued a simple edict. "This company will respond to the level of this disaster," he was remembered to have said in a meeting with his upper management. "A lot of you are going to have to make decisions above your level. Make the best decision that you can with the information that's available to you at the time, and, above all, do the right thing."
As one of the officers at the meeting later recalled, "That was it." The edict was passed down to store managers and set the tone for how people were expected to react. On the most immediate level, Wal-Mart had 126 stores closed due to damage and power outages. Twenty thousand employees and their family members were displaced. The initial focus was on helping them. And within forty-eight hours, more than half of the damaged stores were up and running again. But according to one executive on the scene, as word of the disaster's impact on the city's population began filtering in from Wal-Mart employees on the ground, the priority shifted from reopening stores to "Oh, my God, what can we do to help these people?"
Acting on their own authority, Wal-Mart's store managers began distributing diapers, water, baby formula, and ice to residents. Where FEMA still hadn't figured out how to requisition supplies, the managers fashioned crude paper-slip credit systems for first responders, providing them with food, sleeping bags, toiletries, and also, where available, rescue equipment like hatchets, ropes, and boots. The assistant manager of a Wal-Mart store engulfed by a thirty-foot storm surge ran a bulldozer through the store, loaded it with any items she could salvage, and gave them all away in the parking lot. When a local hospital told her it was running short of drugs, she went back in and broke into the store's pharmacy--and was lauded by upper management for it.
Senior Wal-Mart officials concentrated on setting goals, measuring progress, and maintaining communication lines with employees at the front lines and with official agencies when they could. In other words, to handle this complex situation, they did not issue instructions. Conditions were too unpredictable and constantly changing. They worked on making sure people talked. Wal-Mart's emergency operations team even included a member of the Red Cross. (The federal government declined Wal-Mart's invitation to participate.) The team also opened a twenty-four-hour call center for employees, which started with eight operators but rapidly expanded to eighty to cope with the load.
Along the way, the team discovered that, given common goals to do what they could to help and to coordinate with one another, Wal-Mart's employees were able to fashion some extraordinary solutions. They set up three temporary mobile pharmacies in the city and adopted a plan to provide medications for free at all of their stores for evacuees with emergency needs--even without a prescription. They set up free check cashing for payroll and other checks in disaster-area stores. They opened temporary clinics to provide emergency personnel with inoculations against flood-borne illnesses. And most prominently, within just two days of Katrina's landfall, the company's logistics teams managed to contrive ways to get tractor trailers with food, water, and emergency equipment past roadblocks and into the dying city. They were able to supply water and food to refugees and even to the National Guard a day before the government appeared on the scene. By the end Wal-Mart had sent in a total of 2,498 trailer loads of emergency supplies and donated $3.5 million in merchandise to area shelters and command centers.
"If the American government had responded like Wal-Mart has responded, we wouldn't be in this crisis," Jefferson Parish's top official, Aaron Broussard, said in a network television interview at the time.
The lesson of this tale has been misunderstood. Some have argued that the episode proves that the private sector is better than the public sector in handling complex situations. But it isn't. For every Wal-Mart, you can find numerous examples of major New Orleans businesses that proved inadequately equipped to respond to the unfolding events--from the utility corporations, which struggled to get the telephone and electrical lines working, to the oil companies, which kept too little crude oil and refinery capacity on hand for major disruptions. Public officials could also claim some genuine successes. In the early days of the crisis, for example, the local police and firefighters, lacking adequate equipment, recruited an armada of Louisiana sportsmen with flat-bottom boats and orchestrated a breathtaking rescue of more than sixty-two thousand people from the water, rooftops, and attics of the deluged city.
No, the real lesson is that under conditions of true complexity--where the knowledge required exceeds that of any individual and unpredictability reigns--efforts to dictate every step from the center will fail. People need room to act and adapt. Yet they cannot succeed as isolated individuals, either--that is anarchy. Instead, they require a seemingly contradictory mix of freedom and expectation--expectation to coordinate, for example, and also to measure progress toward common goals.
This was the understanding people in the skyscraper-building industry had grasped. More remarkably, they had learned to codify that understanding into simple checklists. They had made the reliable management of complexity a routine.
That routine requires balancing a number of virtues: freedom and discipline, craft and protocol, specialized ability and group collaboration. And for checklists to help achieve that balance, they have to take two almost opposing forms. They supply a set of checks to ensure the stupid but critical stuff is not overlooked, and they supply another set of checks to ensure people talk and coordinate and accept responsibility while nonetheless being left the power to manage the nuances and unpredictabilities the best they know how.
I came away from Katrina and the builders with a kind of theory: under conditions of complexity, not only are checklists a help, they are required for success. There must always be room for judgment, but judgment aided--and even enhanced--by procedure.
Having hit on this "theory," I began to recognize checklists in odd corners everywhere--in the hands of professional football coordinators, say, or on stage sets. Listening to the radio, I heard the story behind rocker David Lee Roth's notorious insistence that Van Halen's contracts with concert promoters contain a clause specifying that a bowl of M&M's has to be provided backstage, but with every single brown candy removed, upon pain of forfeiture of the show, with full compensation to the band. And at least once, Van Halen followed through, peremptorily canceling a show in Colorado when Roth found some brown M&M's in his dressing room. This turned out to be, however, not another example of the insane demands of power-mad celebrities but an ingenious ruse.
As Roth explained in his memoir, Crazy from the Heat, "Van Halen was the first band to take huge productions into tertiary, third-level markets. We'd pull up with nine eighteen-wheeler trucks, full of gear, where the standard was three trucks, max. And there were many, many technical errors--whether it was the girders couldn't support the weight, or the flooring would sink in, or the doors weren't big enough to move the gear through. The contract rider read like a version of the Chinese Yellow Pages because there was so much equipment, and so many human beings to make it function." So just as a little test, buried somewhere in the middle of the rider, would be article 126, the no-brown-M&M's clause. "When I would walk backstage, if I saw a brown M&M in that bowl," he wrote, "well, we'd line-check the entire production. Guaranteed you're going to arrive at a technical error.... Guaranteed you'd run into a problem." These weren't trifles, the radio story pointed out. The mistakes could be life-threatening. In Colorado, the band found the local promoters had failed to read the weight requirements and the staging would have fallen through the arena floor.