It is almost a truism that organizations of very smart people are capable of very stupid things, or so it seems in hindsight. But hindsight is a dangerous thing.
In hindsight, it seemed obvious that pressurizing the Apollo Command Module with pure oxygen was foolhardy. And what happened on January 27, 1967, later struck everyone as all too inevitable. Astronauts Virgil Grissom, Edward White, and Roger Chaffee were overcome by smoke and incinerated trying futilely to escape a raging fire triggered in the module by an innocuous spark, during a ground rehearsal. They had no chance.
On April 24, 1990, the NASA shuttle Discovery lifted off with the Hubble Space Telescope aboard. It is seldom remembered today, but shortly after reaching orbit Hubble was found to be effectively blind, its primary astronomical mirror incorrectly figured and unable to focus. The contractor who had built the mirror, Perkin-Elmer, was not exactly new to the business, having built the optics for spy satellites flown by the National Reconnaissance Office. Nor was there a lack of warning flags before the mirror was shipped. They were raised and not ignored, simply misinterpreted, albeit, perhaps, willfully. This story, however, had a happy if rather expensive conclusion, as NASA was able to design corrective optics, which were then installed in orbit by the crew of the Endeavour more than three years later.
On September 23, 1999, the Mars Climate Orbiter completed a nearly year-long flight from Earth and was injected into Martian orbit, where it immediately proceeded to burn up and disintegrate. An investigation quickly found that one software program had delivered data in imperial units to another program expecting metric units. The result was to steer the Orbiter right into the Martian atmosphere. That the investigation was organized, found the source of the error, and announced its findings in little more than a week is a strong clue that the catastrophic error had lain undiscovered for a year, in plain sight.
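To see how easily such a mismatch hides, consider a minimal sketch in Python. It assumes the quantity exchanged was thruster impulse, reported in pound-force seconds and read as newton-seconds, which is what the mishap investigation described. The names `Impulse`, `naive_consumer`, and `checked_consumer` are invented for illustration, and the numbers are not mission data.

```python
from dataclasses import dataclass

# One pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """A thruster impulse that carries its unit with it (hypothetical type)."""
    value: float
    unit: str  # "N*s" or "lbf*s"

    def to_newton_seconds(self) -> float:
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown impulse unit: {self.unit!r}")


def naive_consumer(raw: float) -> float:
    """Treats a bare number as newton-seconds, whatever the sender meant."""
    return raw


def checked_consumer(impulse: Impulse) -> float:
    """Converts explicitly, so a unit mismatch cannot pass silently."""
    return impulse.to_newton_seconds()


# Illustrative value only: the sender means 10 pound-force seconds.
reported = Impulse(10.0, "lbf*s")
understated_by = checked_consumer(reported) / naive_consumer(reported.value)
```

In this sketch the naive reader understates every impulse by the same factor of about 4.45, and nothing in the bare number itself gives the error away.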
And on January 16, 2013, the FAA issued an emergency directive grounding all of the Boeing 787 “Dreamliners” operated by United Airlines, due to battery fires. The directive was quickly followed by similar actions around the world, and the aircraft remain grounded as investigations, recriminations, and rumors of litigation swirl.
Two of these engineering disasters may be appreciated in pictures. The effects of the Apollo 1 fire are shown below. On the left is the interior of the Command Module before the fire. In the middle is a shot of the exterior after the incident showing burn marks where flames burst through the wall of the module. On the right is a shot of the demolished interior.
The next photo shows the lithium-ion battery that burned aboard the 787 registered JA804A, which made an emergency landing at Takamatsu airport. On the left is the heavily damaged main battery. On the right is the APU battery, of the same model but undamaged.
NASA’s investigation of Apollo 1 concluded that the reasons for the tragedy were:
- A sealed cabin, pressurized with an oxygen atmosphere.
- An extensive distribution of combustible materials in the cabin.
- Vulnerable wiring carrying spacecraft power.
- Vulnerable plumbing carrying a combustible and corrosive coolant.
- Inadequate provisions for the crew to escape.
- Inadequate provisions for rescue or medical assistance.
Quite a litany. And, like the bungled data transfer that cost NASA its Mars Climate Orbiter, seemingly lying in plain sight all along.
The Dreamliner fiasco is still unfolding, but the conclusions are likely to involve:
- The extremely reactive lithium-ion chemistry.
- Manufacturing defects or handling damage inducing cracks or other flaws in the electrodes of one or more cells.
- Inadequate or non-existent capability to detect a short-circuited cell in flight.
- Close packing of individual cells facilitating propagation of fire from one to the other.
- Inadequate fire suppression.
Outsiders look at conclusions like these aghast. Heads should roll. Management took shortcuts and put costs ahead of safety. Engineers were ignored. Regulators were lax or captive to those they regulate.
But here’s the thing. Personal fault can and will be found, in some cases justifiably. But it won’t explain why things like this happen, nor will litigation or “tougher” regulation prevent future disasters. Quite the contrary.
Virtually everything we create in developed societies is the fruit of organization. Even apparently simple technology, like a game console, is astoundingly complex. Each 787 is said to contain two million discrete parts.
Big, complex projects are executed by big, complex organizations. And as the number of people who must be involved grows, the number of possible channels between them grows far faster than the head count. The number of distinct pairs among n people is n(n-1)/2, the number of possible working groups grows exponentially, and the number of possible arrangements of who talks to whom grows faster still, outrunning even the popular touchstone of “exponential growth”. The possibilities become virtually infinite, and only a small number can be comprehended and managed. That small number of channels of communication must bear the burden of making crucial intellectual connections between disparate human minds, each grappling with a very small piece of the much bigger puzzle.
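How fast the channels multiply depends on what one counts, and a few lines of Python make the difference concrete. This is only a sketch with function names chosen for illustration: it counts two-person channels, groups of two or more people, and the possible arrangements of which two-person channels are actually in use.

```python
from math import comb


def pairwise_channels(n: int) -> int:
    """Distinct two-person channels among n people: n(n-1)/2."""
    return comb(n, 2)


def possible_groups(n: int) -> int:
    """Subsets of two or more people: 2**n - n - 1."""
    return 2**n - n - 1


def possible_networks(n: int) -> int:
    """Ways to choose which pairwise channels are open: 2**(n(n-1)/2)."""
    return 2 ** pairwise_channels(n)


if __name__ == "__main__":
    for n in (5, 10, 20, 50):
        # The network count is printed as a digit count because it gets huge.
        digits = len(str(possible_networks(n)))
        print(f"{n:>3} people: {pairwise_channels(n):>5} pairs, "
              f"{possible_groups(n)} groups, "
              f"networks: a number with {digits} digits")
```

For a team of ten, that is 45 pairs, 1,013 possible groups, and roughly 35 trillion possible patterns of who talks to whom. The first count grows quadratically, the second exponentially, and the third faster than exponentially.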
Project management is an evolving science. Concepts like “risk management” and “failure mode and effects analysis” can be hard to define and confusing to the uninitiated, but mature organizations like NASA and Boeing deal with them daily, of necessity. But no matter how good we get at it, project management seems fated by our finite natures to a kind of inherent incompleteness, reminiscent of Gödel’s theorems.
As long as this is so, organizations will do stupid things. It is the price we pay for their brilliance.