Monday 5 November 2012

Paradigms of failure in bridge design, or learning from failing

Some time ago I read Design Paradigms: Case Histories of Error and Judgment in Engineering by Henry Petroski. It is mostly about bridge engineering (did you know there has been a major bridge failure in the US/UK roughly every 30 years?).

The "paradigms" Petroski talks about turned out to be very much applicable to software engineering: to security, availability and so on. Here is a summary of the paradigms, so that you do not have to read the book (unless you're into bridge architecture):
  1. Conceptual errors. Fundamental errors made at the conceptual design stage are the most elusive. They manifest only when the prototype is tested (=too late), often with disastrous results. They are invariably human errors and by definition cannot be prevented.
  2. Overlooking effects of scale. Every design can be scaled only to a certain limit, after which the initial assumptions are no longer valid and a failure occurs. If you are scaling a successful design or a model, be mindful that this limit exists. You probably will not know what this limit is exactly.
  3. Design change for the worse. This paradigm involves improvements over existing safe designs without re-evaluating the original design constraints. Any such change can introduce a new failure mode. Any change, no matter how seemingly benign or beneficial, needs to be analysed with the objectives of the original design in mind. An "improved" or enlarged design could hold unpleasant surprises over the original.
  4. Blind spots. Preconceived ideas about failure modes drive the analysis or design of the system, while other failure modes are ignored. The point here is that no hypothesis can ever be proved incontrovertibly, yet it takes only one failure (in analysis or reality) to provide a counterexample.
  5. False confirmations. An incorrect design formula is derived from wrong assumptions, yet thanks to a large initial safety factor it is "confirmed" by subsequent designs. Following this "success", the safety factor is gradually reduced to the point where it no longer compensates for the wrong results, and failure occurs.
  6. Tunnel vision in design. Not considering failure outside of the narrow confines of the principal design challenge to the same degree as inside it. The designer needs a special effort to step back from each design and consider more mundane, less challenging aspects of the problem, those appearing to lie on the periphery of the central focus.
  7. Not considering failure seriously. Document expected failure: what failure modes were anticipated in the design, what failure criteria were employed, what failure avoidance strategies were incorporated. Do not ignore case histories of failure and do not misuse them either - often such histories are used only to justify extrapolations to larger and lighter structures.
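
Paradigm 5 is easier to see with numbers. Below is a minimal Python sketch; the loads, the safety factors and the function names are all invented for illustration and none of them come from the book. It assumes a formula that underestimates the real load by 40%, and a series of designs where each "success" encourages a smaller safety factor for the next one.

```python
# Hypothetical numbers, chosen only to illustrate the paradigm.
TRUE_LOAD = 100.0        # load the structure really experiences
PREDICTED_LOAD = 60.0    # what the incorrect formula predicts (40% too low)


def survives(safety_factor: float) -> bool:
    """The structure is built to carry the predicted load times the safety factor."""
    capacity = PREDICTED_LOAD * safety_factor
    return capacity >= TRUE_LOAD


def first_failing_factor(factors):
    """Return the first safety factor at which the design fails, or None."""
    for factor in factors:
        if not survives(factor):
            return factor
    return None


# Each "successful" design encourages a smaller margin for the next one.
generations = [3.0, 2.5, 2.0, 1.8, 1.5, 1.2]
for factor in generations:
    print(f"safety factor {factor}: {'stands' if survives(factor) else 'FAILS'}")
```

With these made-up values the first four designs stand and appear to "confirm" the formula, even though it was wrong from the start; the error only shows once the margin shrinks enough to stop hiding it.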

TL;DR: failure is the only real learning tool in engineering. Without failure there is no learning, only blind luck or "confirmation" of wrong theories.