Normal Accidents LO12159 -Comments

JOE_PODOLSKY@HP-PaloAlto-om4.om.hp.com
Tue, 21 Jan 97 09:54:24 -0800

Replying to LO12048 --

Some excellent comments on Jottings #68, "Normal Accidents"

Joe

______________________________ Forward Header __________________________________

Dear Joe,

As a sometime engineer, I appreciate engineering solutions. In my
time as a systems designer, I used to feel terrible when something I
had created went wrong. If it was a design fault, I was inconsolable.

That was a long time ago. More recently I read a book which put things
into perspective for me [To Engineer is Human; Henry Petroski (1985)].
Using examples from the Civil Engineering discipline, the author
explains how we base current designs on what has worked in the past,
combined with innovations that we believe make possible what was
impossible. This process continues until something goes wrong. The
concepts translate well into all the areas I am familiar with,
including management and education.

He continues the idea in a later book, which I read recently [Design
Paradigms: Case Histories of Error and Judgment in Engineering
(1994)]. His thesis is that we have learned more about design from
failure than we have from success, and that human nature hasn't
changed that much in thousands of years.

"When something continues to work, it does so only because it hasn't
yet encountered the conditions that will cause it to fail"

However much resilience, fault tolerance and error correction we build
into a system, there is always the potential for failure, resulting in
the "interesting" conditions you mention: non-stop computer systems
that stop, unsinkable ocean liners that sink, and so on.

When we talk about failures, we need to separate people from their
creations, for two important reasons. When our creations fail, they
don't try to avoid the blame, and they don't cover up. They leave
diagnosis to people. When people fail, the process is the opposite.

I'm not so sure that I would envy the lot of the Amish people. I
happen to believe in specialisation. Getting on a Boeing 747 which has
a professional crew worries me far less than drawing lots to decide
which passenger should fly it would! The same goes for brain surgery!

I accept Phillip's interpretation of the evidence concerning the
"experts" and "novices" but, all things being equal, my money is on a
"switched on" expert, rather than a "switched on" novice.

Finally, I've been in the business long enough to remember when systems
were "simple", compared to the modern variety. We still lost sleep over
them. But, if we want a current example of complexity versus
simplicity, we have to look no further than the recent spate of Round
the World Ballooning attempts. Fascinating that the smallest, simplest
attempt was the most successful. It's a good job that the Wright
brothers tried a string-and-sealing-wax jobby, rather than building a
747 and trying to get that off the ground.

Very best wishes,

Bryce Kearey. HP - UK Customer Support.
----------------------------------

===========================================================================

Joe,

The subject of simplicity vs complexity is exactly the issue in the
execution of a disciplined IT Architecture planning process. In our
industry there are approximately 50 categories of technology (see the HP
detailed Open Environment Computing - Enterprise Solution Model) -- e.g.
User Interface Presentation Services, Information Management - Databases,
etc. Fitting into these categories are approximately 200 generic
component categories (e.g. Relational Database Manager, recoverable
queueing software, transaction monitor, etc.). There are virtually
thousands of products that fit into these component categories (e.g.
Databases - Oracle, Sybase, Informix, Access 97, etc.).

A 'simple' CONSERVATIVE model of how much it costs in 'time' to research,
analyze, acquire, train-on, and support a 'single' technology like Oracle,
MS Word, etc. reveals a lot.

Irrespective of the outright cost of the technology itself, it costs
about $150,000.00 to research, analyze, acquire, train-on, and support a
technology over a 3 year period.

If a good-sized business has approximately 100 technologies associated with
its infrastructure and applications portfolio, then you can expect the
accumulated costs to be 100 x $150k or $15,000,000 IRRESPECTIVE of product
price. This ASSUMES that as an organization you ONLY allow one (1)
technology per generic component category (for example, you only allow
Oracle as a Database component and not BOTH Oracle and Informix).

If you allow two (2) products to be used for each component category then
you have incurred a $30,000,000 expense, right? But wait! If the data in
the Oracle database is required by an application written for the Informix
database then we may need additional software to adapt! This means we may
have another piece of software to integrate with the infrastructure. The
expense can increase geometrically!
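
The arithmetic above can be sketched in a few lines of Python. The
$150,000 figure and the 100 component categories come from the text.
The adapter model is only an assumption for illustration: it supposes
that every pair of products within a category needs one adapter, and
that an adapter costs as much to own as any other technology. The
function names are hypothetical. Under that pairwise assumption the
total grows faster than linearly, though how fast it really grows
depends on how much integration a given shop needs.

```python
from math import comb

COST_PER_TECHNOLOGY = 150_000  # 3-year cost from the text, excluding product price
CATEGORIES = 100               # component categories in use (from the text)


def support_cost(products_per_category: int) -> int:
    """3-year cost to research, acquire, train on, and support every product."""
    return CATEGORIES * products_per_category * COST_PER_TECHNOLOGY


def cost_with_adapters(products_per_category: int,
                       adapter_cost: int = COST_PER_TECHNOLOGY) -> int:
    """Support cost plus one adapter per pair of products in each category.

    Assumption (not from the text): every pair of products in a category
    needs its own adapter, priced like any other technology.
    """
    adapters = CATEGORIES * comb(products_per_category, 2)
    return support_cost(products_per_category) + adapters * adapter_cost


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(f"{k} per category: ${support_cost(k):,} support, "
              f"${cost_with_adapters(k):,} with adapters")
```

With one product per category the sketch reproduces the $15,000,000
figure, and with two it reproduces the $30,000,000 figure before any
adapters are counted.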

As you can see, allowing additional options without carefully
scrutinizing the effects can have a dramatic economic impact. Moreover,
it puts additional stress on the people and time management aspects of the
organization.

We in the computer industry continually advise customers of the advantages
of STANDARDS. A well-thought-out Infrastructure or Application
Architecture can cost from $150,000 to $600,000 depending on
the size of the project. It is fairly easy to see that this cost can be
recovered if the architecture reduces the number of technology options by
only 6 technologies across the enterprise.
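
As a rough check on that claim, here is a minimal Python sketch of the
break-even point. It assumes that each technology removed from the
portfolio saves the full $150,000 three-year cost quoted earlier. The
function names are hypothetical.

```python
SAVING_PER_TECHNOLOGY = 150_000  # 3-year cost avoided per technology (from the text)


def technologies_to_break_even(architecture_cost: int) -> int:
    """Smallest number of eliminated technologies that repays the architecture."""
    # Integer ceiling division avoids floating-point rounding.
    return -(-architecture_cost // SAVING_PER_TECHNOLOGY)


def net_saving(architecture_cost: int, technologies_removed: int) -> int:
    """Savings from removed technologies minus the cost of the architecture."""
    return technologies_removed * SAVING_PER_TECHNOLOGY - architecture_cost


if __name__ == "__main__":
    for cost in (150_000, 600_000):
        print(f"${cost:,} architecture: break-even at "
              f"{technologies_to_break_even(cost)} technologies, "
              f"net saving with 6 removed: ${net_saving(cost, 6):,}")
```

Under this assumption, removing six technologies covers even the
upper end of the quoted cost range.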

As easy as this is to see, we still have customers who will develop
enterprise information systems infrastructures and applications with no
architecture.

I wonder how many of our spouses would be happy with a house that we had
built by our contractor without an architectural blueprint and 'simply' a
day by day list of the next steps to be done?

Regards,

Mike Short
IT Architect, HP - PSO
1-567-8594
===========================================================================

Joe, a very thought-provoking article. I have observed many such
situations in microcosm throughout my career, as teams scramble to find out
why they lost a customer, stuffed the wrong paycheck in an envelope, failed
to detect product defects, and so on. Analyses of these situations often
lead to what I call "institutionalized failure" -- the creation of a
system, process, or protocol that "makes sure THAT never happens again."
Of course, the failure that occurred may only have occurred 0.001% of the
time, and only in a very specific situation, but the team nonetheless
designs a new system that is optimized around avoidance of that specific
failure rather than the original objective.

These situations are usually highly emotionally charged, which undoubtedly
factors into the solution. They are also often accompanied by someone
powerful saying, "Don't let THAT happen again or ELSE." The problem over
time is that things go wrong, mistakes happen, and these suboptimized
solutions lead to systems that truly do break down more often because they
aren't optimized correctly. I think it is a rare talent to be able to lead
a team through some sort of crisis without letting the tendency to
over-correct for the failure take over.

Dave Clarke
W.L. Gore & Associates
