Friday, July 10, 2015

Massive corporate outages: lessons from the past

So, the NYSE trading floor outage on Wednesday was due to a “botched move”, where “move” is shop slang for an “elevation” or “promotion” of changes into production.

News stories have compared the uptime of modern Internet systems to Ma Bell’s in its monopoly days, and noted the enormous manpower it took to maintain that reliability.

United Airlines’ problems were due to a bad router. The WSJ’s were unexplained.

I remember how we used to brace for “moves”: there was a Wednesday noon deadline for freezing changes, and most elevations happened only on Fridays, for the weekend cycles. In the early 1990s, most shops adopted change-control management tools (like CA-Librarian) that froze various components and guaranteed move integrity.

It’s also interesting right now how recollections from my last two years in customer service workbench support (2000-2001) help me solve problems “on the road” today.

