In our 24/7/365 world, computing infrastructure outages can kill a CIO’s reputation and career prospects swiftly and dramatically. Outages have attained an extremely high profile in most organizations because they visibly and quickly:
Undermine customer service.
Cause work to grind to a halt.
Damage brand reputation.
Computing infrastructure outages occur for many reasons, including:
Failing to monitor end-to-end response time.
Sloppy server management.
Gaps in configuration management processes.
External and internal network issues.
DBA finger trouble, such as mistyped commands.
Flaky application execution.
External and internal electrical power outages.
Scheduled maintenance taking too long.
At the recent Collision from Home virtual conference, Sebastien Stormacq, Principal Developer Advocate at Amazon Web...