On Monday morning, an Amazon Web Services (AWS) outage left crucial web tools inaccessible. Many instructors on campus exercised leniency with their students in response. Students in University of California, Riverside (UCR) assistant professor Rich Yueh’s BUS 101 – Information Technology Management class, for example, were told regarding their review assignments, “Hold onto it for now, wait until Canvas opens. We will waive late penalties.”

Globally, major social media platforms, e-commerce websites and financial services were disrupted, along with Amazon’s own internal operations. Some individuals were even woken in the middle of the night by malfunctioning mattresses that relied on AWS technology to control bed positioning and temperature regulation.

The loss of crucial campus resources, including learning management systems, video conferencing platforms and workplace communication tools, interrupted the daily flow of life on and off campus for thousands of students and faculty.

Services were restored Monday afternoon. In the days following, AWS re-evaluated its systems and issued a post-event summary apologizing for the inconvenience caused to clients. In it, the company pledged to prevent similar issues in the future and, should another problem occur, to reduce system turnaround time. It reported that the outage was caused by a Domain Name System (DNS) error affecting its DynamoDB database service in the Virginia-based US-East-1 region, an error that required manual correction.

For institutions like UCR, whose data storage and infrastructure rely heavily on cloud-based services such as those hosted on AWS, the outage serves as a stark warning of how one technical failure can trigger a domino effect of disruptions. It has left businesses and individuals wondering how they can “outage-proof” themselves against unreliable web systems moving forward.
