CiteWeb id: 20070000125

CiteWeb score: 2907

DOI: 10.1145/1323293.1294281

Reliability at massive scale is one of the biggest challenges we face at, one of the largest e-commerce operations in the world; even the slightest outage has significant financial consequences and impacts customer trust. The platform, which provides services for many web sites worldwide, is implemented on top of an infrastructure of tens of thousands of servers and network components located in many datacenters around the world. At this scale, small and large components fail continuously and the way persistent state is managed in the face of these failures drives the reliability and scalability of the software systems. This paper presents the design and implementation of Dynamo, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience. To achieve this level of availability, Dynamo sacrifices consistency under certain failure scenarios. It makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.