Lessons Learned from AWS Cloud Powering Amazon.com's 'Biggest Day Ever'
Amazon.com is just another customer to Amazon Web Services Inc. (AWS), the cloud computing platform that powered the online store during the recent record-setting Prime Day 2016 sale.
AWS spokesperson Jeff Barr today reported on how the cloud platform held up under the stress of new levels of global e-commerce prompted by the sale, which reportedly accounted for about three-quarters of all U.S. consumer e-commerce on July 12.
Although Amazon.com is just another customer to AWS, the two entities communicate continually, which helped AWS allocate more EC2 computing resources to handle record store Web traffic that generated 85 billion clickstream log entries.
Including EC2 computing resources, the sale prompted an uptick in the use of 38 different services, ranging from analytics (which saw a 1,661 percent increase in events) to storage and content delivery.
"AWS enables customers to add the capacity required to power big events like Prime Day, and enables this capacity to be acquired in a much more elastic, cost-effective manner," Barr said in a blog post today. "All of the undifferentiated heavy lifting required to create an online event at this scale is now handled by AWS so the Amazon retail team can focus on delivering the best possible experience for its customers."
Barr didn't address numerous media reports of technical glitches during the sale, such as customers being unable to add products to their shopping carts, as reported by Money. According to a US News & World Report article, an Amazon spokesperson said an issue with checkouts for certain Lightning Deals (limited-time offers) was resolved during the sale.
While not commenting on those reported problems, Barr said, "The Amazon retail team was happy that Prime Day was over, and ready for some rest, but they shared some of what they learned with me."
Those lessons learned, which might be beneficial to other customers planning big events, include:
- Prepare: Planning and testing are essential. Use historical metrics to help forecast and model future traffic, and to estimate your resource needs accordingly. Prepare for failures with GameDay exercises -- intentionally breaking various parts of the infrastructure and the site in order to simulate several failure scenarios (read Resilience Engineering -- Learning to Embrace Failure to learn more about GameDay exercises at Amazon).
- Automate: Reduce manual effort and automate everything. Take advantage of services that can scale automatically in response to demand -- Route 53 to automatically scale your DNS, Auto Scaling to scale your EC2 capacity according to demand, and Elastic Load Balancing for automatic failover and to balance traffic across multiple regions and availability zones (AZs).
- Monitor: Use Amazon CloudWatch metrics and alarms liberally. CloudWatch monitoring helps you stay on top of your usage to ensure the best experience for your customers.
- Think Big: Using AWS gave the team the resources to create another holiday season. Confidence in your infrastructure is what enables you to scale your big events.
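To make the "Prepare" advice concrete, here is a minimal sketch of how historical peak metrics might be turned into a rough fleet-size estimate. It is not the Amazon retail team's method, which Barr's post does not describe. The function names, the sample traffic figures, the event multiplier, the per-instance throughput and the 30 percent headroom are all hypothetical values chosen for illustration.

```python
import math


def forecast_peak(history, event_multiplier):
    """Estimate peak requests/sec for an upcoming event.

    history: past peak requests/sec, oldest first (hypothetical data).
    event_multiplier: assumed uplift of the event over a normal peak.
    """
    # Average growth ratio between consecutive historical peaks.
    growth = [later / earlier for earlier, later in zip(history, history[1:])]
    avg_growth = sum(growth) / len(growth)
    # Project the most recent peak forward one period, then apply the uplift.
    baseline = history[-1] * avg_growth
    return baseline * event_multiplier


def instances_needed(peak_rps, rps_per_instance, headroom=0.3):
    """Instances required to serve peak_rps with spare headroom."""
    return math.ceil(peak_rps * (1 + headroom) / rps_per_instance)


# Hypothetical inputs: three past peaks and an assumed 2x event uplift.
past_peaks = [1000.0, 1500.0, 2250.0]
peak = forecast_peak(past_peaks, event_multiplier=2.0)
fleet = instances_needed(peak, rps_per_instance=150.0)
print(f"forecast peak: {peak:.0f} req/s, instances: {fleet}")
```

An estimate like this is only a starting point. Load testing and GameDay-style exercises would still be needed to validate it, and an Auto Scaling policy could absorb whatever the forecast misses.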
Again emphasizing communication and AWS support plans, Barr advised AWS customers to contact the cloud service to prepare for any large-scale, one-time events.
David Ramel is an editor and writer for Converge360.