AWS, Fueled by Nvidia, Gets In on AI Compute Wars -- AWSInsider

By Gladys Rama
11/28/2023

Not to be outdone (too much) by Microsoft, Amazon Web Services on Tuesday announced that it is collaborating with chip giant Nvidia on multiple AI fronts.

The announcements, made this week during AWS' 2023 re:Invent conference, further establish AWS as a key, if lagging, player in the generative AI race. Although it leads the cloud market, AWS has been slow out of the gate in productizing its platform's AI capabilities, especially compared to Microsoft.

However, some recent AWS investments (for instance, in Claude chatbot steward Anthropic and in a potential large language model dubbed "Olympus") and product launches (like the generative AI developer platform Bedrock) have narrowed the gap. The newly expanded partnership with Nvidia -- which has a stranglehold on the AI chip market -- stands to make AWS even more competitive.

Nvidia AI Power Comes to EC2
For instance, AWS is bringing the massive compute power of Nvidia's GH200 Grace Hopper Superchips to its customers via its Elastic Compute Cloud (EC2) service.

This means AWS customers who need to run resource-intensive, distributed and complex AI and machine learning workloads will be able to rent the chip power to do so from AWS whenever they need it -- at a time when AI-capable chips are particularly scarce. AWS claims to be "the first cloud provider" to offer such access to its customers.

"AWS instances with GH200 NVL32 will provide customers on-demand access to supercomputer-class performance, which is critical for large-scale AI/ML workloads that need to be distributed across multiple nodes for complex generative AI workloads -- spanning FMs [foundation models], recommender systems, and vector databases," AWS said in a press release Tuesday.

Nvidia is also supporting three new EC2 instances designed for large workloads, including AI model training and inferencing, 3-D AI development, digital twins and more. Coming next year, the new EC2 instances are G6, G6e and P5e. They'll be powered by, respectively, Nvidia's L4, L40S and H200 Tensor Core chips.

'World's Fastest' AI Supercomputer
AWS is also working with Nvidia on an AI supercomputer called "Project Ceiba," which the two companies are touting as the "world's fastest GPU-powered AI supercomputer."

AWS has enabled Ceiba to integrate with its product stack, including Amazon Virtual Private Cloud and Amazon Elastic Block Store. Powering the Ceiba supercomputer are over 16,000 of Nvidia's GH200 Superchips, giving it enough horsepower to run 65 exaflops' worth of AI workloads.

When it's done, Ceiba will serve as a sandbox for Nvidia's army of researchers looking to "advance AI for LLMs, graphics (image/video/3D generation) and simulation, digital biology, robotics, self-driving cars, Earth-2 climate prediction, and more."

Notably, Nvidia has also built supercomputers with AWS rival Microsoft, including the Azure supercomputer dubbed "Eagle," which was recently rated the world's third-fastest supercomputer and the fastest one based in the cloud.

Nvidia Extends Developer Software to AWS
AWS and Nvidia are also collaborating on developer software. For instance, AWS will host Nvidia's DGX Cloud AI-training-as-a-service platform on its cloud.

"It will be the first DGX Cloud featuring GH200 NVL32, providing developers the largest shared memory in a single instance," according to AWS. "DGX Cloud on AWS will accelerate training of cutting-edge generative AI and large language models that can reach beyond 1 trillion parameters."

In addition, AWS developer customers will have access to the NeMo Retriever microservice from Nvidia. The tool lets developers "create highly accurate chatbots and summarization tools using accelerated semantic retrieval."

In a prepared statement, Nvidia CEO Jensen Huang characterized the collaboration with AWS as emblematic of the two companies' mission to bring AI to everyday customers.

"Generative AI is transforming cloud workloads and putting accelerated computing at the foundation of diverse content generation," Huang said. "Driven by a common mission to deliver cost-effective, state-of-the-art generative AI to every customer, NVIDIA and AWS are collaborating across the entire computing stack, spanning AI infrastructure, acceleration libraries, foundation models, to generative AI services."

About the Author

Gladys Rama (@GladysRama3) is the editorial director of Converge360.