
AWS and NVIDIA Announce Million-GPU Deployment in Expanded AI Infrastructure Partnership

Michael Ouroumis · 2 min read

Amazon Web Services and NVIDIA announced a significantly expanded partnership at GTC 2026, with AWS committing to deploy more than one million NVIDIA GPUs across its cloud regions — a scale of AI infrastructure investment that underscores just how aggressively hyperscalers are racing to meet enterprise AI demand.

A Million GPUs and Counting

The headline number is staggering: over one million NVIDIA GPUs deployed across AWS regions starting in 2026. The commitment makes AWS one of the largest single customers for NVIDIA's AI accelerators and positions Amazon's cloud division to capture a growing share of enterprise AI workloads that require massive compute.

The deployment spans multiple NVIDIA GPU generations. Notably, AWS will be the first major cloud provider to offer NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs on Amazon EC2, giving developers access to the latest Blackwell architecture for inference and training workloads.

Nemotron Models Come to Bedrock

Beyond hardware, the partnership brings NVIDIA's open AI models directly into Amazon's managed AI platform. NVIDIA Nemotron 3 Super — a hybrid mixture-of-experts model with 120 billion total parameters but only 12 billion active per forward pass — is coming to Amazon Bedrock. The model is designed for complex multi-agent workloads, including software development, cybersecurity triage, and extended reasoning tasks.

The smaller Nemotron 3 Nano is already available on Amazon Bedrock, and is also available separately within Salesforce Agentforce on NVIDIA's own infrastructure. Critically, developers will soon be able to fine-tune Nemotron models directly on Bedrock using reinforcement fine-tuning, lowering the barrier to customizing open models for enterprise-specific use cases.
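For developers curious what using such a model would look like in practice, here is a minimal sketch of a request in the shape of Bedrock's Converse API. The model identifier shown is hypothetical (AWS has not published the Nemotron model ID as of this writing); in a real application the dict would be passed to a `boto3` `bedrock-runtime` client's `converse` call.

```python
import json

# Hypothetical model ID -- check the Bedrock model catalog for the actual identifier.
MODEL_ID = "nvidia.nemotron-3-super-v1:0"

def build_converse_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build a request body in the shape of Bedrock's Converse API.

    In a real application this would be sent with:
        boto3.client("bedrock-runtime").converse(modelId=MODEL_ID, **request)
    """
    return {
        "messages": [
            # Converse-style message: a role plus a list of content blocks.
            {"role": "user", "content": [{"text": prompt}]},
        ],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.2},
    }

request = build_converse_request("Summarize our incident-response runbook.")
print(json.dumps(request, indent=2))
```

The appeal of the managed route is that this same request shape works across Bedrock-hosted models, so swapping in a fine-tuned Nemotron variant later is largely a change of model ID rather than a rewrite.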

Infrastructure Innovations

The collaboration extends into infrastructure optimization. AWS is integrating NVIDIA NIXL for interconnect acceleration to support disaggregated large language model inference on its Elastic Fabric Adapter network. The companies also highlighted 3x faster Apache Spark performance using Amazon EMR on Elastic Kubernetes Service with EC2 G7e instances — a meaningful improvement for data engineering pipelines that feed AI workloads.

The Bigger Picture

This partnership fits within NVIDIA CEO Jensen Huang's broader GTC narrative about the "inference inflection" — the idea that AI has crossed the threshold where it can do productive work at scale, and the bottleneck is now infrastructure supply rather than model capability.

For enterprises evaluating where to run AI workloads, the AWS-NVIDIA expansion means more GPU availability, tighter model integration, and a clearer path from prototyping on managed services to production-scale deployment. As AI spending accelerates — industry analysts project total global AI spending to reach $2.5 trillion by the end of 2026, with AI infrastructure accounting for approximately $1.37 trillion of that total — partnerships of this scale will determine which cloud platforms capture the next wave of enterprise AI adoption.

