
Nvidia Is Building a Secret Inference Chip With Groq Tech — And OpenAI Is the First Customer

Michael Ouroumis · 2 min read

Nvidia has dominated AI training for years. Now it wants to own inference too — and it's using $20 billion worth of acquired technology to do it.

The Secret Chip

According to a report from SiliconANGLE, Nvidia is preparing to unveil a new inference-focused processor at its annual GTC developer conference in San Jose later this month. The chip integrates the Language Processing Unit (LPU) architecture that Nvidia licensed from Groq Inc. in December for $20 billion, a deal that also brought over Groq's founding CEO Jonathan Ross and President Sunny Madra.

Groq's LPU architecture takes a fundamentally different approach to inference. Instead of repurposing GPUs designed for training, LPUs are built from the ground up to serve language model outputs with dramatically lower latency and energy consumption.

OpenAI Signs On First

The biggest signal of the chip's potential: OpenAI has already committed as the lead customer. The deal includes a massive purchase of dedicated inference capacity, backed by a $30 billion investment from Nvidia into OpenAI's infrastructure. That's not a research partnership — it's a production-scale commitment.

For OpenAI, which runs ChatGPT for over 900 million users, inference costs dwarf training costs. A chip purpose-built for fast, efficient model serving could meaningfully change the economics of running frontier models at consumer scale.

Why Inference Matters Now

The AI industry has reached an inflection point. Training the biggest models still requires enormous GPU clusters, but the real cost center has shifted. Every ChatGPT response, every Copilot suggestion, every Claude conversation is an inference workload. Companies are spending more on running models than building them.

Nvidia currently controls over 90% of the GPU market for AI training, but inference is more competitive. AMD, Intel, AWS custom silicon, and startups like Cerebras are all targeting the inference market. The Groq acquisition gives Nvidia a purpose-built architecture rather than just optimizing existing GPUs.

What to Watch at GTC

GTC 2026 runs later this month and is expected to be Nvidia's biggest product launch since the Blackwell architecture. Beyond the inference chip, CEO Jensen Huang is expected to detail the full Rubin platform roadmap and new software tools for agentic AI workloads.

The inference chip could reshape how AI companies budget their compute. If it delivers on the efficiency promises of Groq's LPU architecture, running frontier models could get a lot cheaper.

