
DeepSeek V4 Drops This Week — A Trillion-Parameter Multimodal Model Trained on Chinese Chips

Michael Ouroumis · 2 min read

DeepSeek, the Hangzhou-based AI lab that shook the industry with its R1 reasoning model in January 2025, is preparing to release its most ambitious model yet. V4 is a natively multimodal system capable of generating text, images, and video — and it was built on Chinese-made silicon.

What Makes V4 Different

Unlike previous models that bolted vision capabilities onto text-only foundations, V4 was trained on text, image, video, and audio data simultaneously from the ground up. This native multimodality means the model does not treat images as an afterthought — it reasons across modalities as a single integrated system.

The numbers are significant. V4 features roughly one trillion total parameters with approximately 32 billion active per forward pass — about a 50% increase in total size over V3.2, even as active parameters drop from V3.2's 37 billion. The context window jumps to one million tokens, a major leap that positions V4 for enterprise document processing and long-form code generation.
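The gap between total and active parameters comes from the mixture-of-experts design: a router activates only a few experts per token, so most weights sit idle on any given forward pass. As a rough illustration (the expert count, top-k, and sizes below are invented for the sketch, not V4's actual configuration):

```python
def topk_route(scores, k):
    """Indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

n_experts = 64                    # invented: not V4's real expert count
k = 2                             # experts activated per token (invented)
params_per_expert = 500_000_000   # toy size

# Stand-in router logits for a single token, one score per expert.
scores = [((i * 37) % 97) / 97 for i in range(n_experts)]
active = topk_route(scores, k)

total_params = n_experts * params_per_expert
active_params = k * params_per_expert
print("active experts:", active)
print(f"active share: {active_params / total_params:.1%}")  # 2/64 = 3.1%
```

The same top-k routing idea, scaled up, is how a model can carry a trillion parameters while paying the compute cost of only a few tens of billions per token.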

The Hardware Story

The most geopolitically charged detail: DeepSeek partnered with Huawei and Cambricon to optimize V4 for their latest AI chips. Despite US export restrictions on advanced semiconductors to China, DeepSeek has demonstrated that Chinese hardware can support frontier model training. Whether V4's performance holds up against models trained on NVIDIA's latest GPUs will be the benchmark that matters most.

What It Targets

Internal testing suggests V4 is optimized primarily for coding and long-context software engineering tasks. DeepSeek claims it could outperform Claude and ChatGPT on long-context coding benchmarks — a claim the community will verify within days of release.

The model will be released under an open-source license, continuing DeepSeek's strategy of undercutting Western labs on both price and accessibility. For developers already running DeepSeek R2 locally, V4 represents a significant upgrade path.

Why It Matters

V4 is not just another model release. It is a proof point that frontier AI development can happen outside the NVIDIA ecosystem, that open-source multimodal models can compete with proprietary ones, and that China's AI labs are not slowing down despite regulatory pressure. The timing — coinciding with China's Two Sessions parliamentary meetings — underscores the strategic significance Beijing places on domestic AI capabilities.

The AI community should have weights in hand within days.

