
DeepSeek V4 Drops This Week — A Trillion-Parameter Multimodal Model Trained on Chinese Chips

Michael Ouroumis · 2 min read

DeepSeek, the Hangzhou-based AI lab that shook the industry with its R1 reasoning model in January 2025, is preparing to release its most ambitious model yet. V4 is a natively multimodal system capable of generating text, images, and video — and it was built on Chinese-made silicon.

What Makes V4 Different

Unlike previous models that bolted vision capabilities onto text-only foundations, V4 was trained on text, image, video, and audio data simultaneously from the ground up. This native multimodality means the model does not treat images as an afterthought — it reasons across modalities as a single integrated system.

The numbers are significant. V4 features roughly one trillion total parameters with approximately 32 billion active per forward pass — about a 50% increase in total size over V3.2, even as active parameters drop from V3.2's 37 billion. The context window jumps to one million tokens, a major leap that positions V4 for enterprise document processing and long-form code generation.
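The gap between total and active parameters comes from the mixture-of-experts design: a router activates only a few experts per token, so most weights sit idle on any given forward pass. As a rough illustration (the expert count, top-k, and sizes below are invented for the sketch, not V4's actual configuration):

```python
def topk_route(scores, k):
    """Indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

n_experts = 64                    # invented: not V4's real expert count
k = 2                             # experts activated per token (invented)
params_per_expert = 500_000_000   # toy size

# Stand-in router logits for a single token, one score per expert.
scores = [((i * 37) % 97) / 97 for i in range(n_experts)]
active = topk_route(scores, k)

total_params = n_experts * params_per_expert
active_params = k * params_per_expert
print("active experts:", active)
print(f"active share: {active_params / total_params:.1%}")  # 2/64 = 3.1%
```

The same top-k routing idea, scaled up, is how a model can carry a trillion parameters while paying the compute cost of only a few tens of billions per token.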

The Hardware Story

The most geopolitically charged detail: DeepSeek partnered with Huawei and Cambricon to optimize V4 for their latest AI chips. Despite US export restrictions on advanced semiconductors to China, DeepSeek has demonstrated that Chinese hardware can support frontier model training. Whether V4's performance holds up against models trained on NVIDIA's latest GPUs will be the benchmark that matters most.

What It Targets

Internal testing suggests V4 is optimized primarily for coding and long-context software engineering tasks. DeepSeek claims it could outperform Claude and ChatGPT on long-context coding benchmarks — a claim the community will verify within days of release.

The model will be released under an open-source license, continuing DeepSeek's strategy of undercutting Western labs on both price and accessibility. For developers already running DeepSeek R2 locally, V4 represents a significant upgrade path.

Why It Matters

V4 is not just another model release. It is a proof point that frontier AI development can happen outside the NVIDIA ecosystem, that open-source multimodal models can compete with proprietary ones, and that China's AI labs are not slowing down despite regulatory pressure. The timing — coinciding with China's Two Sessions parliamentary meetings — underscores the strategic significance Beijing places on domestic AI capabilities.

The AI community should have weights in hand within days.

