
The AI price war just officially started. DeepSeek announced today (May 23, 2026) that it will permanently maintain a 75% price cut on its flagship V4-Pro model. Per Semafor, Techzine, and Reuters reporting via TipRanks, the new pricing makes V4-Pro $0.87 per million output tokens — a fraction of what OpenAI, Anthropic, and Google charge for comparable-class models.
The move is not a temporary promotion. DeepSeek explicitly framed it as permanent, sending a clear signal to the Western AI labs: the floor has dropped. Anthropic, simultaneously, has publicly accused DeepSeek of using "distillation attacks" to train V4-Pro on competitor model outputs — escalating the diplomatic and IP angle of the price war.
Here's exactly what changed, why it matters for your AI bills, who gets squeezed first, and what indie hackers and AI engineers should do this week.
The New DeepSeek V4-Pro Pricing
Per ThenextWeb, OpenSource For U, Techzine Global, and Reuters/TipRanks:
- Output tokens: $0.87 per million (down from ~$3.50 pre-cut)
- Input tokens: Estimated ~$0.27 per million (cache-optimized)
- Pricing structure: Made permanent — DeepSeek used the word "permanent" specifically to differentiate from typical promotional pricing.
- Effective date: Already in effect on the DeepSeek Platform and reflected at all reseller routes (including OpenRouter).
- Apache 2.0 licensing still applies — you can self-host the model weights and pay $0 in inference if you have the hardware.

DeepSeek's inference infrastructure in China runs at a fundamentally different cost basis than US competitors.
How DeepSeek V4-Pro Stacks Against the Field — Now
| Model | Input / 1M | Output / 1M | Provider |
|---|---|---|---|
| DeepSeek V4-Pro | $0.27 | $0.87 | DeepSeek (or self-host) |
| GPT-4o mini | $0.15 | $0.60 | OpenAI |
| Claude Haiku 3.5 | $1.00 | $5.00 | Anthropic |
| Gemini 3.5 Flash | $1.50 | $9.00 | |
| Claude Sonnet 4.6 | $3.00 | $15.00 | Anthropic |
DeepSeek V4-Pro is now 10× cheaper than Claude Sonnet 4.6 on output and roughly 10× cheaper than Gemini 3.5 Flash. The only model still in the "cheaper than DeepSeek" category for pure token cost is GPT-4o mini — and that model is significantly less capable on coding and agentic tasks.
The Anthropic "Distillation Attack" Accusation
Simultaneously with DeepSeek's price cut, Anthropic has publicly accused DeepSeek of training V4-Pro using "distillation attacks" — a technique where you query a competitor's model heavily, then use the responses as training data for your own. Per ThenextWeb's reporting:
- Anthropic alleges that DeepSeek's V4-Pro shows linguistic and reasoning patterns consistent with Claude-derived training data.
- Distillation is technically possible — anyone with API access to Claude can do it at small scale.
- DeepSeek has not publicly responded to the allegations as of this writing.
- The accusation matters because if true, it shifts the price-war narrative from "Chinese AI ingenuity" to "Chinese AI piggybacking on US R&D."
The allegations are unproven but the timing is significant. Anthropic appears to be preemptively poisoning the price-war narrative — making it harder for enterprise customers to switch to DeepSeek without political/IP risk.
Who Gets Squeezed First

The price floor just dropped 75%. Every other AI lab now has to defend their pricing publicly.
Five tiers of impact:
- Anthropic (Claude Haiku 3.5): Most direct pressure. Claude Haiku at $5/M output now looks expensive vs DeepSeek's $0.87/M. Anthropic must either cut prices in Q3 or successfully convince enterprises that the distillation accusation matters.
- Google (Gemini 3.5 Flash): $9/M output now looks 10× more expensive. Google has the deepest pockets — expect a defensive price cut within 60 days.
- OpenAI (GPT-4o mini, GPT-5 cheaper tiers): Less affected on the cheapest tier (mini is still cheaper) but their upcoming IPO requires showing growing revenue per token. A price war right before IPO is the worst possible timing.
- Mistral, xAI, Cohere: Smaller labs with less margin. Most at risk of being unprofitable at the new floor.
- Self-hosting / Ollama users: Big winners. DeepSeek V4-Pro weights are Apache 2.0 — you can run it locally with the same capability and pay $0 for inference. The cost narrative now favors "host it yourself" more than ever.
What This Means For Your AI Bills
For anyone running a real AI workload — content generation, RAG, agents, coding tools — the practical implications:
- Pure-text agentic workloads: Switch to DeepSeek V4-Pro. The performance is competitive with Claude Sonnet 4.6 on coding (per recent Business Insider testing) but at 1/10th the price.
- Multimodal workloads: Stay on Gemini 3.5 Flash for now. DeepSeek V4-Pro is text-only.
- Agent workflows (tool calling, function chaining): Mix: DeepSeek for the cheap-output stages, Claude/Gemini for the multimodal / vision stages.
- RAG pipelines with prompt caching: DeepSeek's pricing already benefits from caching. Stay on DeepSeek.
- Regulated industries (HIPAA, GDPR, financial): Stay on US-hosted providers (Anthropic, OpenAI, Google). Don't route sensitive data to DeepSeek's hosted endpoints — self-host the weights if you want the cost savings.
FAQ
Why did DeepSeek cut prices 75%?
Multiple drivers: (1) Chinese government-backed AI strategy to capture global market share via aggressive pricing; (2) Chinese inference infrastructure runs at lower energy cost than US providers; (3) DeepSeek is open-sourcing the V4-Pro model weights anyway under Apache 2.0, so hosted-API pricing is mainly about ecosystem capture, not profit per call.
Is DeepSeek V4-Pro as good as Claude Sonnet 4.6?
Close. On code-only tasks (HumanEval, SWE-bench) the gap is roughly 4-7 points in Claude's favor. On long-context reasoning and instruction-following, DeepSeek V4-Pro is competitive but Claude has the edge. For most production agentic use cases, DeepSeek's 10× price advantage outweighs the quality gap.
Should I be worried about Anthropic's distillation accusations?
For commercial use — probably not, the legal precedent is unsettled. For mission-critical or government-adjacent workloads — yes, the IP risk is real. Consult your legal team before routing customer data through DeepSeek hosted endpoints. Self-hosting the weights eliminates the data-flow concern.
Will OpenAI and Anthropic match DeepSeek's pricing?
Anthropic — probably not, they're betting on quality differentiation. OpenAI — already has GPT-4o mini at $0.60/M output (cheaper than DeepSeek). The real question is what happens to the mid-tier (Claude Sonnet, GPT-5 standard) prices in Q3-Q4 2026. Expect 30-50% cuts at minimum.
Where can I use DeepSeek V4-Pro today?
Three paths: (1) DeepSeek Platform directly at platform.deepseek.com; (2) OpenRouter (one API, many models — just raised $113M Series B); (3) self-host the weights via Ollama or vLLM on your own hardware.
Final Word
DeepSeek's permanent 75% price cut is the most significant pricing event in AI since GPT-4 launched at $0.06/1K tokens in early 2023. The price floor just shifted permanently lower. Every Western AI lab now has to choose: cut prices and protect market share, or differentiate on quality and bet enterprise customers won't switch.
For builders, the right move this week is simple — switch one workload to DeepSeek V4-Pro via OpenRouter and measure the actual quality delta on your specific task. If the answer is "indistinguishable for my use case," your AI bill just dropped 90%. That math compounds fast.
📩 AI pricing wars + every major model launch, covered daily.
Subscribe to Tech4SSD. Free. Sign up →
Sources: Semafor, ThenextWeb, OpenSource For U, Techzine Global, TipRanks (Reuters), Reuters direct. Reporting accurate as of May 22, 2026.