Overview
DeepSeek-Math-V2 is a math-specialized LLM built on DeepSeek-V3.2-Exp-Base, trained to generate and verify step-by-step proofs. It uses a learned verifier as a reward model so the generator learns to fix its own reasoning. With scaled test-time compute, it reaches gold-medal-level scores on contests such as IMO 2025 and CMO 2024, and a near-perfect score on Putnam 2024.
Description
DeepSeek-Math-V2 targets self-verifiable mathematical reasoning instead of only chasing final answer accuracy. The authors first train a rigorous LLM-based proof verifier, then use it as the reward model in reinforcement learning so the proof generator is pushed to detect and repair issues in its own derivations before finalizing them. They further scale verification compute to label new hard-to-verify proofs and iteratively strengthen the verifier. This loop yields a model with strong theorem-proving performance on IMO-ProofBench and recent competitions such as IMO 2025, CMO 2024, and Putnam 2024, suggesting that scalable self-checking is a viable path toward more reliable deep mathematical reasoning systems.
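The generate-verify-revise loop described above can be sketched at a high level. The functions below are toy stand-ins invented for illustration only: the real system uses LLM-based proof generation and verification with reinforcement learning during training, not a numeric scoring loop like this.

```python
# Toy sketch of a self-verifying proof loop. All functions here are
# hypothetical placeholders, not part of any DeepSeek API.

def generate_proof(problem, feedback=None):
    # Toy "generator": proof quality improves when revised with
    # verifier feedback (stands in for an LLM rewriting its proof).
    quality = 0.5 if feedback is None else min(1.0, feedback + 0.25)
    return {"problem": problem, "quality": quality}

def verify_proof(proof):
    # Toy "verifier": scores a proof in [0, 1]; in the real system an
    # LLM verifier plays this role and also serves as the RL reward model.
    return proof["quality"]

def self_verifying_generate(problem, threshold=0.9, max_rounds=4):
    """Generate a proof, score it with the verifier, and revise until
    the verifier's score clears the threshold or rounds run out."""
    proof = generate_proof(problem)
    for _ in range(max_rounds):
        reward = verify_proof(proof)
        if reward >= threshold:
            break
        # Revise the proof using the verifier's score as feedback.
        proof = generate_proof(problem, feedback=reward)
    return proof, verify_proof(proof)

proof, reward = self_verifying_generate("Show that sqrt(2) is irrational.")
```

Even in this toy form, the loop captures the key idea: the verifier's score gates whether a proof is finalized, so improving the verifier directly pressures the generator to repair its own reasoning.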
About DeepSeek
DeepSeek is a Chinese AI firm specializing in large language models, based in Hangzhou.
Industry:
Artificial Intelligence
Company Size:
N/A
Location:
Hangzhou, Zhejiang, CN