DeepSeek
DeepSeek is a Chinese AI firm specializing in large language models, based in Hangzhou.
AI Native: Yes
Number of tools: 1
Number of employees:
Profitable: No
Valuation: $15B
Most popular AI tool:
Tools
- Rumor Share
Models
- Second-generation DeepSeek OCR model, “Visual Causal Flow,” aimed at more human-like visual encoding, with dynamic resolution support and strong document-to-Markdown and layout-aware OCR for images and PDFs. (New; Text; released 1mo ago)
- DeepSeek-V3.2-Speciale is a 685B-parameter research-only variant of DeepSeek-V3.2 that pushes open-weight reasoning ability to the limit, but disables tool calling and is intended purely for experimentation rather than everyday agent use. (Text; released 3mo ago)
- DeepSeek V3.2 is the core, stable 3.2-series model designed as a strong general-purpose LLM. It builds on the DeepSeek V3 architecture and delivers balanced performance across reasoning, writing, coding, and multilingual tasks. (Text; released 3mo ago)
- DeepSeek-Math-V2 is a math-specialized LLM built on DeepSeek-V3.2-Exp-Base, trained to generate and verify step-by-step proofs. It uses a learned verifier as a reward model so the generator learns to fix its own reasoning, reaching gold-level scores on IMO 2025 and CMO 2024 and a near-perfect score on Putnam 2024 with scaled test-time compute. (Text; released 3mo ago)
- LLM-centric OCR model using “Contexts Optical Compression” to explore visual-text compression and provide fast streaming and batch OCR for images and PDFs via vLLM and Transformers runtimes. (Text; released 5mo ago)
- DeepSeek V3.2 Exp is an experimental build of the DeepSeek V3 line, tuned for deeper reasoning and stronger coding while keeping latency practical. It supports long context, function/tool calling, and schema-true JSON, which suits RAG, agents, and repo-scale tasks where extra accuracy matters. (Text; released 5mo ago)
- DeepSeek-V3.1-Terminus is DeepSeek’s flagship reasoning model, tuned for difficult analysis, math, and coding. It supports very long context, function/tool calling, reliable JSON outputs, and an optional extended-thinking mode, making it suited to enterprise RAG, agents, and high-stakes workflows. (Text; released 5mo ago)
- DeepSeek R1 is a reasoning-first large language model built to solve complex problems with explicit multi-step thinking. It excels at math, coding, and logical analysis, supports long context, tool/function calling, and structured JSON outputs, and can trade latency for higher accuracy via extended "thinking" budgets. (Text; released 1y ago)
- (Text; released 1y ago)
- (Text; released 1y ago)
- (Text; released 2y ago)
- (Text; released 2y ago)
Papers
- Pointer-CAD: Unifying B-Rep and Command Sequences via Pointer-based Edges & Faces Selection. Beihang University. Published on: 2026-03-04. 1 author.
- DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference. Published on: 2026-02-26. 1 author.
- DeepSeek-OCR 2: Visual Causal Flow. Published on: 2026-01-28. 1 author.
- Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models. Peking University. Published on: 2026-01-12. 1 author.
- mHC: Manifold-Constrained Hyper-Connections. Published on: 2026-01-05. 1 author.
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. Published on: 2026-01-04. 1 author.
- Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures. Published on: 2025-12-23. 1 author.
- DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models. Published on: 2025-12-02. 1 author.
- Inference-Time Scaling for Generalist Reward Modeling. Published on: 2025-09-25. 1 author.
- X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms. University of Illinois Urbana-Champaign. Published on: 2025-08-18. 1 author.
- DeepSeek-Prover-V2: Advancing Formal Mathematical Reasoning via Reinforcement Learning for Subgoal Decomposition. Published on: 2025-07-18. 1 author.
- Janus-Pro: Unified Multimodal Understanding and Generation with Data and Model Scaling. Published on: 2025-01-29. 1 author.
- DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding. Published on: 2024-12-13. 1 author.
Repositories
No repositories yet.
