Baidu Models
- Qianfan-OCR (Multimodal; released 12 days ago): a 4B end-to-end document intelligence vision-language model that performs direct image-to-Markdown conversion and supports prompt-driven document tasks such as table extraction, chart understanding, document QA, and key information extraction.
- (Text; released 1 month ago): a production-ready OCR and document AI toolkit that turns images and PDFs into structured data, with multilingual OCR, layout analysis, and VLM-based document parsing.
- (Text; released 4 months ago): a multimodal MoE model that "looks, reads, and reasons" across images, video, and text. It adds tool use and a "Thinking with Images" mode, supports long context, and activates about 3B parameters per token for flagship-level VLM quality at practical latency.
- ERNIE 5 (Text; released 4 months ago): Baidu's next-generation general model for reasoning, coding, and multimodal understanding. It supports long context, tool and function calling, reliable JSON output, streaming, and enterprise guardrails, making it a strong default for RAG, agents, and document or chart analysis.
- PaddleOCR-VL (Multimodal; released 5 months ago): a vision-language model built around PaddleOCR that reads documents, forms, tables, charts, and screenshots. It combines strong OCR with reasoning over layout and content, then answers in text or structured JSON for multimodal RAG and automation.
- Qianfan-VL-3B (Text; released 6 months ago): Baidu's lightweight VLM for cost-sensitive, real-time multimodal apps. It processes images plus text and returns grounded answers with basic OCR and layout understanding, long context, tool/function calling, and JSON outputs, optimized for speed and efficiency.
- Qianfan-VL-8B (Text; released 6 months ago): Baidu's mid-size vision-language model. It reads images (docs, charts, screenshots, photos) alongside text and returns grounded answers with solid OCR, layout understanding, multi-image reasoning, long context, tool/function calling, and reliable JSON outputs, balanced for quality and latency.
- Qianfan-VL-70B (Text; released 6 months ago): Baidu's large vision-language model on the Qianfan platform. It ingests images (docs, charts, screenshots, photos) with text and produces grounded answers, featuring strong OCR and layout understanding, long context, tool/function calling, streaming, and reliable JSON outputs for multimodal RAG and enterprise apps.
- ERNIE 4.5 Turbo (Text; released 6 months ago): Baidu's high-throughput, cost-optimized variant of ERNIE 4.5. It delivers strong reasoning and coding with long-context options, tool/function calling, JSON outputs, and streaming, and is production-ready via ERNIE Bot and the Qianfan API.
- ERNIE X1.1 (Text; released 6 months ago): Baidu's upgraded "deep-thinking" reasoning model, unveiled on September 9, 2025 at Wave Summit. Versus ERNIE X1, it improves factuality (+34.8%), instruction following (+12.5%), and agentic skills (+9.6%). It is available in ERNIE Bot/Wenxiaoyan and via the Qianfan API.
- ERNIE 4.5-21B-A3B (Text; released 8 months ago): Baidu's efficient MoE variant of ERNIE 4.5, with about 21B total parameters and roughly 3B active per token, built to balance strong reasoning and coding accuracy with low latency. It supports long context, tool/function calling, structured JSON output, and streaming via ERNIE Bot and the Qianfan API.
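Several of the entries above (Qianfan-OCR, PaddleOCR-VL, the Qianfan-VL family) take an image and return Markdown or structured text. As a minimal sketch, assuming the Qianfan platform exposes an OpenAI-compatible chat-completions interface and that `qianfan-ocr` is a valid model identifier (both are assumptions, not confirmed by this page), an image-to-Markdown request payload could be assembled like this:

```python
import base64
import json

# Assumed endpoint -- a placeholder; check the Qianfan docs for the real URL.
QIANFAN_URL = "https://qianfan.baidubce.com/v2/chat/completions"

def build_ocr_request(image_bytes: bytes,
                      prompt: str = "Convert this document to Markdown.") -> dict:
    """Build an OpenAI-style multimodal chat payload: one text prompt plus one
    inline image encoded as a base64 data URL."""
    data_url = "data:image/png;base64," + base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": "qianfan-ocr",  # hypothetical model id, not taken from this page
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }

# Stand-in bytes; a real call would read an actual PNG/JPEG from disk.
payload = build_ocr_request(b"\x89PNG...")
print(json.dumps(payload)[:80])
```

Building the payload separately from sending it keeps the sketch testable offline; the same dict would then be POSTed to the endpoint with an API key in the `Authorization` header.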
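Most of the ERNIE entries advertise tool/function calling and reliable JSON output. A hedged sketch of what such a request could look like, again assuming an OpenAI-style schema; the model id `ernie-x1.1`, the tool name `extract_table`, and the `response_format` field are all illustrative assumptions, not details from this page:

```python
import json

def build_tool_call_request(question: str) -> dict:
    """Build an OpenAI-style chat payload that declares one callable tool and
    asks for JSON-only output."""
    return {
        "model": "ernie-x1.1",  # placeholder id; consult Qianfan docs for real names
        "messages": [{"role": "user", "content": question}],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "extract_table",  # hypothetical tool
                    "description": "Extract a table from a document as JSON rows.",
                    "parameters": {
                        "type": "object",
                        "properties": {"page": {"type": "integer"}},
                        "required": ["page"],
                    },
                },
            }
        ],
        # 'Reliable JSON' mode, if the endpoint supports an OpenAI-style flag:
        "response_format": {"type": "json_object"},
    }

req = build_tool_call_request("Extract the table on page 2.")
print(req["tools"][0]["function"]["name"])  # prints extract_table
```

If the model decides to call the tool, the response would carry the function name and JSON-encoded arguments for the client to execute and feed back in a follow-up message.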
