Baidu

Baidu is a Chinese multinational technology company specializing in internet-related services, products, and artificial intelligence.
AI Native: No
Number of tools: 0
Number of employees: 10k+
Valuation: $47.10B

Models

  • Qianfan-OCR is a 4B end-to-end document intelligence vision-language model that performs direct image-to-Markdown conversion and supports prompt-driven document tasks like table extraction, chart understanding, document QA, and key information extraction.
    New · Multimodal
    Released 12d ago
  • Production-ready OCR and document AI toolkit that turns images and PDFs into structured data, with multilingual OCR, layout analysis, and VLM-based document parsing.
    New · Text
    Released 1mo ago
  • Image
    Released 3mo ago
  • A multimodal MoE model that “looks, reads, and reasons” across images, video, and text. It adds tool use and a Thinking with Images mode, supports long context, and activates about 3B parameters per token for flagship-level VLM quality at practical latency.
    Text
    Released 4mo ago
  • ERNIE 5 is Baidu’s next-gen general model for reasoning, coding, and multimodal understanding. It supports long context, tool and function calling, reliable JSON, streaming, and enterprise guardrails, making it a strong default for RAG, agents, and document or chart analysis.
    Text
    Released 4mo ago
  • PaddleOCR-VL is a vision-language model built around PaddleOCR that reads documents, forms, tables, charts, and screenshots. It combines strong OCR with reasoning over layout and content, then answers in text or structured JSON for multimodal RAG and automation.
    Multimodal
    Released 5mo ago
  • Qianfan-VL-3B is Baidu’s lightweight VLM for cost-sensitive, real-time multimodal apps. It processes images plus text and returns grounded answers with basic OCR and layout understanding, long context, tool/function calling, and JSON outputs—optimized for speed and efficiency.
    Text
    Released 6mo ago
  • Qianfan-VL-8B is Baidu’s mid-size vision-language model. It reads images (docs, charts, screenshots, photos) alongside text and returns grounded answers with solid OCR, layout understanding, multi-image reasoning, long context, tool/function calling, and reliable JSON outputs—balanced for quality and latency.
    Text
    Released 6mo ago
  • Qianfan-VL-70B is Baidu’s large vision-language model on the Qianfan platform. It ingests images (docs, charts, screenshots, photos) with text and produces grounded answers, featuring strong OCR and layout understanding, long context, tool/function calling, streaming, and reliable JSON outputs for multimodal RAG and enterprise apps.
    Text
    Released 6mo ago
  • ERNIE 4.5 Turbo is Baidu’s high-throughput, cost-optimized variant of ERNIE 4.5. It delivers strong reasoning and coding with long-context options, tool/function calling, JSON outputs, and streaming—ready for production via ERNIE Bot and the Qianfan API.
    Text
    Released 6mo ago
  • ERNIE X1.1 is Baidu’s upgraded “deep-thinking” reasoning model, unveiled on Sept 9, 2025 at Wave Summit. Versus ERNIE X1, it boosts factuality (+34.8%), instruction following (+12.5%), and agentic skills (+9.6%). It’s available in ERNIE Bot/Wenxiaoyan and via the Qianfan API.
    Text
    Released 6mo ago
  • ERNIE 4.5-21B-A3B is Baidu’s efficient MoE variant of ERNIE 4.5—about 21B total parameters with ~3B active per token—built to balance strong reasoning and coding accuracy with low latency. It supports long context, tool/function calling, structured JSON output, and streaming via ERNIE Bot and the Qianfan API.
    Text
    Released 8mo ago