PaddleOCR-VL

Model family: Paddle

PaddleOCR-VL pairs a high-quality OCR stack with a language backbone so it can look, read, and reason in one pass. You provide scanned pages, receipts, invoices, tables, dashboards, or UI screenshots along with a prompt. The model extracts text with layout awareness, links fields to their labels, interprets tables and charts, and returns grounded answers or schema-conforming JSON. It handles multi-page inputs, maintains references across images, and can point to regions as evidence when needed. For production use, it supports streaming, long context, and tool (function) calling for region crops and retrieval, and it integrates cleanly with PaddlePaddle-based workflows. Typical uses include document automation, key-value pair (KVP) extraction, invoice processing, chart and dashboard Q&A, screenshot understanding, and multimodal search where accuracy, speed, and easy integration matter.
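To make "schema-conforming JSON" with evidence regions more concrete, here is a minimal sketch of how a client might validate such a response before passing it downstream. This listing does not specify the response format, so the `fields`, `label`, `value`, `page`, and `box` keys, the `Field` type, and the `parse_fields` helper are all hypothetical and chosen only for illustration. The sketch uses the Python standard library and does not call PaddleOCR-VL itself.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical response shape (an assumption, not the model's documented output):
# {"fields": [{"label": str, "value": str, "page": int, "box": [x0, y0, x1, y1]}]}
REQUIRED_KEYS = {"label", "value", "page", "box"}


@dataclass
class Field:
    label: str                       # field name as printed on the document
    value: str                       # extracted text linked to that label
    page: int                        # 1-based page index for multi-page inputs
    box: Tuple[int, int, int, int]   # evidence region (x0, y0, x1, y1) in pixels


def parse_fields(raw_json: str) -> List[Field]:
    """Parse a hypothetical KVP-extraction response and reject malformed items."""
    payload = json.loads(raw_json)
    fields = []
    for item in payload["fields"]:
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"field is missing keys: {sorted(missing)}")
        x0, y0, x1, y1 = item["box"]
        # An evidence region must have positive width and height.
        if not (x0 < x1 and y0 < y1):
            raise ValueError(f"degenerate evidence box for {item['label']!r}")
        fields.append(
            Field(item["label"], item["value"], item["page"], (x0, y0, x1, y1))
        )
    return fields


# Made-up sample payload, used only to exercise the validator.
sample = (
    '{"fields": [{"label": "Invoice No", "value": "INV-001", '
    '"page": 1, "box": [40, 32, 210, 58]}]}'
)
parsed = parse_fields(sample)
```

Validating on the client side like this keeps a malformed or partial response from silently entering an automation pipeline. In practice, the real schema would come from the prompt or from the serving API's documentation.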

Overview

PaddleOCR-VL is a vision-language model built around PaddleOCR that reads documents, forms, tables, charts, and screenshots. It combines strong OCR with reasoning over layout and content, then answers in text or structured JSON for multimodal RAG and automation.

🏭Manufacturing 🎮Game creation

About Baidu

Baidu is a Chinese multinational technology company specializing in internet-related services, products, and artificial intelligence.

Industry: Internet

Company Size: 10001+

Location: Beijing, CN

Website: https://baidu.com

Tools using PaddleOCR-VL

No tools found for this model yet.

Last updated: February 25, 2026
