Penguin VL 2B

Penguin-VL pairs Qwen3 backbones with an LLM-based vision encoder, rather than a contrastive-pretrained one, to preserve fine-grained visual cues for dense captioning and complex vision-language reasoning. It is released as a 2B-scale variant with inference code and demos.

Overview

Penguin-VL-2B is a compact vision-language model that uses an LLM-based vision encoder to push efficiency limits in multimodal reasoning.

🔍Image interpretation 📷Image text extraction 🔍Image and document analysis 📜OCR

About Tencent

Tencent is a world-leading internet and technology company that develops innovative products and services to improve the quality of life of people around the world.

Industry: Technology, Information and Media

Company Size: 110,000

Location: Shenzhen, CN

Website: tencent.com

Tools using Penguin VL 2B

No tools found for this model yet.

Last updated: March 12, 2026
