BGE VL

BGE VL is a multimodal retrieval series trained on MegaPairs, not a plain text embedding line. The docs list two lightweight models for image-text embedding, bge-vl-base at 150M parameters and bge-vl-large at 428M, plus two 7.57B MLLM variants, bge-vl-MLLM-S1 and bge-vl-MLLM-S2, for stronger composed image retrieval. BAAI says the base and large models are built on CLIP backbones, while the MLLM versions target higher-end retrieval performance.
Multimodal Gen 3
Released: March 6, 2025

Overview

BGE VL is BAAI’s multimodal retrieval family for image-text search. It includes lightweight CLIP-based models and larger MLLM-based variants, ranging from 150M to 7.57B parameters, and is built for matching images with text or mixed image-text queries.
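
As a rough illustration of what "matching images with text" means in an embedding-based retriever, the sketch below ranks candidate images by cosine similarity to a query vector. It does not call BGE VL or any BAAI API: the three-dimensional vectors, the file names, and the helper functions cosine and rank are all hypothetical stand-ins for embeddings that a real model would produce.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query_vec, candidates):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine(query_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy embeddings (hypothetical). A real system would obtain these from the
# model's text and image encoders, with far more dimensions.
query = [0.9, 0.1, 0.0]
images = {
    "cat.jpg": [0.8, 0.2, 0.1],
    "car.jpg": [0.0, 0.1, 0.9],
    "dog.jpg": [0.5, 0.5, 0.2],
}

ranking = rank(query, images)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these toy vectors, cat.jpg ranks first and car.jpg last. A mixed image-text query would be handled the same way once the model has encoded it into a single query vector.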

About Beijing Academy of Artificial Intelligence (BAAI)

Beijing Academy of Artificial Intelligence (BAAI), also known as Zhiyuan Institute, is a Chinese non-profit AI research laboratory established in November 2018. BAAI conducts fundamental AI research, develops open-source models (including WuDao, BGE embeddings, Emu, and RoboBrain), and fosters collaboration between academia and industry. It is known for creating the BGE (BAAI General Embedding) series of embedding models, which are used in RAG systems worldwide.

Industry: Research
Company Size: 260
Location: Beijing, Beijing, CN
Website: baai.ac.cn

Tools using BGE VL

No tools found for this model yet.

Last updated: April 16, 2026