MiniMax VL 01
Overview
MiniMax-VL-01 is a vision-language model that reads images and text together. It handles OCR, charts, screenshots, and real-world photos, then answers in natural text or structured JSON. It supports long context, function calling, and streaming for multimodal RAG and assistants.
About MiniMax
MiniMax is a Chinese AI company (Shanghai) focused on developing multimodal foundation models across text, image, audio, video, and music.
View Company Profile