Seed 1.5 VL

Model family: Seed
Seed1.5-VL is ByteDance's flagship vision-language model, pairing a 532M-parameter SeedViT vision encoder with a mixture-of-experts (MoE) LLM that has 20B active parameters. It handles images and videos of arbitrary aspect ratio, performs fine-grained grounding, OCR, and visual puzzle solving, and powers GUI agents for control and gameplay, while matching top VLMs at much lower compute.
New Multimodal Gen 3
Released: May 13, 2025

Overview

Seed1.5-VL is a compact vision-language foundation model from ByteDance Seed. It combines a 532M-parameter vision encoder with an MoE LLM that has 20B active parameters to deliver strong image, video, and GUI understanding and multimodal reasoning, with many state-of-the-art results at low inference cost.

About ByteDance

ByteDance is a multinational technology company known for its content platforms, including TikTok and Douyin.

Industry: Internet
Company Size: 10001+
Location: Beijing, CN

Tools using Seed 1.5 VL

Last updated: February 25, 2026