
Seed Audio 1.0

Model family: Seed
Seed Audio 1.0 is ByteDance's unified audio generation model, launched June 23, 2026, at the Volcano Engine FORCE 2026 conference. Unlike traditional text-to-speech systems, it generates the full spectrum of audio from text prompts: natural human speech, original music compositions, foley sound effects, and environmental soundscapes. Key capabilities include zero-shot voice cloning from short reference clips, multi-character dialogue generation in a single pass with a distinct voice per speaker, simultaneous generation of voice, background music, and sound effects, and cross-lingual synthesis without fine-tuning. Each request can generate up to two minutes of audio, and the voice stays consistent across extensions. The model is accessible via the Volcano Engine API and through the Doubao consumer app. International developers can access it via BytePlus.
Released: June 23, 2026

Overview

Seed Audio 1.0 is ByteDance's universal audio generation model that creates voice, music, sound effects, and ambient soundscapes from text prompts. It supports zero-shot voice cloning from short audio references, multi-character dialogue generation in a single pass, and cross-lingual synthesis without fine-tuning. It is accessible via the Volcano Engine API.

About ByteDance

ByteDance is a multinational technology company known for its content platforms, including TikTok and Douyin.

Industry: Internet
Company Size: 150,000
Location: Beijing, CN
Last updated: June 24, 2026