Seed Audio 1.0

Model family: Seed

Seed Audio 1.0 is ByteDance's unified audio generation model, launched on June 23, 2026 at the Volcano Engine FORCE 2026 conference. Unlike traditional text-to-speech systems, it generates the full spectrum of audio from text prompts: natural human speech, original music compositions, foley sound effects, and environmental soundscapes. Key capabilities include zero-shot voice cloning from short reference clips, multi-character dialogue generation in a single pass with a distinct voice per speaker, simultaneous generation of voice, background music, and sound effects, and cross-lingual synthesis without fine-tuning. Each request can generate up to two minutes of audio, and the voice stays consistent across extensions. The model is accessible through the Volcano Engine API and the Doubao consumer app; international developers can access it via BytePlus.

Overview

Seed Audio 1.0 is ByteDance's universal audio generation model, which creates voice, music, sound effects, and ambient soundscapes from text prompts. It supports zero-shot voice cloning from short audio references, multi-character dialogue generation in a single pass, and cross-lingual synthesis without fine-tuning. It is accessible via the Volcano Engine API.

🔊 Advanced audio generation · 🔊 Text to speech · 🗣️ Voice cloning · 🎶 Music generation

About ByteDance

ByteDance is a multinational technology company known for its content platforms, including TikTok and Douyin.

Industry: Internet

Company Size: 150,000

Location: Beijing, CN

Website: bytedance.com

Last updated: July 7, 2026
