Overview

Voicebox is an open-source voice cloning desktop application powered by Qwen3-TTS. It generates natural-sounding speech from text and replicates voices with high precision.

It is positioned as a local-first voice cloning studio that offers professional voice synthesis comparable to commercial software, with a focus on user privacy.

It requires no cloud services or subscriptions, so voice data stays on the user's machine and the application runs with native performance. With Voicebox, users can download voice models, clone voices, and generate speech entirely on a local machine.

The application is cross-platform, running on macOS, Windows, and Linux. Multi-sample support lets users combine several voice samples for higher-quality, more natural-sounding voice cloning.

For fast local inference, it uses Metal acceleration on Mac and CUDA acceleration on Windows/Linux.

Users can run GPU inference locally or connect to a remote machine. The application also includes a stories editor for creating multi-voice narratives in a timeline-based editor, where users can arrange tracks, trim clips, and mix conversations.

It also features an audio transcription system powered by Whisper for accurate speech-to-text, which allows reference text to be extracted automatically from voice samples.

Releases


Initial release

February 4, 2026

Initial release of Voicebox.


Pricing

Paid options from: N/A


Top alternatives

KreadoAI v4.0

Create AI-powered multilingual videos with digital avatars

www.kreadoai.com


🇨🇳 China
Released 11mo ago
Free + from $12/mo

Fineshare

An online all-in-one AI voice generator for everyone


🇭🇰 Hong Kong
Released 3y ago
Free + from $6.99/mo

Instant Singer

Clone your voice and sing any song instantly.


Released 2y ago
Free + from $1.99

All Voice Lab

Reshape audio workflows with AI-powered voice solutions.

www.allvoicelab.com


Released 11mo ago
No pricing

Descript - AI Speech

Create natural-sounding TTS with your voice or stock voices.

www.descript.com

🇺🇸 United States
Released 6y ago
Free + from $16/mo

Myvocal

Clone your voice for singing and speaking in 60 seconds.


🇸🇬 Singapore
Released 2y ago
Free + from $10/mo


Reviews

1.0 average from 1 rating.

5 stars: 0
4 stars: 0
3 stars: 0
2 stars: 0
1 star: 1



Pros and Cons

Pros

Open-source application

Professional voice synthesis

User privacy focused

No cloud services required

No subscriptions needed

Allows download of voice models

Supports voice cloning

Generates speech on local machine

Cross-platform application

Optimized for macOS, Windows, Linux

Multi-sample support

Utilizes Metal acceleration on Mac

Uses CUDA acceleration on Windows/Linux

Offers local GPU inference

Connect to remote machine

Features a stories editor

Timeline-based editor for multi-voice narratives

Audio transcription system

Automatic reference text extraction

Smart caching

No Python installation required

Create natural-sounding speech

Can clone any voice

Personal voice data protection

Studio-grade editing tools

Fast local inference

Nearly perfect voice replication

One-click server setup

Combine multiple voice samples


Cons

Requires GPU for optimal performance

UI possibly overwhelming

Local machine can limit performance

Remote machine setup required

Dependent on Qwen3-TTS

Whisper needed for transcriptions

Metal acceleration only on Mac

CUDA acceleration required on Windows/Linux

Sizeable download for voice models

Possible privacy concerns with cloning voices


Q&A

What is Voicebox and how does it work?

Voicebox is an open-source voice cloning desktop application powered by Qwen3-TTS. It produces natural-sounding speech from text and replicates a given voice with high precision. Designed as a local-first voice cloning studio, it aims for professional voice synthesis quality comparable to commercial alternatives. Voice cloning and speech generation run entirely on the local machine, with no cloud services or subscriptions. Voicebox also includes a stories editor for creating multi-voice narratives and an audio transcription system powered by Whisper for accurate speech-to-text.

Is Voicebox free to use since it is open-source?

Yes. Voicebox is open source, and neither using it nor accessing its source code requires a fee or subscription.

How does Voicebox guarantee user privacy?

Voicebox protects user privacy by operating on a local-first basis. Your voice data is neither sent to nor stored on remote servers, because all operations, including voice cloning and speech generation, run on your local machine without cloud services or subscriptions.

What operating systems is Voicebox compatible with?

Voicebox is designed to be cross-platform, compatible with macOS, Windows, and Linux.

How is the quality of voice cloning ensured in Voicebox?

Voice cloning quality in Voicebox rests on two things: multi-sample support and the Qwen3-TTS model. Combining multiple voice samples improves the chances of a higher-quality, more natural-sounding result, while Qwen3-TTS provides the underlying voice quality and accuracy.

What is the role of Qwen3-TTS in Voicebox?

Qwen3-TTS is the core engine of Voicebox. It performs the voice cloning and is responsible for the quality and accuracy of voice replication.


How are local inference operations accelerated using Metal and CUDA in Voicebox?

Voicebox accelerates local inference on the GPU, using Metal on Mac devices and CUDA on Windows/Linux systems. Both are GPU acceleration technologies that speed up voice cloning and speech synthesis.
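The platform split described above can be sketched as a small selection function. This is an illustration only, not Voicebox's actual code: the function name `pick_backend`, its `has_cuda` and `has_metal` flags, and the CPU fallback are all assumptions made for the example.

```python
import platform


def pick_backend(system: str, has_cuda: bool = False, has_metal: bool = False) -> str:
    """Return "metal", "cuda", or "cpu" for an OS name and detected hardware.

    Hypothetical helper: `system` uses the names returned by platform.system()
    ("Darwin", "Windows", "Linux"). The CPU fallback is an assumption.
    """
    if system == "Darwin" and has_metal:
        return "metal"  # Metal acceleration on Mac
    if system in ("Windows", "Linux") and has_cuda:
        return "cuda"   # CUDA acceleration on Windows/Linux
    return "cpu"        # assumed fallback when no supported GPU is found


if __name__ == "__main__":
    # With no hardware flags set, this reports the assumed CPU fallback.
    print(pick_backend(platform.system()))
```

Passing the operating system name in as a parameter keeps the choice easy to check on any machine, regardless of the hardware actually present.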

How can I clone voices using Voicebox?

To clone a voice, first download a voice model. Then use a few seconds of audio of the target voice to clone it, and build multi-voice projects with the studio-grade editing tools in the application.

Can Voicebox be used without a cloud service or subscription?

Yes. Voicebox needs no cloud service or subscription. All voice cloning and speech generation run entirely on your local machine.

Can Voicebox run GPU inference locally and how does it connect to a remote machine?

Yes. Voicebox runs GPU inference locally, using Metal acceleration on Mac and CUDA acceleration on Windows/Linux. It can also connect to a remote machine and run GPU inference there.

What features does the Stories Editor in Voicebox provide?

The Stories Editor is a timeline-based editor for crafting multi-voice narratives. It lets you arrange tracks, trim clips, and mix conversations.

What roles does audio transcription and Whisper play in Voicebox?

Audio transcription in Voicebox is powered by Whisper, which provides accurate speech-to-text. Voicebox uses it to extract reference text automatically from voice samples.

How do I generate natural-sounding speech from text using Voicebox?

To generate speech from text, first download a voice model, then enter your text in the app. Qwen3-TTS then converts the text into natural-sounding speech in the cloned voice.

How does Voicebox allow for extraction of reference text from voice samples?

Voicebox extracts reference text through its audio transcription system. Powered by Whisper's speech-to-text capabilities, the system transcribes a voice sample and returns the transcript as the reference text.
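A minimal sketch of that flow, assuming a transcriber that returns a dictionary with a "text" key (the shape returned by `model.transcribe(path)` in the open-source `whisper` Python package). How Voicebox calls Whisper internally is not described here, so `extract_reference_text` and `fake_transcribe` are hypothetical names, and the stub stands in for a real model so the example runs without any downloads.

```python
from typing import Callable, Dict


def extract_reference_text(
    audio_path: str,
    transcribe: Callable[[str], Dict[str, str]],
) -> str:
    """Transcribe a voice sample and return a cleaned reference text.

    `transcribe` is any callable that takes a file path and returns a dict
    with a "text" key. Collapsing whitespace is an assumption of this sketch.
    """
    result = transcribe(audio_path)
    return " ".join(result["text"].split())


def fake_transcribe(path: str) -> Dict[str, str]:
    """Stand-in transcriber so the example runs without a speech model."""
    return {"text": "  Hello   there,\nthis is my voice. "}


if __name__ == "__main__":
    print(extract_reference_text("sample.wav", fake_transcribe))
```

Taking the transcriber as a parameter keeps the text cleanup separate from the speech model, so either part can be swapped or tested on its own.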

How can I arrange tracks, trim clips, and mix conversations in Voicebox?

Use the Stories Editor. Its timeline-based editor lets you arrange tracks, trim clips, and mix conversations to organize multi-voice narratives as needed.
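To make the timeline operations concrete, here is a small sketch of how clips on a timeline might be modeled. It is not taken from Voicebox: the `Clip` class, its fields, and the `trim` and `arrange` helpers are hypothetical and only illustrate what trimming and arranging mean on a timeline.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Clip:
    """One piece of generated speech placed on the timeline (hypothetical model)."""
    voice: str
    start: float     # position on the timeline, in seconds
    duration: float  # clip length, in seconds


def trim(clip: Clip, head: float = 0.0, tail: float = 0.0) -> Clip:
    """Cut `head` seconds from the front and `tail` seconds from the end."""
    new_duration = clip.duration - head - tail
    if head < 0 or tail < 0 or new_duration <= 0:
        raise ValueError("trim must leave a positive duration")
    # Trimming the front also moves the clip's start later on the timeline.
    return replace(clip, start=clip.start + head, duration=new_duration)


def arrange(clips: List[Clip]) -> List[Clip]:
    """Return the clips ordered by their position on the timeline."""
    return sorted(clips, key=lambda c: c.start)
```

For example, `trim(Clip("alice", 2.0, 5.0), head=1.0, tail=0.5)` yields a clip that starts at 3.0 seconds and lasts 3.5 seconds.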

What is the functionality of the multi-sample support feature in Voicebox?

Multi-sample support lets you combine multiple voice samples in Voicebox. This improves the quality of the voice replication and makes it sound more natural.

How does the smart caching work in Voicebox?

No information about how smart caching works in Voicebox has been provided.

What does it mean when Voicebox operates on a local-first basis and how does it differ from other options?

Local-first means that all operations, including voice cloning and speech generation, take place on the user's own machine. Many other services rely heavily on the cloud and require data to be sent to and stored on remote servers, which can compromise user privacy.

Do I need to install Python to use Voicebox?

No. Voicebox does not require a Python installation. It offers native performance on macOS, Windows, and Linux without additional installations.

How can I download voice models with Voicebox?

Voice models are downloaded from within the application, which provides an option in its interface to download and use them.
