ElevenLabs Scribe v2
Version update: v2, Jan 9, 2026
Live real-time speech-to-text model built for streaming transcription, with very low latency (~150 ms) for live voice interactions.
Ultra-low latency performance — instant speech transcription ideal for conversational AI, voice agents, meetings, and live captioning. 
High accuracy across many languages — supports 90+ languages with strong real-world performance and benchmark scores. 
Predictive streaming (“negative latency”) — anticipates next words and punctuation to reduce delays. 
Automatic language detection — the model detects and switches languages mid-conversation. 
Advanced streaming controls — includes manual commit control, text conditioning, and voice activity detection (VAD). 
Broad audio format support — works with PCM (8–48 kHz) and μ-law audio for compatibility across use cases.
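To make the μ-law format mentioned above concrete, here is a minimal Python sketch that converts 16-bit linear PCM to 8-bit μ-law and back, following the widely used G.711 reference approach. It is an illustration only: the function names are hypothetical, nothing here is taken from ElevenLabs code, and the listing does not say whether Scribe expects exactly this byte layout.

```python
import struct

BIAS = 0x84   # 132, added before the segment search (G.711 convention)
CLIP = 32635  # largest magnitude that still fits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Encode one signed 16-bit PCM sample as an 8-bit mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7..0.
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(byte: int) -> int:
    """Decode one 8-bit mu-law byte to a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def pcm16le_to_ulaw(pcm: bytes) -> bytes:
    """Convert little-endian 16-bit PCM bytes to mu-law bytes (half the size)."""
    count = len(pcm) // 2
    samples = struct.unpack("<%dh" % count, pcm[: count * 2])
    return bytes(linear_to_ulaw(s) for s in samples)
```

The encoding is lossy: each 16-bit sample becomes one byte, which halves bandwidth at the cost of coarser resolution for loud samples. That trade-off is why μ-law is common in telephony audio.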
Overview
ElevenLabs Speech to Text converts speech into text with high accuracy across many contexts and languages.
It comes in two versions: Scribe v2 and Scribe v2 Realtime. Scribe v2 transcribes recorded audio and video into text, which suits captions, subtitles, and editable transcripts.
It can transcribe specific words accurately based on context, mark sound events in the transcript, and distinguish and label each speaker in a dialogue.
Scribe v2 Realtime is designed for real-time applications such as live calls, meetings, and AI agents that need immediate transcription.
It uses a streaming-first architecture to deliver results in real time while maintaining accuracy, and it includes voice activity detection and precise speech segmentation for smoother live processing.
Both versions support over 90 languages and can be integrated into your products through the ElevenLabs API.
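The voice activity detection and speech segmentation described for Scribe v2 Realtime can be illustrated with a simple energy-based sketch in Python. This is not how Scribe detects speech, since the listing gives no implementation details. It is a generic technique under stated assumptions: frames are short lists of 16-bit samples, and the names `segment_by_silence`, `threshold`, and `hangover_frames` are hypothetical. Each yielded segment marks a point where a client might commit the buffered audio for transcription.

```python
import math
from typing import Iterable, Iterator, List, Sequence

Frame = Sequence[int]


def rms(frame: Frame) -> float:
    """Root-mean-square energy of one frame of PCM samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def segment_by_silence(
    frames: Iterable[Frame],
    threshold: float = 500.0,   # assumed energy cutoff between speech and silence
    hangover_frames: int = 3,   # assumed number of quiet frames that ends a segment
) -> Iterator[List[Frame]]:
    """Group frames into speech segments separated by sustained silence."""
    segment: List[Frame] = []
    pending_silence: List[Frame] = []
    for frame in frames:
        if rms(frame) >= threshold:
            # Speech resumed: keep any short pause as part of the utterance.
            segment.extend(pending_silence)
            pending_silence = []
            segment.append(frame)
        elif segment:
            pending_silence.append(frame)
            if len(pending_silence) >= hangover_frames:
                # Silence lasted long enough: emit ("commit") the segment.
                yield segment
                segment, pending_silence = [], []
    if segment:
        yield segment  # flush whatever speech is left at end of stream
```

A fixed energy threshold is the simplest possible detector and will misfire on noisy input, so treat this as a way to picture segmentation rather than as a production design.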


