FunAudio ASR

FunAudio-ASR is a large-scale ASR system and model family built for practical deployment rather than benchmark-only performance. Public model pages describe it as trained on tens of millions of hours of real speech data, with strong contextual understanding and adaptability to specialized domains such as education and finance. The current Fun-ASR model pages also describe low-latency real-time transcription and, for the newer multilingual Nano variant, coverage of up to 31 languages. The technical report frames the broader system as an LLM-based ASR stack optimized for streaming, noise robustness, code-switching, hotword customization, and reduced hallucination in real-world use.

Overview

FunAudio-ASR, also called Fun-ASR, is Tongyi Lab’s end-to-end speech recognition model family for production ASR. It is trained on a very large corpus of real speech, supports low-latency real-time transcription, and is designed for strong contextual understanding, handling of domain terminology, and multilingual recognition.

🗒Transcription 🎙️Voice recognition 🎤Voice notes transcription 📝Meeting transcription

About Alibaba

Chinese e-commerce and cloud leader behind Taobao, Tmall, and Alipay.

Industry: Retail

Company Size: 128197

Location: CN

Website: alibaba.com

Last updated: July 7, 2026
