OpenMOSS
Models
- MOSS-TTS is OpenMOSS and MOSI.AI’s open-source speech and sound generation family for voice cloning, dialogue speech, voice design, real-time TTS, and sound effects.
  New · Multimodal · Released 24d ago
- MOSS-Audio is an open-source unified audio understanding model family from MOSI.AI, OpenMOSS, and the Shanghai Innovation Institute. It is built to handle speech, environmental sound, music, captioning, time-aware QA, and complex audio reasoning in one system, with 4B and 8B Instruct and Thinking variants.
  New · Multimodal · Released 2mo ago
- MOSS-TTS-Nano is an open-source, multilingual, tiny speech generation model from MOSI.AI and OpenMOSS. With only 0.1B parameters, it is built for real-time TTS, can run directly on CPU without a GPU, and keeps deployment simple enough for local demos, web serving, and lightweight product integration.
  New · Multimodal · Released 2mo ago
- MOSS-TTS-Local-Transformer-v1.5 is a 5B-parameter text-to-speech model supporting 31 languages with zero-shot voice cloning, long-form speech generation, token-level duration control, Pinyin/IPA pronunciation control, code-switching, and 48 kHz stereo audio output via MOSS-Audio-Tokenizer-v2.
  Audio · Released 3mo ago
- An open-source foundation model that jointly generates video and audio in one pass, achieving tightly synchronized lip movements and environment-aware sound effects.
  Video · Released 4mo ago
