Moondream
Models
- Photon is Moondream’s real-time vision-language model aimed at production video and image analysis. It is designed to deliver VLM-style visual reasoning fast enough for live use cases such as manufacturing inspection, broadcast moderation, retail monitoring, and security feeds.
  New | Multimodal | Released 2mo ago
- Moondream 3 Preview is a compact vision-language model built for fast visual reasoning, grounding, OCR, object detection, pointing, and structured output. It uses a 9B-parameter mixture-of-experts (MoE) architecture with 2B active parameters and extends context length to 32K, aiming for strong real-world vision performance while staying efficient and inexpensive to run.
  Multimodal | Released 8mo ago
- Moondream 2 is a small open vision-language model designed to run efficiently across many environments. It is the previous-generation Moondream model, released under Apache 2.0, and is positioned as a lightweight image-text model for practical multimodal use where small size and ease of deployment matter more than frontier scale.
  Multimodal | Released 11mo ago
- Moondream 0.5B is a tiny open-source vision-language model built for edge devices and mobile platforms. With only 0.5B parameters, it is positioned as the world’s smallest VLM, designed for fast lightweight deployment on constrained hardware while still supporting practical real-world visual tasks.
  Image | Released 1y ago
- Small, efficient open-source vision-language model designed to run broadly on many devices.
  Multimodal | Released 1y ago
