Manzano
Overview
Manzano is Apple’s unified multimodal model for image understanding and text-to-image generation. Both tasks share a single hybrid vision tokenizer, and the model pairs one autoregressive LLM with a diffusion decoder. It is reported to reach state-of-the-art performance among unified models.
