PaliGemma
Overview
PaliGemma is Google’s open vision-language model that accepts images plus text and outputs text for captioning, visual question answering, OCR-style tasks, and detection.
About Google DeepMind
AI research lab within Google developing frontier models, scientific AI systems, reinforcement learning methods, and products that support Google services and research.
