Gemini Embedding 2

Model family: Gemini

Gemini Embedding 2 (gemini-embedding-2-preview) produces embeddings for text, images, video, audio, and PDFs in one unified space, which supports cross-modal retrieval and classification. According to Google, the model accepts up to 8,192 input tokens of text, up to 6 images per request, up to 120 seconds of video, audio embedded natively without a transcription step, and PDFs of up to 6 pages. It also accepts interleaved multimodal inputs in a single request, and its output dimension can be scaled down from 3072 using Matryoshka Representation Learning (MRL).
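To illustrate what MRL-based dimension scaling means in practice, here is a minimal sketch of the usual client-side pattern for Matryoshka-style embeddings: keep the leading components of the vector and rescale the result to unit length. It does not call any Google API. The function name `truncate_and_normalize`, the target size of 768, and the synthetic input vector are all assumptions made for illustration, and whether the service returns reduced vectors already normalized is not stated in this listing.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale them to unit length.

    Hypothetical helper: MRL-trained models are designed so that a prefix
    of the full vector remains a usable embedding.
    """
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and len(embedding)")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


# Synthetic stand-in for a 3072-dimensional model output (not real data).
full = [math.sin(i + 1) for i in range(3072)]

# 768 is an arbitrary example size chosen for this sketch.
small = truncate_and_normalize(full, 768)
```

Renormalizing after truncation keeps dot products equal to cosine similarities, which matters if a downstream index assumes unit-length vectors.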

Overview

Gemini Embedding 2 is Google’s first natively multimodal embedding model. It maps text, images, video, audio, and documents into a single shared embedding space.
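A shared embedding space means that a query in one modality can be compared directly against items in another. The sketch below ranks a small mixed-media corpus against a text query by cosine similarity. It assumes the vectors have already been obtained from the model; the 3-dimensional vectors, file names, and the helpers `cosine` and `rank` are invented for illustration and are not model output or part of any Google API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank(query, corpus):
    """Return (item_id, score) pairs sorted by descending similarity."""
    scored = [(item_id, cosine(query, vec)) for item_id, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings of an image, an audio clip, and a PDF.
corpus = {
    "photo_of_dog.jpg": [0.9, 0.1, 0.0],
    "bark_recording.wav": [0.7, 0.3, 0.1],
    "tax_form.pdf": [0.0, 0.2, 0.9],
}

# Toy vector standing in for the embedding of a text query such as "a dog".
query = [1.0, 0.0, 0.0]

results = rank(query, corpus)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

With real embeddings the corpus would typically live in a vector index rather than a dictionary, but the comparison step is the same idea.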

Use cases: Information retrieval, Content categorization, Multimodal search, Conceptual search, Semantic content analysis

About Google

At Google, we think that AI can meaningfully improve people's lives and that the biggest impact will come when everyone can access it.

Industry: Research

Company Size: 182,000-190,000

Location: Mountain View, CA, US

Website: ai.google

Last updated: March 11, 2026
