Papers
-
Discovering Multiagent Learning Algorithms with Large Language Models
-
The Geometry of Noise: Why Diffusion Models Don't Need Noise Conditioning
-
Unified Latents (UL): How to train your latents
-
On Surprising Effectiveness of Masking Updates in Adaptive Optimizers
-
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
-
Self-Consistency Improves Chain of Thought Reasoning in Language Models
-
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
-
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
-
TranslateGemma Technical Report
-
Reasoning Models Generate Societies of Thought
-
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
-
Prompt Repetition Improves Non-Reasoning LLMs
-
Towards a Science of Scaling Agent Systems
-
T5Gemma 2: Seeing, Reading, and Understanding Longer
-
Efficiently Reconstructing Dynamic Scenes One D4RT at a Time
-
SIMA 2: A Generalist Embodied Agent for Virtual Worlds
-
Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer
-
ATLAS: Practical Scaling Laws for Multilingual Models
-
Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving
-
Training Agents Inside of Scalable World Models
-
AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
-
An AI System to Help Scientists Write Expert-Level Empirical Software
-
Measuring the environmental impact of delivering AI at Google Scale
-
Scaling Data-Constrained Language Models
-
Non-preemptive Throughput Maximization under Time-varying Capacity
-
AlphaEvolve: A coding agent for scientific and algorithmic discovery
-
Gemini Robotics: Bringing AI into the Physical World
-
Lessons from Defending Gemini Against Indirect Prompt Injections
-
Migrating Code At Scale With LLMs At Google
-
Gemini: A Family of Highly Capable Multimodal Models
-
Gemma 3 Technical Report
-
Gemini Embedding: Generalizable Embeddings from Gemini
-
EmbeddingGemma: Powerful and Lightweight Text Representations
-
Titans: Learning to Memorize at Test Time
-
VDB-GPDF: Online Gaussian Process Distance Field with VDB Structure
-
OpenVLA: An Open-Source Vision-Language-Action Model
-
Generative Image Dynamics
-
Gemma: Open Models Based on Gemini Research and Technology
-
VideoPoet: A Large Language Model for Zero-Shot Video Generation
-
Solving Olympiad Geometry without Human Demonstrations (AlphaGeometry)
-
PaLM 2 Technical Report
-
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
-
Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
-
Sequential Attention: Making AI Models Leaner and Faster Without Sacrificing Accuracy
-
Generative Agents: Interactive Simulacra of Human Behavior
-
Synergizing Reasoning and Acting in Language Models
-
PaLM: Scaling Language Modeling with Pathways
-
AudioLM: A Language Modeling Approach to Audio Generation
-
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
-
Discovering faster matrix multiplication algorithms with reinforcement learning
