Papers
- Gemma 3 Technical Report
- Gemini Embedding: Generalizable Embeddings from Gemini
- EmbeddingGemma: Powerful and Lightweight Text Representations
- Titans: Learning to Memorize at Test Time
- VDB-GPDF: Online Gaussian Process Distance Field with VDB Structure
- OpenVLA: An Open-Source Vision-Language-Action Model
- Generative Image Dynamics
- Gemma: Open Models Based on Gemini Research and Technology
- VideoPoet: A Large Language Model for Zero-Shot Video Generation
- Solving Olympiad Geometry without Human Demonstrations (AlphaGeometry)
- PaLM 2 Technical Report
- RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
- Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
- Sequential Attention: Making AI Models Leaner and Faster Without Sacrificing Accuracy
- Generative Agents: Interactive Simulacra of Human Behavior
- ReAct: Synergizing Reasoning and Acting in Language Models
- PaLM: Scaling Language Modeling with Pathways
- AudioLM: A Language Modeling Approach to Audio Generation
- Do As I Can, Not As I Say: Grounding Language in Robotic Affordances
- Discovering faster matrix multiplication algorithms with reinforcement learning
- CLIP-CLOP: CLIP-Guided Collage and Photomontage
- Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
- Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding
- Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
- Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
- Training Compute-Optimal Large Language Models
- Competition-Level Code Generation with AlphaCode
- Improving language models by retrieving from trillions of tokens
- Highly accurate protein structure prediction with AlphaFold
- An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
- Denoising Diffusion Probabilistic Models
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
- Attention Is All You Need
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
- Concrete Problems in AI Safety
