Papers
-
T5Gemma 2: Seeing, Reading, and Understanding Longer
-
Evaluating AI’s ability to perform scientific research tasks
-
Seedance 1.5 pro: A Native Audio-Visual Joint Generation Foundation Model
-
On Learning-Curve Monotonicity for Maximum Likelihood Estimators
-
Unsupervised decoding of encoded reasoning using language model interpretability
-
Training LLMs for Honesty via Confessions
-
Early Science Acceleration Experiments with GPT-5
-
Anthropic Economic Index report: Uneven geographic and enterprise AI adoption
-
Weight-Sparse Transformers Have Interpretable Circuits
-
SageServe: Optimizing LLM Serving on Cloud Data Centers with Forecast Aware Auto-Scaling
-
Steering Language Models with Weight Arithmetic (University of Copenhagen)
-
Charts Are Not Images: On the Challenges of Scientific Chart Editing
-
Signs of introspection in large language models
-
Gemini Robotics 1.5: Pushing the Frontier of Generalist Robots with Advanced Embodied Reasoning, Thinking, and Motion Transfer
-
Shielded Diffusion: Generating Novel and Diverse Images using Sparse Repellency
-
Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers and Gradient Clipping
-
Evaluating honesty and lie detection techniques on a diverse suite of dishonest models
-
ATLAS: Practical Scaling Laws for Multilingual Models
-
Neural Texture Splatting: Expressive 3D Gaussian Splatting for View Synthesis, Geometry, and Dynamic Reconstruction
-
MEF: A Systematic Evaluation Framework for Text-to-Image Models
-
Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets
-
Arctic-Extract Technical Report
-
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
-
Cortex AISQL: A Production SQL Engine for Unstructured Data
-
Agentic Misalignment: How LLMs Could Be Insider Threats (University College London)
-
Cortex: Workflow-Aware Resource Pooling and Scheduling for Agentic Serving
-
LLM-based Relevance Assessment for Web-Scale Search Evaluation at Pinterest
-
What Is Your Agent's GPA? A Framework for Evaluating Agent Goal-Plan-Action Alignment
-
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
-
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
-
Tongyi DeepResearch Technical Report
-
IBM Granite 4.0: hyper-efficient, high performance hybrid models for enterprise
-
Diffusion Adversarial Post-Training for One-Step Video Generation
-
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
-
Steering MoE LLMs via Expert (De)Activation
-
Robix: A Unified Model for Robot Interaction, Reasoning and Planning
-
AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
-
An AI System to Help Scientists Write Expert-Level Empirical Software
-
Measuring the environmental impact of delivering AI at Google Scale
-
3D-GENERALIST: Vision-Language-Action Models for Crafting 3D Worlds
-
Amazon Ads Multi-Touch Attribution
-
Devstral: Fine-tuning Language Models for Coding Agent Applications
-
Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference
-
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving
-
Scaling Data-Constrained Language Models
-
Seed LiveInterpret 2.0: End-to-end Simultaneous Speech-to-speech Translation with Your Voice
-
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
-
Voxtral
-
Apple Intelligence Foundation Language Models: Tech Report 2025
