Papers
-
SparK: Query-Aware Unstructured Sparsity with Recoverable KV Cache Channel Pruning
-
BenchPress: Rapid Benchmark Curation
-
LLM-based Relevance Assessment for Web-Scale Search Evaluation at Pinterest
-
Training-Free Group Relative Policy Optimization
-
LUT-LLM: Efficient Large Language Model Inference with Memory-based Computations on FPGAs
-
Efficient and Adaptable Overlapping for Computation and Communication via Signaling and Reordering
-
What Is Your Agent's GPA? A Framework for Evaluating Agent Goal-Plan-Action Alignment
-
Seeing, Listening, Remembering, and Reasoning: A Multimodal Agent with Long-Term Memory
-
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples
Anthropic / Alan Turing Institute, Oxford Applied and Theoretical Machine Learning, Swiss Federal Institute of Technology in Zurich, UK AI Security Institute, University of Oxford
-
WebAgent-R1: Training Web Agents via End-to-End Multi-Turn Reinforcement Learning
-
Moloch's Bargain: Emergent Misalignment When LLMs Compete for Audiences
Stanford University
-
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
-
Mind Your Tone: Investigating How Prompt Politeness Affects LLM Accuracy (short paper)
The Pennsylvania State University
-
Less is More: Recursive Reasoning with Tiny Networks
-
Tongyi DeepResearch Technical Report
-
TabArena: A Living Benchmark for Machine Learning on Tabular Data
-
IBM Granite 4.0: hyper-efficient, high performance hybrid models for enterprise
-
Diffusion Adversarial Post-Training for One-Step Video Generation
-
Pretraining Large Language Models with NVFP4
-
Training Agents Inside of Scalable World Models
-
APRIL: Active Partial Rollouts in Reinforcement Learning to Tame Long-tail Generation
-
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents
-
Inference-Time Scaling for Generalist Reward Modeling
-
MechStyle: Augmenting Generative AI with Mechanical Simulation to Create Stylized and Structurally Viable 3D Models
Google Research, Stability AI / Center for Bits and Atoms, MIT, Khoury College of Computer Sciences, Northeastern University, University of Washington
-
FS-DFM: Fast and Accurate Long Text Generation with Few-Step Diffusion Language Models
-
Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation
-
ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning
-
From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
-
EpiCache: Episodic KV Cache Management for Long Conversational Question Answering
-
"My Boyfriend is AI": A Computational Analysis of Human-AI Companionship in Reddit's AI Community
Massachusetts Institute of Technology
-
Teaching LLMs to Plan: Logical Chain-of-Thought Instruction Tuning for Symbolic Planning
-
Towards an AI-Augmented Textbook
-
Steering MoE LLMs via Expert (De)Activation
-
Robix: A Unified Model for Robot Interaction, Reasoning and Planning
-
AlphaEarth Foundations: An embedding field model for accurate and efficient global mapping from sparse label data
-
An AI System to Help Scientists Write Expert-Level Empirical Software
-
Why Language Models Hallucinate
-
Mooncake: A KVCache-centric Disaggregated Architecture for LLM Serving
-
GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
-
Measuring the environmental impact of delivering AI at Google Scale
-
3D-GENERALIST: Vision-Language-Action Models for Crafting 3D Worlds
-
X-MoE: Enabling Scalable Training for Emerging Mixture-of-Experts Architectures on HPC Platforms
-
NoProp: Training Neural Networks without Full Back-propagation or Full Forward-propagation
-
Matrix-3D: Omnidirectional Explorable 3D World Generation
-
Amazon Ads Multi-Touch Attribution
-
Scaling Laws for Native Multimodal Models
-
Devstral: Fine-tuning Language Models for Coding Agent Applications
-
Establishing Best Practices for Building Rigorous Agentic Benchmarks
-
No LLM Solved Yu Tsumura's 554th Problem
University of Cambridge, University of Oxford
