Papers
-
Voxtral Realtime
-
Accelerating Mathematical and Scientific Discovery with Gemini Deep Think
-
OmniSapiens: A Foundation Model for Social Behavior Processing via HARPO (MIT, National University of Singapore)
-
GameDevBench: Evaluating Agentic Capabilities Through Game Development. Wayne Chi, Yixiong Fang, Arnav Yayavaram, Siddharth Yayavaram, Seth Karten (Carnegie Mellon University, Princeton University)
-
ArtifactLens: Hundreds of Labels Are Enough for Artifact Detection with VLMs
-
EchoJEPA: A Latent Predictive Foundation Model for Echocardiography
-
CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning
-
Autoregressive Image Generation with Masked Bit Modeling
-
iGRPO: Self-Feedback–Driven LLM Reasoning
-
ML-DCN: Masked Low-Rank Deep Crossing Network Towards Scalable Ads Click-through Rate Prediction at Pinterest
-
Agentic LLMs as Powerful Deanonymizers: Re-identification of Participants in the Anthropic Interviewer Dataset
-
HER: Human-like Reasoning and Reinforcement Learning for LLM Role-playing
-
Intelligence Explosion
-
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
-
Can Post-Training Transform LLMs into Causal Reasoners? (Fudan University, Shanghai Artificial Intelligence Laboratory)
-
Learning a Generative Meta-Model of LLM Activations (UC Berkeley)
-
Self-Consistency Improves Chain of Thought Reasoning in Language Models
-
Vector Quantization using Gaussian Variational Autoencoder
-
See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning
-
Using a GPT-5-driven autonomous lab to optimize the cost and titer of cell-free protein synthesis
-
Knowledge-Intensive Agents (Northeastern University, China)
-
Learning to Reason in 13 Parameters
-
LIVE: Long-horizon Interactive Video World Modeling
-
AgentArk: Distilling Multi-Agent Intelligence into a Single LLM Agent
-
Generative Engine Optimization: A VLM and Agent Framework for Pinterest Acquisition Growth
-
Accelerating Scientific Research with Gemini: Case Studies and Common Techniques
-
Closing the Loop: Universal Repository Representation with RPG-Encoder
-
Understanding Agent Scaling in LLM-Based Multi-Agent Systems via Diversity
-
AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations
-
Agent Primitives: Reusable Latent Building Blocks for Multi-Agent Systems
-
Generative AI for Enzyme Design and Biocatalysis
-
SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations
-
An Empirical Study on Noisy Data and LLM Pretraining Loss Divergence
-
Interpretable Tabular Foundation Models via In-Context Kernel Regression
-
RFS: Reinforcement Learning with Residual Flow Steering for Dexterous Manipulation
-
CUA-Skill: Develop Skills for Computer Using Agent
-
ReasonCACHE: Teaching LLMs To Reason Without Weight Updates
-
SimMerge: Learning to Select Merge Operators from Similarity Signals
-
Argument Rarity-based Originality Assessment for AI-Assisted Writing (Ritsumeikan Global Innovation Research Organization)
-
AgentRx: Diagnosing AI Agent Failures from Execution Trajectories
-
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
-
What Does Vision Tool-Use Reinforcement Learning Really Learn? Disentangling Tool-Induced and Intrinsic Effects for Crop-and-Zoom
-
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Complex Real-World Tasks
-
Post-LayerNorm Is Back: Stable, ExpressivE, and Deep
-
MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers
-
LLM-42: Enabling Determinism in LLM Inference with Verified Speculation
-
The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?
-
Lost in Transmission: When and Why LLMs Fail to Reason Globally
-
WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models
-
Efficient Autoregressive Video Diffusion with Dummy Head
