Papers
-
Toward Autonomous Long-Horizon Engineering for ML Research
-
Nucleus-Image: Sparse MoE for Image Generation
-
Lyra 2.0: Explorable Generative 3D Worlds
-
TIPSv2: Advancing Vision-Language Pretraining with Enhanced Patch-Text Alignment
-
General365: Benchmarking General Reasoning in Large Language Models Across Diverse and Challenging Tasks
-
A Mechanistic Analysis of Looped Reasoning Language Models
-
Solving Physics Olympiad via Reinforcement Learning on Physics Simulators
-
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time
-
OmniShow: Unifying Multimodal Conditions for Human-Object Interaction Video Generation
-
GenTac: Generative Modeling and Forecasting of Soccer Tactics
-
Steered LLM Activations are Non-Surjective
-
ELT: Elastic Looped Transformers for Visual Generation
-
Strips as Tokens: Artist Mesh Generation with Native UV Segmentation
-
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory
-
Scaffolding Human-AI Collaboration: A Field Experiment on Behavioral Protocols and Cognitive Reframing
-
Accelerating Transformer-Based Monocular SLAM via Geometric Utility Scoring
-
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver
-
Loop, Think, & Generalize: Implicit Reasoning in Recurrent-Depth Transformers
-
Memento: Teaching LLMs to Manage Their Own Context
-
In-Place Test-Time Training
-
Vero: An Open RL Recipe for General Visual Reasoning (Princeton University)
-
OmniSonic: Towards Universal and Holistic Audio Generation from Video and Text
-
Learning, Potential, and Retention: An Approach for Evaluating Adaptive AI-Enabled Medical Devices
-
AI Trust OS -- A Continuous Governance Framework for Autonomous AI Observability and Zero-Trust Compliance in Enterprise Environments
-
Synthetic Sandbox for Training Machine Learning Engineering Agents
-
AURA: Always-On Understanding and Real-Time Assistance via Video Streams
-
InfoSeeker: A Scalable Hierarchical Parallel Agent Framework for Web Information Seeking
-
Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?
-
Epicure: Multidimensional Flavor Structure in Food Ingredient Embeddings
-
Ouroboros: Dynamic Weight Generation for Recursive Transformers via Input-Conditioned LoRA Modulation
-
Mapping generative AI use in the human brain: divergent neural, academic, and mental health profiles of functional versus socio emotional AI use
-
Woosh: A Sound Effects Foundation Model
-
CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery
-
A Simple Baseline for Streaming Video Understanding
-
SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization
-
VOID: Video Object and Interaction Deletion
-
Attention to Mamba: A Recipe for Cross-Architecture Distillation
-
Experience as a Compass: Multi-agent RAG with Evolving Orchestration and Agent Prompts
-
Signals: Trajectory Sampling and Triage for Agentic Interactions
-
Embarrassingly Simple Self-Distillation Improves Code Generation
-
Detecting Multi-Agent Collusion Through Multi-Agent Interpretability
-
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models
-
OmniMem: Autoresearch-Guided Discovery of Lifelong Multimodal Agent Memory
-
ASI-Evolve: AI Accelerates AI
-
The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning
-
GEMS: Agent-Native Multimodal Generation with Memory and Skills
-
Towards Computational Social Dynamics of Semi-Autonomous AI Agents
-
Meta-Harness: End-to-End Optimization of Model Harnesses
-
HandX: Scaling Bimanual Motion and Interaction Generation
-
The Ultimate Tutorial for AI-driven Scale Development in Generative Psychometrics: Releasing AIGENIE from its Bottle
