Papers
-
Agentic Robot Policy Self-Improvement in the Real World
-
Data-Driven Integration Kernels for Interpretable Nonlocal Operator Learning
NVIDIA / Boston University, Lamont-Doherty Earth Observatory, New York University, University of California, University of Lausanne
-
GIIM: Graph-based Learning of Inter- and Intra-view Dependencies for Multi-view Medical Image Diagnosis
-
A Survey of Weight Space Learning: Understanding, Representation, and Generation
NVIDIA / Technion – Israel Institute of Technology, University of California, University of Notre Dame, University of St. Gallen, University of Surrey, University of Virginia
-
ReCoSplat: Autoregressive Feed-Forward Gaussian Splatting Using Render-and-Compare
NVIDIA / Hong Kong University of Science and Technology, Shanghai Jiao Tong University, Swiss Federal Institute of Technology in Zurich, University of California, Merced
-
MM-Zero: Self-Evolving Multi-Model Vision Language Models From Zero Data
-
Retrieval-Augmented Gaussian Avatars: Improving Expression Generalization
NVIDIA / Bar-Ilan University, OriginAI, Technion – Israel Institute of Technology, The Hebrew University of Jerusalem
-
Scalable Training of Mixture-of-Experts Models with Megatron Core
-
Latent Transfer Attack: Adversarial Examples via Generative Latent Spaces
-
AI+HW 2035: Shaping the Next Decade
NVIDIA, Google, AMD, IBM, Together AI, OpenAI, SEMRON, EnCharge AI, SambaNova, SK Hynix, Oracle / Agentrys, Brown University, California Institute of Technology, Carnegie Mellon University, Hewlett Packard Labs, New York University, Princeton University, Stanford University, University at Buffalo, University of California, University of Illinois Urbana-Champaign, University of Pennsylvania, University of Texas
-
Progressive Residual Warmup for Language Model Pretraining
NVIDIA / Hong Kong University of Science and Technology, The University of Hong Kong, University of Surrey
-
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling
-
Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline
NVIDIA / Nanjing University, Shanghai Jiao Tong University, The University of Tokyo, Zhejiang University
-
RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots
-
ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning
-
V1: Unifying Generation and Self-Verification for Parallel Reasoners
-
ROBOMETER: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons
-
CuTe Layout Representation and Algebra
-
Mode Seeking meets Mean Seeking for Fast Long Video Generation
-
VGG-T3: Offline Feed-Forward 3D Reconstruction at Scale
-
Test-Time Training with KV Binding Is Secretly Linear Attention
-
Toward the Thermodynamic Limit: Neural Operators for Non-equilibrium Dynamics of Mott Insulators
-
El Agente Gráfico: Structured Execution Graphs for Scientific Agents
-
EgoScale: Scaling Dexterous Manipulation with Diverse Egocentric Human Data
-
World Action Models are Zero-shot Policies
-
Isaac Lab: A GPU-Accelerated Simulation Framework for Multi-Modal Robot Learning
NVIDIA / Georgia Institute of Technology, Massachusetts Institute of Technology, Robotics and AI Institute, Swiss Federal Institute of Technology in Zurich, University of California, University of Southern California, University of Texas, University of Toronto
-
iGRPO: Self-Feedback–Driven LLM Reasoning
-
DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos
NVIDIA / Hong Kong University of Science and Technology, Korea Advanced Institute of Science & Technology, Stanford University, University of California, University of Toronto, University of Washington
-
Learning to Discover at Test Time
-
Who Said Neural Networks Aren't Linear?
-
Pretraining Large Language Models with NVFP4
-
3D-GENERALIST: Vision-Language-Action Models for Crafting 3D Worlds
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
-
Describe Anything: Detailed Localized Image and Video Captioning
-
One-Minute Video Generation with Test-Time Training
-
Data Scaling Laws for End-to-End Autonomous Driving
-
Cosmos World Foundation Model Platform for Physical AI
-
NVLM: Open Frontier-Class Multimodal LLMs
-
Nemotron-4-340B-Instruct
-
ChipNeMo: Domain-Adapted LLMs for Chip Design
-
Eureka: Human-Level Reward Design via Coding Large Language Models
-
SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF
-
Neuralangelo: High-Fidelity Neural Surface Reconstruction
-
Voyager: An Open-Ended Embodied Agent with Large Language Models
-
Reflexion: Language Agents with Verbal Reinforcement Learning
