Papers

MultiGen: Level-Design for Editable Multiplayer Worlds in Diffusion Game Engines

Published on: 2026-03-03 6 authors
EvoSkill: Automated Skill Discovery for Multi-Agent Systems

Sentient / Virginia Tech

Published on: 2026-03-03 5 authors
Speculative Speculative Decoding

Together AI / Stanford University

Published on: 2026-03-03 1 author
OneRanker: Unified Generation and Ranking with One Model in Industrial Advertising Recommendation

Tencent

Published on: 2026-03-03 1 author
UniG2U-Bench: Do Unified Models Advance Multimodal Understanding?

Microsoft / Shanghai Jiao Tong University

Published on: 2026-03-03 1 author
NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing

Tencent / Nanjing University, The University of Hong Kong, University of Chinese Academy of Sciences

Published on: 2026-03-03 1 author
Kling-MotionControl Technical Report

Kuaishou Technology

Published on: 2026-03-03 1 author
Architecting Trust in Artificial Epistemic Agents

Google

Published on: 2026-03-03 1 author
Utonia: Toward One Encoder for All Point Clouds

Xiaomi / The University of Hong Kong

Published on: 2026-03-03 1 author
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory

Google / University of California

Published on: 2026-03-03 1 author
Beyond Pixel Histories: World Models with Persistent 3D State

Microsoft / University of Edinburgh

Published on: 2026-03-03 1 author
Heterogeneous Agent Collaborative Reinforcement Learning

ByteDance / Beihang University, Peking University, Tsinghua University

Published on: 2026-03-03 1 author
Beyond Language Modeling: An Exploration of Multimodal Pretraining

Meta Platforms / New York University

Published on: 2026-03-03 1 author
Modular Memory is the Key to Continual Learning Agents

Microsoft / University of Bremen

Published on: 2026-03-02 1 author
Expanding LLM Agent Boundaries with Strategy-Guided Exploration

Apple

Published on: 2026-03-02 1 author
ROBOMETER: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons

NVIDIA / University of Southern California

Published on: 2026-03-02 2 authors
LaST-VLA: Thinking in Latent Spatio-Temporal Space for Vision-Language-Action in Autonomous Driving

Xiaomi / Tsinghua University, University of Macao

Published on: 2026-03-02 1 author
Agentic Code Reasoning

Meta Platforms

Published on: 2026-03-02 1 author
CuTe Layout Representation and Algebra

NVIDIA

Published on: 2026-03-02 1 author
RubricBench: Aligning Model-Generated Rubrics with Human Standards

Tencent / University of Illinois Springfield

Published on: 2026-03-02 1 author
WorldStereo: Bridging Camera-Guided Video Generation and Scene Reconstruction via 3D Geometric Memories

Tencent / Zhejiang University

Published on: 2026-03-02 1 author
How Well Does Agent Development Reflect Real-World Work?

Carnegie Mellon University, Stanford University

Published on: 2026-03-01 10 authors
Learn Hard Problems During RL with Reference Guided Fine-tuning

ByteDance / University of California

Published on: 2026-03-01 1 author
SSKG Hub: An Expert-Guided Platform for LLM-Empowered Sustainability Standards Knowledge Graphs

Alibaba

Published on: 2026-02-28 1 author
Process-of-Thought Reasoning for Videos

Snap / Nanyang Technological University, Sun Yat-sen University

1 author
Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

Helmholtz Munich, Technical University of Munich, University of Tübingen

Published on: 2026-02-27 3 authors
EMO-R3: Reflective Reinforcement Learning for Emotional Reasoning in Multimodal Large Language Models

Xiaomi / Wuhan University

Published on: 2026-02-27 1 author
Mode Seeking meets Mean Seeking for Fast Long Video Generation

NVIDIA

Published on: 2026-02-27 1 author
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

ByteDance / Tsinghua University

Published on: 2026-02-27 1 author
Imagination Helps Visual Reasoning, But Not Yet in Latent Space

Published on: 2026-02-26 7 authors
MSJoE: Jointly Evolving MLLM and Sampler for Efficient Long-Form Video Understanding

Xiaomi / Tongji University

Published on: 2026-02-26 1 author
ThinkOmni: Lifting Textual Reasoning to Omni-modal Scenarios via Guidance Decoding

Xiaomi / Huazhong University of Science and Technology

Published on: 2026-02-26 1 author
AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

Tencent

Published on: 2026-02-26 1 author
Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Microsoft / Korea Advanced Institute of Science & Technology

Published on: 2026-02-26 1 author
SceneTransporter: Optimal Transport-Guided Compositional Latent Diffusion for Single-Image Structured 3D Scene Generation

Skywork AI / Tsinghua University

Published on: 2026-02-26 1 author
SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model

Skywork AI

Published on: 2026-02-26 1 author
Generative Recommendation for Large-Scale Advertising

Kuaishou Technology

Published on: 2026-02-26 1 author
VGG-T3: Offline Feed-Forward 3D Reconstruction at Scale

NVIDIA / University of Toronto

Published on: 2026-02-26 1 author
DualPath: Breaking the Storage Bandwidth Bottleneck in Agentic LLM Inference

DeepSeek / Peking University, Tsinghua University

Published on: 2026-02-26 1 author
TrajTok: Learning Trajectory Tokens enables better Video Understanding

Apple / University of Washington

Published on: 2026-02-26 1 author
The Art of Efficient Reasoning: Data, Reward, and Optimization

Tencent / The University of Hong Kong

Published on: 2026-02-25 1 author
World Guidance: World Modeling in Condition Space for Action Generation

ByteDance / The University of Hong Kong

Published on: 2026-02-25 1 author
The Design Space of Tri-Modal Masked Diffusion Models

Apple / University of Cambridge

Published on: 2026-02-25 1 author
WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks

Microsoft / University of Illinois Urbana-Champaign

Published on: 2026-02-25 1 author
ActionEngine: From Reactive to Programmatic GUI Agents via State Machine Memory

Published on: 2026-02-24 7 authors
UFO: Unifying Feed-Forward and Optimization-based Methods for Large Driving Scene Modeling

Xiaomi / University of Illinois Urbana-Champaign

Published on: 2026-02-24 1 author
From Pairs to Sequences: Track-Aware Policy Gradients for Keypoint Detection

Xiaomi / Wuhan University

Published on: 2026-02-24 1 author
VGGDrive: Empowering Vision-Language Models with Cross-View Geometric Grounding for Autonomous Driving

Xiaomi / Tianjin University

Published on: 2026-02-24 1 author
Test-Time Training with KV Binding Is Secretly Linear Attention

NVIDIA / University of Toronto, Vector Institute

Published on: 2026-02-24 1 author
