Papers
- CR-QAT: Curriculum Relational Quantization-Aware Training for Open-Vocabulary Object Detection
- PROBE: Probabilistic Occupancy BEV Encoding with Analytical Translation Robustness for 3D Place Recognition
- Imagine How To Change: Explicit Procedure Modeling for Change Captioning
- Breaking Smooth-Motion Assumptions: A UAV Benchmark for Multi-Object Tracking in Complex and Adverse Conditions
- Towards High-resolution and Disentangled Reference-based Sketch Colorization
- An Interactive Multi-Agent System for Evaluation of New Product Concepts
- HarvestFlex: Strawberry Harvesting via Vision-Language-Action Policy Adaptation in the Wild
- Agent Hunt: Bounty Based Collaborative Autoformalization With LLM Agents
- Technical Report: Automated Optical Inspection of Surgical Instruments
- Rank-Factorized Implicit Neural Bias: Scaling Super-Resolution Transformer with FlashAttention
- TADPO: Reinforcement Learning Goes Off-road
- Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL
- MM-ISTS: Cooperating Irregularly Sampled Time Series Forecasting with Multimodal Vision-Text LLMs
- RePer-360: Releasing Perspective Priors for 360$^\circ$ Depth Estimation via Self-Modulation
- Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration
- Demystifying KAN for Vision Tasks: The RepKAN Approach
- EvoESAP: Non-Uniform Expert Pruning for Sparse MoE
- MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing
- Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments
- EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation
- MOSIV: Multi-Object System Identification from Videos
- ViewFusion: Structured Spatial Thinking Chains for Multi-View Reasoning
- Sensitivity-Aware Retrieval-Augmented Intent Clarification
- Agnostic learning in (almost) optimal time via Gaussian surface area
- Improved high-dimensional estimation with Langevin dynamics and stochastic weight averaging
- ResearchEnvBench: Benchmarking Agents on Environment Synthesis for Research Code Execution
- StruVis: Enhancing Reasoning-based Text-to-Image Generation via Thinking with Structured Vision
- ViroGym: Realistic Large-Scale Benchmarks for Evaluating Viral Proteins
- Occlusion-Aware SORT: Observing Occlusion for Robust Multi-Object Tracking
- Ensemble Learning with Sparse Hypercolumns
- Heterogeneous Decentralized Diffusion Models
- FontUse: A Data-Centric Approach to Style- and Use-Case-Conditioned In-Image Typography
- Stabilizing Reinforcement Learning for Diffusion Language Models
- Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
- GenHOI: Towards Object-Consistent Hand-Object Interaction with Temporally Balanced and Spatially Selective Object Injection
- Devil is in Narrow Policy: Unleashing Exploration in Driving VLA Models
- Probing Visual Concepts in Lightweight Vision-Language Models for Automated Driving
- TempoSyncDiff: Distilled Temporally-Consistent Diffusion for Low-Latency Audio-Driven Talking Head Generation
- Transforming Omnidirectional RGB-LiDAR data into 3D Gaussian Splatting
- Text-Driven Emotionally Continuous Talking Face Generation
- Lifelong Embodied Navigation Learning
- StreamVoiceAnon+: Emotion-Preserving Streaming Speaker Anonymization via Frame-Level Acoustic Distillation
- Lyapunov Probes for Hallucination Detection in Large Foundation Models
- Offline Materials Optimization with CliqueFlowmer
- Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality
- DeepSight: Bridging Depth Maps and Language with a Depth-Driven Multimodal Model
- Enhancing Neural Video Compression of Static Scenes with Positive-Incentive Noise
- Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection
- ButterflyViT: 354$\times$ Expert Compression for Edge Vision Transformers
- Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra
