Papers
-
RePer-360: Releasing Perspective Priors for 360$^\circ$ Depth Estimation via Self-Modulation
-
Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration
Fudan University, Singapore Management University, Tsinghua University
-
Demystifying KAN for Vision Tasks: The RepKAN Approach
Sejong University
-
EvoESAP: Non-Uniform Expert Pruning for Sparse MoE
Mohamed bin Zayed University of Artificial Intelligence, Westlake University, Zhejiang University
-
MASFactory: A Graph-centric Framework for Orchestrating LLM-Based Multi-Agent Systems with Vibe Graphing
Beijing University of Posts and Telecommunications, Shanghai Jiao Tong University
-
Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments
-
EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation
-
MOSIV: Multi-Object System Identification from Videos
Insta360 / Carnegie Mellon University, ETH Zurich, Georgia Tech, Harvard University, University of California, University of Illinois Urbana-Champaign
-
ViewFusion: Structured Spatial Thinking Chains for Multi-View Reasoning
Hong Kong University of Science and Technology, University of California, University of Queensland
-
Sensitivity-Aware Retrieval-Augmented Intent Clarification
University of Amsterdam
-
Agnostic learning in (almost) optimal time via Gaussian surface area
ETH Zurich, University of Amsterdam
-
Improved high-dimensional estimation with Langevin dynamics and stochastic weight averaging
Harvard University, Princeton University, University of California
-
ResearchEnvBench: Benchmarking Agents on Environment Synthesis for Research Code Execution
Fudan University, Jilin University, Nanjing University, OpenMOSS, Shanghai Innovation Institution, Shanghai Key Laboratory of Multimodal Embodied AI, Wuhan University
-
StruVis: Enhancing Reasoning-based Text-to-Image Generation via Thinking with Structured Vision
AntGroup / East China Normal University, Hong Kong University of Science and Technology, Shanghai Jiao Tong University
-
ViroGym: Realistic Large-Scale Benchmarks for Evaluating Viral Proteins
-
Occlusion-Aware SORT: Observing Occlusion for Robust Multi-Object Tracking
Chinese Academy of Sciences, Sichuan University
-
Ensemble Learning with Sparse Hypercolumns
Dublin City University
-
Heterogeneous Decentralized Diffusion Models
Bagel Lab
-
Improved Constrained Generation by Bridging Pretrained Generative Models
-
FontUse: A Data-Centric Approach to Style- and Use-Case-Conditioned In-Image Typography
University of Tsukuba
-
Stabilizing Reinforcement Learning for Diffusion Language Models
-
Learning to Generate via Understanding: Understanding-Driven Intrinsic Rewarding for Unified Multimodal Models
Baidu / Chinese Academy of Sciences, Peking University, Sun Yat-sen University, University of Chinese Academy of Sciences
-
GenHOI: Towards Object-Consistent Hand-Object Interaction with Temporally Balanced and Spatially Selective Object Injection
-
Devil is in Narrow Policy: Unleashing Exploration in Driving VLA Models
-
Probing Visual Concepts in Lightweight Vision-Language Models for Automated Driving
University of Limerick
-
TempoSyncDiff: Distilled Temporally-Consistent Diffusion for Low-Latency Audio-Driven Talking Head Generation
Gargi Memorial Institute of Technology, Variable Energy Cyclotron Centre
-
Transforming Omnidirectional RGB-LiDAR data into 3D Gaussian Splatting
State University of New York
-
Agentic LLM Planning via Step-Wise PDDL Simulation: An Empirical Characterisation
Austrian Institute of Technology
-
Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring
Salzburg University of Applied Sciences
-
Aggregative Semantics for Quantitative Bipolar Argumentation Frameworks
Sorbonne University
-
Text-Driven Emotionally Continuous Talking Face Generation
Harbin Institute of Technology
-
Lifelong Embodied Navigation Learning
Chinese Academy of Sciences, Mohamed bin Zayed University of Artificial Intelligence, University of Chinese Academy of Sciences
-
StreamVoiceAnon+: Emotion-Preserving Streaming Speaker Anonymization via Frame-Level Acoustic Distillation
Agency for Science, Technology and Research, Singapore, Institute for Infocomm Research, Nanyang Technological University, The Hong Kong Polytechnic University
-
Lyapunov Probes for Hallucination Detection in Large Foundation Models
Beihang University, Beijing Academy of Blockchain and Edge Computing, National University of Defense Technology, ShanghaiTech University
-
Offline Materials Optimization with CliqueFlowmer
-
Experiences Build Characters: The Linguistic Origins and Functional Impact of LLM Personality
The University of Sheffield, University of Oklahoma
-
DeepSight: Bridging Depth Maps and Language with a Depth-Driven Multimodal Model
Harbin Institute of Technology
-
Enhancing Neural Video Compression of Static Scenes with Positive-Incentive Noise
TeleAI
-
Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection
Yonsei University
-
ButterflyViT: 354$\times$ Expert Compression for Edge Vision Transformers
Indian Institute of Information Technology
-
Latent Diffusion-Based 3D Molecular Recovery from Vibrational Spectra
State Key Laboratory of Precision and Intelligent Chemistry, University of Birmingham, University of Science and Technology of China
-
Making Implicit Premises Explicit in Logical Understanding of Enthymemes
University College London
-
Dynamic Momentum Recalibration in Online Gradient Learning
Northeastern University, Shenyang University of Chemical Technology, University of Louisville
-
FedARKS: Federated Aggregation via Robust and Discriminative Knowledge Selection and Integration for Person Re-identification
Wuhan University of Science and Technology
-
Diffusion Language Models Are Natively Length-Aware
Bocconi University
-
A Hazard-Informed Data Pipeline for Robotics Physical Safety
-
DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection
Hangzhou Dianzi University, Zhejiang University
-
A Causal Graph Approach to Oppositional Narrative Analysis
University of Deusto
-
Cross-Resolution Distribution Matching for Diffusion Distillation
-
Partial Policy Gradients for RL in LLMs
