Papers
-
LUMINA: LLM-Guided GPU Architecture Exploration via Bottleneck Analysis
-
CollabOD: Collaborative Multi-Backbone with Cross-scale Vision for UAV Small Object Detection
Shenzhen ATC Technology / Hong Kong University of Science and Technology, The University of International Business and Economics Beijing, The University of Sydney
-
Beyond Geometry: Artistic Disparity Synthesis for Immersive 2D-to-3D
-
Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image
Sun Yat-sen University
-
InfoGatherer: Principled Information Seeking via Evidence Retrieval and Strategic Questioning
Canadian Institute for Advanced Research, Dalhousie University, University of British Columbia, University of Washington, Vector Institute
-
The World Won't Stay Still: Programmable Evolution for Agent Benchmarks
-
CORE-Seg: Reasoning-Driven Segmentation for Complex Lesions via Reinforcement Learning
Agency for Science, Technology and Research, Singapore, Nanjing University of Science and Technology, Southeast University
-
DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality
-
Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis
University of Ottawa
-
Design Experiments to Compare Multi-armed Bandit Algorithms
The Chinese University of Hong Kong, University of Toronto
-
BlackMirror: Black-Box Backdoor Detection for Text-to-Image Models via Instruction-Response Deviation
Beijing Institute of Technology, Chinese Academy of Sciences, Sun Yat-sen University, University of Chinese Academy of Sciences
-
Learning Next Action Predictors from Human-Computer Interaction
Hasso Plattner Institute, New York University, Stanford University
-
Weak-SIGReg: Covariance Regularization for Stable Deep Learning
Kreasof AI
-
RAC: Rectified Flow Auto Coder
Nanyang Technological University, Rutgers University, University of Wisconsin-Madison
-
Towards Driver Behavior Understanding: Weakly-Supervised Risk Perception in Driving Scenes
-
Addressing the Ecological Fallacy in Larger LMs with Human Context
Stony Brook University, Vanderbilt University
-
Beyond Static Frames: Temporal Aggregate-and-Restore Vision Transformer for Human Pose Estimation
Zhejiang Gongshang University
-
A Persistent-State Dataflow Accelerator for Memory-Bound Linear Attention Decode on FPGA
University of Southern California
-
FTSplat: Feed-forward Triangle Splatting Network
Nankai University
-
Implicit Style Conditioning: A Structured Style-Rewrite Framework for Low-Resource Character Modeling
Guangdong University of Finance
-
OD-RASE: Ontology-Driven Risk Assessment and Safety Enhancement for Autonomous Driving
Chubu University
-
Facial Expression Recognition Using Residual Masking Network
Ho Chi Minh City University of Technology
-
SLER-IR: Spherical Layer-wise Expert Routing for All-in-One Image Restoration
University of California San Diego
-
XAI for Coding Agent Failures: Transforming Raw Execution Traces into Actionable Insights
Islington College
-
Adaptive Radial Projection on Fourier Magnitude Spectrum for Document Image Skew Estimation
Ho Chi Minh City University of Technology, Vietnam National University Ho Chi Minh City
-
Vessel-Aware Deep Learning for OCTA-Based Detection of AMD
Stony Brook University
-
LucidNFT: LR-Anchored Multi-Reward Preference Optimization for Generative Real-World Super-Resolution
Hong Kong University of Science and Technology
-
Energy-Driven Adaptive Visual Token Pruning for Efficient Vision-Language Models
Hong Kong University of Science and Technology
-
Unify the Views: View-Consistent Prototype Learning for Few-Shot Segmentation
Tongji University
-
Who We Are, Where We Are: Mental Health at the Intersection of Person, Situation, and Large Language Models
Oslo Metropolitan University, Stony Brook University, University of Texas
-
Domain-Adaptive Model Merging across Disconnected Modes
Nanchang University, Peking University, Southeast University, Tongji University
-
OVGGT: O(1) Constant-Cost Streaming Visual Geometry Transformer
National Taiwan University, National Taiwan University of Science and Technology
-
Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence
Peking University
-
Exploring Open-Vocabulary Object Recognition in Images using CLIP
Iwate Prefectural University
-
Skeleton-to-Image Encoding: Enabling Skeleton Representation Learning via Vision-Pretrained Models
Hebei University of Technology, KTH Royal Institute of Technology, Lancaster University, Nanyang Technological University, Shenzhen MSU-BIT University, VinUniversity
-
CR-QAT: Curriculum Relational Quantization-Aware Training for Open-Vocabulary Object Detection
Incheon National University, Korea Advanced Institute of Science & Technology, University of Seoul
-
PROBE: Probabilistic Occupancy BEV Encoding with Analytical Translation Robustness for 3D Place Recognition
-
Imagine How To Change: Explicit Procedure Modeling for Change Captioning
Aalto University, Chinese Academy of Sciences, Sichuan University, University of Chinese Academy of Sciences
-
Breaking Smooth-Motion Assumptions: A UAV Benchmark for Multi-Object Tracking in Complex and Adverse Conditions
Xidian University
-
Towards High-resolution and Disentangled Reference-based Sketch Colorization
The University of Tokyo, Waseda University
-
An Interactive Multi-Agent System for Evaluation of New Product Concepts
-
HarvestFlex: Strawberry Harvesting via Vision-Language-Action Policy Adaptation in the Wild
Beijing Academy of Agriculture and Forestry Sciences, ShanghaiTech University
-
Agent Hunt: Bounty Based Collaborative Autoformalization With LLM Agents
AI4REASON Institute, Chalmers University of Technology, The University of Melbourne, University of Gothenburg
-
Technical Report: Automated Optical Inspection of Surgical Instruments
National University of Computer and Emerging Sciences Islamabad
-
Rank-Factorized Implicit Neural Bias: Scaling Super-Resolution Transformer with FlashAttention
University of Seoul
-
TADPO: Reinforcement Learning Goes Off-road
Carnegie Mellon University
-
Track-SQL: Enhancing Generative Language Models with Dual-Extractive Modules for Schema and Context Tracking in Multi-turn Text-to-SQL
Guangdong Laboratory of Artificial Intelligence and Digital Economy, Guangdong University of Technology, Peng Cheng Laboratory, Shantou University
-
MM-ISTS: Cooperating Irregularly Sampled Time Series Forecasting with Multimodal Vision-Text LLMs
Academy of Sciences Hong Kong, East China Normal University, The Hong Kong Polytechnic University
-
RePer-360: Releasing Perspective Priors for 360° Depth Estimation via Self-Modulation
-
Restoring Linguistic Grounding in VLA Models via Train-Free Attention Recalibration
Fudan University, Singapore Management University, Tsinghua University
