Papers
- Viewpoint-Agnostic Grasp Pipeline using VLM and Partial Observations
- Slumbering to Precision: Enhancing Artificial Neural Network Calibration Through Sleep-like Processes
- Hospitality-VQA: Decision-Oriented Informativeness Evaluation for Vision-Language Models
- Toward Unified Multimodal Representation Learning for Autonomous Driving
- What Do AI Agents Talk About? Emergent Communication Structure in the First AI-Only Social Network
- CCR-Bench: A Comprehensive Benchmark for Evaluating LLMs on Complex Constraints, Control Flows, and Real-World Cases
- Reject, Resample, Repeat: Understanding Parallel Reasoning in Language Model Inference
- VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?
- Structure and Progress Aware Diffusion for Medical Image Segmentation
- Visualizing Coalition Formation: From Hedonic Games to Image Segmentation
- A Lightweight Traffic Map for Efficient Anytime LaCAM*
- Revisiting Sharpness-Aware Minimization: A More Faithful and Effective Implementation
- Designing probabilistic AI monsoon forecasts to inform agricultural decision-making
- MINT: Molecularly Informed Training with Spatial Transcriptomics Supervision for Pathology Foundation Models
- SMGI: A Structural Theory of General Artificial Intelligence
- LeJOT-AutoML: LLM-Driven Feature Engineering for Job Execution Time Prediction in Databricks Cost Optimization
- Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning
- Bayesian Transformer for Probabilistic Load Forecasting in Smart Grids
- EveryQuery: Zero-Shot Clinical Prediction via Task-Conditioned Pretraining over Electronic Health Records
- NaviDriveVLM: Decoupling High-Level Reasoning and Motion Planning for Autonomous Driving
- DyQ-VLA: Temporal-Dynamic-Aware Quantization for Embodied Vision-Language-Action Models
- Long-Short Term Agents for Pure-Vision Bronchoscopy Robotic Autonomy
- Beyond Heuristic Prompting: A Concept-Guided Bayesian Framework for Zero-Shot Image Recognition
- Geometric Transformation-Embedded Mamba for Learned Video Compression
- Ares: Adaptive Reasoning Effort Selection for Efficient LLM Agents
- Rel-MOSS: Towards Imbalanced Relational Deep Learning on Relational Databases
- Enhancing Unregistered Hyperspectral Image Super-Resolution via Unmixing-based Abundance Fusion Learning
- RLPR: Radar-to-LiDAR Place Recognition via Two-Stage Asymmetric Cross-Modal Alignment for Autonomous Driving
- Robust Transfer Learning with Side Information
- Semantic Risk Scoring of Aggregated Metrics: An AI-Driven Approach for Healthcare Data Governance
- SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning
- IMSE: Intrinsic Mixture of Spectral Experts Fine-tuning for Test-Time Adaptation
- SWE-Fuse: Empowering Software Agents via Issue-free Trajectory Learning and Entropy-aware RLVR Training
- A Hybrid Vision Transformer Approach for Mathematical Expression Recognition
- BRIDGE: Benchmark for multi-hop Reasoning In long multimodal Documents with Grounded Evidence
- Text to Automata Diagrams: Comparing TikZ Code Generation with Direct Image Synthesis
- $L^3$: Scene-agnostic Visual Localization in the Wild
- AI Agents, Language, Deep Learning and the Next Revolution in Science
- ELLMob: Event-Driven Human Mobility Generation with Self-Aligned LLM Framework
- VisualAD: Language-Free Zero-Shot Anomaly Detection via Vision Transformer
- RL unknotter, hard unknots and unknotting number
- PSTNet: Physically-Structured Turbulence Network
- SGG-R$^{\rm 3}$: From Next-Token Prediction to End-to-End Unbiased Scene Graph Generation
- Local Constrained Bayesian Optimization
- Listening with the Eyes: Benchmarking Egocentric Co-Speech Grounding across Space and Time
- Advancing Automated Algorithm Design via Evolutionary Stagewise Design with LLMs
- Adaptive Collaboration with Humans: Metacognitive Policy Optimization for Multi-Agent LLMs with Continual Learning
- VORL-EXPLORE: A Hybrid Learning Planning Approach to Multi-Robot Exploration in Dynamic Environments
- Scaling Machine Learning Interatomic Potentials with Mixtures of Experts
- OSExpert: Computer-Use Agents Learning Professional Skills via Exploration
