Papers
- TurnWise: The Gap between Single- and Multi-turn Language Model Capabilities
- Dual Stream Independence Decoupling for True Emotion Recognition under Masked Expressions
- SOMP: Scalable Gradient Inversion for Large Language Models via Subspace-Guided Orthogonal Matching Pursuit
- Anticipatory Planning for Multimodal AI Agents
- IOSVLM: A 3D Vision-Language Model for Unified Dental Diagnosis from Intraoral Scans
- SpokenUS: A Spoken User Simulator for Task-Oriented Dialogue
- Conservative Continuous-Time Treatment Optimization
- InCoder-32B: Code Foundation Model for Industrial Scenarios
- V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising
- Adaptive Moments are Surprisingly Effective for Plug-and-Play Diffusion Sampling
- High-Dimensional Gaussian Mean Estimation under Realizable Contamination
- RaDAR: Relation-aware Diffusion-Asymmetric Graph Contrastive Learning for Recommendation
- Integrating Inductive Biases in Transformers via Distillation for Financial Time Series Forecasting
- DexGrasp-Zero: A Morphology-Aligned Policy for Zero-Shot Cross-Embodiment Dexterous Grasping
- CABTO: Context-Aware Behavior Tree Grounding for Robot Manipulation
- ODIN-Based CPU-GPU Architecture with Replay-Driven Simulation and Emulation
- Empirical Recipes for Efficient and Compact Vision-Language Models
- Beyond Accuracy: Evaluating Forecasting Models by Multi-Echelon Inventory Cost
- WildDepth: A Multimodal Dataset for 3D Wildlife Perception and Depth Estimation
- Is Conformal Factuality for RAG-based LLMs Robust? Novel Metrics and Systematic Insights
- Surg$Σ$: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence
- Deep Reinforcement Learning-driven Edge Offloading for Latency-constrained XR pipelines
- Real-Time Decoding of Movement Onset and Offset for Brain-Controlled Rehabilitation Exoskeleton
- Prompt Programming for Cultural Bias and Alignment of Large Language Models
- Conditional Distributional Treatment Effects: Doubly Robust Estimation and Testing
- An assessment of data-centric methods for label noise identification in remote sensing data sets
- Learning to Present: Inverse Specification Rewards for Agentic Slide Generation
- What DINO saw: ALiBi positional encoding reduces positional bias in Vision Transformers
- Stochastic Resetting Accelerates Policy Convergence in Reinforcement Learning
- Internalizing Agency from Reflective Experience
- M^3: Dense Matching Meets Multi-View Foundation Models for Monocular Gaussian Splatting SLAM
- Dynamic Meta-Layer Aggregation for Byzantine-Robust Federated Learning
- Mediocrity is the key for LLM as a Judge Anchor Selection
- GIST: Gauge-Invariant Spectral Transformers for Scalable Graph Neural Operators
- Unifying Optimization and Dynamics to Parallelize Sequential Computation: A Guide to Parallel Newton Methods for Breaking Sequential Bottlenecks
- Online Experiential Learning for Language Models
- Long-Horizon Traffic Forecasting via Incident-Aware Conformal Spatio-Temporal Transformers
- SOMA: Unifying Parametric Human Body Models
- SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models
- Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory
- SparkVSR: Interactive Video Super-Resolution via Sparse Keyframe Propagation
- ManiTwin: Scaling Data-Generation-Ready Digital Object Dataset to 100K
- MessyKitchens: Contact-rich object-level 3D scene reconstruction
- Efficient Reasoning on the Edge
- SegviGen: Repurposing 3D Generative Model for Part Segmentation
- Demystifing Video Reasoning
- WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation
- LLM NL2SQL Robustness: Surface Noise vs. Linguistic Variation in Traditional and Agentic Settings
- Transformers Can Learn Rules They've Never Seen: Proof of Computation Beyond Interpolation
- Generative AI-assisted Participatory Modeling in Socio-Environmental Planning under Deep Uncertainty
