Papers
-
98$\times$ Faster LLM Routing Without a Dedicated GPU: Flash Attention, Prompt Compression, and Near-Streaming for the vLLM Semantic Router
-
LR-SGS: Robust LiDAR-Reflectance-Guided Salient Gaussian Splatting for Self-Driving Scene Reconstruction
-
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space
-
Sobolev--Ricci Curvature
-
VGGT-World: Transforming VGGT into an Autoregressive Geometry World Model
-
VFM-Recon: Unlocking Cross-Domain Scene-Level Neural Reconstruction with Scale-Aligned Foundation Priors
-
Continual Learning in Large Language Models: Methods, Challenges, and Opportunities
-
AVION: Aerial Vision-Language Instruction from Offline Teacher to Prompt-Tuned Network
-
CHIMERA-Bench: A Benchmark Dataset for Epitope-Specific Antibody Design
-
Learning Geometric and Photometric Features from Panoramic LiDAR Scans for Outdoor Place Categorization
-
From Text to Forecasts: Bridging Modality Gap with Temporal Evolution Semantic Space
-
Spatial Transcriptomics as Images for Large-Scale Pretraining
-
RetroReasoner: A Reasoning LLM for Strategic Retrosynthesis Prediction
-
Marker-Based 3D Reconstruction of Aggregates with a Comparative Analysis of 2D and 3D Morphologies
-
Vision Verification Enhanced Fusion of VLMs for Efficient Visual Reasoning
-
Spatially Grounded Long-Horizon Task Planning in the Wild
-
Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs
-
MetaKE: Meta-learning Aligned Knowledge Editing via Bi-level Optimization
-
G2HFNet: GeoGran-Aware Hierarchical Feature Fusion Network for Salient Object Detection in Optical Remote Sensing Images
-
Colluding LoRA: A Composite Attack on LLM Safety Alignment
-
Experimental evidence of progressive ChatGPT models self-convergence
-
Federated Hierarchical Clustering with Automatic Selection of Optimal Cluster Numbers
-
RSONet: Region-guided Selective Optimization Network for RGB-T Salient Object Detection
-
STRAP-ViT: Segregated Tokens with Randomized -- Transformations for Defense against Adversarial Patches in ViTs
-
CM-Bench: A Comprehensive Cross-Modal Feature Matching Benchmark Bridging Visible and Infrared Images
-
HSEmotion Team at ABAW-10 Competition: Facial Expression Recognition, Valence-Arousal Estimation, Action Unit Detection and Fine-Grained Violence Classification
-
RXNRECer Enables Fine-grained Enzymatic Function Annotation through Active Learning and Protein Language Models
-
HaltNav: Reactive Visual Halting over Lightweight Topological Priors for Robust Vision-Language Navigation
-
EvolveCoder: Evolving Test Cases via Adversarial Verification for Code Reinforcement Learning
-
Seeing Eye to Eye: Enabling Cognitive Alignment Through Shared First-Person Perspective in Human-AI Collaboration
-
FGTR: Fine-Grained Multi-Table Retrieval via Hierarchical LLM Reasoning
-
VCBench: A Streaming Counting Benchmark for Spatial-Temporal State Maintenance in Long Videos
-
Cost-Efficient Multimodal LLM Inference via Cross-Tier GPU Heterogeneity
-
HFP-SAM: Hierarchical Frequency Prompted SAM for Efficient Marine Animal Segmentation
-
AI Planning Framework for LLM-Based Web Agents
-
Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval
-
Design-Specification Tiling for ICL-based CAD Code Generation
-
Deep Learning Based Estimation of Blood Glucose Levels from Multidirectional Scleral Blood Vessel Imaging
-
UNIStainNet: Foundation-Model-Guided Virtual Staining of H&E to IHC
-
Altered Thoughts, Altered Actions: Probing Chain-of-Thought Vulnerabilities in VLA Robotic Manipulation
-
The COTe score: A decomposable framework for evaluating Document Layout Analysis models
-
IGASA: Integrated Geometry-Aware and Skip-Attention Modules for Enhanced Point Cloud Registration
-
CMHANet: A Cross-Modal Hybrid Attention Network for Point Cloud Registration
-
CognitionCapturerPro: Towards High-Fidelity Visual Decoding from EEG/MEG via Multi-modal Information and Asymmetric Alignment
-
SciDesignBench: Benchmarking and Improving Language Models for Scientific Inverse Design
-
Graph In-Context Operator Networks for Generalizable Spatiotemporal Prediction
-
Anchored Alignment: Preventing Positional Collapse in Multimodal Recommender Systems
-
On Using Machine Learning to Early Detect Catastrophic Failures in Marine Diesel Engines
-
VecMol: Vector-Field Representations for 3D Molecule Generation
-
SRAM-Based Compute-in-Memory Accelerator for Linear-decay Spiking Neural Networks
