Papers
- When AVSR Meets Video Conferencing: Dataset, Degradation, and the Hidden Mechanism Behind Performance Collapse
- EVA: Efficient Reinforcement Learning for End-to-End Video Agent
- The EU AI Act and the Rights-based Approach to Technological Governance
- Quality Over Clicks: Intrinsic Quality-Driven Iterative Reinforcement Learning for Cold-Start E-Commerce Query Suggestion
- ProGRank: Probe-Gradient Reranking to Defend Dense-Retriever RAG from Corpus Poisoning
- Ran Score: a LLM-based Evaluation Score for Radiology Report Generation
- FixationFormer: Direct Utilization of Expert Gaze Trajectories for Chest X-Ray Classification
- Optimizing Small Language Models for NL2SQL via Chain-of-Thought Fine-Tuning
- PersonalQ: Select, Quantize, and Serve Personalized Diffusion Models for Efficient Inference
- Caption Generation for Dongba Paintings via Prompt Learning and Semantic Fusion
- Weak-PDE-Net: Discovering Open-Form PDEs via Differentiable Symbolic Networks and Weak Formulation
- Cluster-Wise Spatio-Temporal Masking for Efficient Video-Language Pretraining
- Privacy-Preserving EHR Data Transformation via Geometric Operators: A Human-AI Co-Design Technical Report
- Safe Reinforcement Learning with Preference-based Constraint Inference
- AscendOptimizer: Episodic Agent for Ascend NPU Operator Optimization
- Stepwise Variational Inference with Vine Copulas
- Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data
- A PAC-Bayesian approach to generalization for quantum models
- Few-Shot Generative Model Adaption via Identity Injection and Preservation
- Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees
- Beyond Theoretical Bounds: Empirical Privacy Loss Calibration for Text Rewriting Under Local Differential Privacy
- FCL-COD: Weakly Supervised Camouflaged Object Detection with Frequency-aware and Contrastive Learning
- WorldMesh: Generating Navigable Multi-Room 3D Scenes via Mesh-Conditioned Image Diffusion
- Where Experts Disagree, Models Fail: Detecting Implicit Legal Citations in French Court Decisions
- Causal Reconstruction of Sentiment Signals from Sparse News Data
- DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube
- JFTA-Bench: Evaluate LLM's Ability of Tracking and Analyzing Malfunctions Using Fault Trees
- Can Graph Foundation Models Generalize Over Architecture?
- Beyond Hate: Differentiating Uncivil and Intolerant Speech in Multimodal Content Moderation
- A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks
- Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions
- VLA-IAP: Training-Free Visual Token Pruning via Interaction Alignment for Vision-Language-Action Models
- VQ-Jarvis: Retrieval-Augmented Video Restoration Agent with Sharp Vision and Fast Thought
- PaperVoyager: Building Interactive Web with Visual Language Models
- On the use of Aggregation Operators to improve Human Identification using Dental Records
- Can Large Language Models Reason and Optimize Under Constraints?
- AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents
- Zero-Shot Personalization of Objects via Textual Inversion
- Knowledge Access Beats Model Size: Memory Augmented Routing for Persistent AI Agents
- A Sobering Look at Tabular Data Generation via Probabilistic Circuits
- Concept-based explanations of Segmentation and Detection models in Natural Disaster Management
- Cog3DMap: Multi-View Vision-Language Reasoning with 3D Cognitive Maps
- Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation
- Generative Event Pretraining with Foundation Model Alignment
- Traffic Sign Recognition in Autonomous Driving: Dataset, Benchmark, and Field Experiment
- YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
- HUydra: Full-Range Lung CT Synthesis via Multiple HU Interval Generative Modelling
- Assessing the Robustness of Climate Foundation Models under No-Analog Distribution Shifts
- Parametric Knowledge and Retrieval Behavior in RAG Fine-Tuning for Electronic Design Automation
- MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates
