Papers
- Beyond Hate: Differentiating Uncivil and Intolerant Speech in Multimodal Content Moderation
- A Critical Review on the Effectiveness and Privacy Threats of Membership Inference Attacks
- Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions
- VLA-IAP: Training-Free Visual Token Pruning via Interaction Alignment for Vision-Language-Action Models
- On the use of Aggregation Operators to improve Human Identification using Dental Records
- Can Large Language Models Reason and Optimize Under Constraints?
- AgentRAE: Remote Action Execution through Notification-based Visual Backdoors against Screenshots-based Mobile GUI Agents
- Zero-Shot Personalization of Objects via Textual Inversion
- Knowledge Access Beats Model Size: Memory Augmented Routing for Persistent AI Agents
- A Sobering Look at Tabular Data Generation via Probabilistic Circuits
- Concept-based explanations of Segmentation and Detection models in Natural Disaster Management
- Cog3DMap: Multi-View Vision-Language Reasoning with 3D Cognitive Maps
- Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation
- Generative Event Pretraining with Foundation Model Alignment
- Traffic Sign Recognition in Autonomous Driving: Dataset, Benchmark, and Field Experiment
- YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception
- HUydra: Full-Range Lung CT Synthesis via Multiple HU Interval Generative Modelling
- Assessing the Robustness of Climate Foundation Models under No-Analog Distribution Shifts
- Parametric Knowledge and Retrieval Behavior in RAG Fine-Tuning for Electronic Design Automation
- MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates
- DBAutoDoc: Automated Discovery and Documentation of Undocumented Database Schemas via Statistical Analysis and Iterative LLM Refinement
- Post-Selection Distributional Model Evaluation
- Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition
- Minibal: Balanced Game-Playing Without Opponent Modeling
- Machine Learning Models for the Early Detection of Burnout in Software Engineering: a Systematic Literature Review
- Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution
- StateLinFormer: Stateful Training Enhancing Long-term Memory in Navigation
- MLLM-HWSI: A Multimodal Large Language Model for Hierarchical Whole Slide Image Understanding
- AuthorMix: Modular Authorship Style Transfer via Layer-wise Adapter Mixing
- PolarAPP: Beyond Polarization Demosaicking for Polarimetric Applications
- Generalization Bounds for Physics-Informed Neural Networks for the Incompressible Navier-Stokes Equations
- Can an LLM Detect Instances of Microservice Infrastructure Patterns?
- MsFormer: Enabling Robust Predictive Maintenance Services for Industrial Devices
- MedCausalX: Adaptive Causal Reasoning with Self-Reflection for Trustworthy Medical Vision-Language Models
- Policy-based Tuning of Autoregressive Image Models with Instance- and Distribution-Level Rewards
- A Synchronized Audio-Visual Multi-View Capture System
- When Language Models Lose Their Mind: The Consequences of Brain Misalignment
- SpecXMaster Technical Report
- NeuroSeg Meets DINOv3: Transferring 2D Self-Supervised Visual Priors to 3D Neuron Segmentation via DINOv3 Initialization
- High-Resolution Tensor-Network Fourier Methods for Exponentially Compressed Non-Gaussian Aggregate Distributions
- Dual-Criterion Curriculum Learning: Application to Temporal Data
- Between Rules and Reality: On the Context Sensitivity of LLM Moral Judgment
- AgentFoX: LLM Agent-Guided Fusion with eXplainability for AI-Generated Image Detection
- Automatic Segmentation of 3D CT scans with SAM2 using a zero-shot approach
- SMSP: A Plug-and-Play Strategy of Multi-Scale Perception for MLLMs to Perceive Visual Illusions
- PiCo: Active Manifold Canonicalization for Robust Robotic Visual Anomaly Detection
- 3rd Place of MeViS-Audio Track of the 5th PVUW: VIRST-Audio
- Polaris: A Gödel Agent Framework for Small Language Models through Experience-Abstracted Policy Repair
- InterDyad: Interactive Dyadic Speech-to-Video Generation by Querying Intermediate Visual Guidance
- A Bayesian Learning Approach for Drone Coverage Network: A Case Study on Cardiac Arrest in Scotland