Papers
- Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation
- Are General-Purpose Vision Models All We Need for 2D Medical Image Segmentation? A Cross-Dataset Empirical Study
- Convergence Rate of a Functional Learning Method for Contextual Stochastic Optimization
- 3DTCR: A Physics-Based Generative Framework for Vortex-Following 3D Reconstruction to Improve Tropical Cyclone Intensity Forecasting
- Causal Cellular Context Transfer Learning (C3TL): An Efficient Architecture for Prediction of Unseen Perturbation Effects
- Topo-R1: Detecting Topological Anomalies via Vision-Language Models
- Team RAS in 10th ABAW Competition: Multimodal Valence and Arousal Estimation Approach
- Reference-Free Image Quality Assessment for Virtual Try-On via Human Feedback
- Competition-Aware CPC Forecasting with Near-Market Coverage
- LADR: Locality-Aware Dynamic Rescue for Efficient Text-to-Image Generation with Diffusion Large Language Models
- L2GTX: From Local to Global Time Series Explanations
- GeoChemAD: Benchmarking Unsupervised Geochemical Anomaly Detection for Mineral Exploration
- Fractals made Practical: Denoising Diffusion as Partitioned Iterated Function Systems
- Mitigating Memorization in Text-to-Image Diffusion via Region-Aware Prompt Augmentation and Multimodal Copy Detection
- Rooftop Wind Field Reconstruction Using Sparse Sensors: From Deterministic to Generative Learning Methods
- InterEdit: Navigating Text-Guided Multi-Human 3D Motion Editing
- Human-in-the-Loop LLM Grading for Handwritten Mathematics Assessments
- Influence Malleability in Linearized Attention: Dual Implications of Non-Convergent NTK Dynamics
- DRCY: Agentic Hardware Design Reviews
- V-Bridge: Bridging Video Generative Priors to Versatile Few-shot Image Restoration
- Reasoning over Video: Evaluating How MLLMs Extract, Integrate, and Reconstruct Spatiotemporal Evidence
- Breaking the Tuning Barrier: Zero-Hyperparameters Yield Multi-Corner Analysis Via Learned Priors
- MESD: Detecting and Mitigating Procedural Bias in Intersectional Groups
- SldprtNet: A Large-Scale Multimodal Dataset for CAD Generation in Language-Driven 3D Design
- Beyond Final Answers: CRYSTAL Benchmark for Transparent Multimodal Reasoning Evaluation
- Evaluating VLMs' Spatial Reasoning Over Robot Motion: A Step Towards Robot Planning with Motion Preferences
- BenDFM: A taxonomy and synthetic CAD dataset for manufacturability assessment in sheet metal bending
- Panoramic Multimodal Semantic Occupancy Prediction for Quadruped Robots
- BoSS: A Best-of-Strategies Selector as an Oracle for Deep Active Learning
- ZO-SAM: Zero-Order Sharpness-Aware Minimization for Efficient Sparse Training
- NOIR: Neural Operator mapping for Implicit Representations
- Geometry-Guided Camera Motion Understanding in VideoLLMs
- FDeID-Toolbox: Face De-Identification Toolbox
- Scalable Machines with Intrinsic Higher Mental-State Dynamics
- Developing the PsyCogMetrics AI Lab to Evaluate Large Language Models and Advance Cognitive Science -- A Three-Cycle Action Design Science Study
- Steve-Evolving: Open-World Embodied Self-Evolution via Fine-Grained Diagnosis and Dual-Track Knowledge Distillation
- When Right Meets Wrong: Bilateral Context Conditioning with Reward-Confidence Correction for GRPO
- ESG-Bench: Benchmarking Long-Context ESG Reports for Hallucination Mitigation
- DiT-IC: Aligned Diffusion Transformer for Efficient Image Compression
- Towards Faithful Multimodal Concept Bottleneck Models
- Reconciling In-Context and In-Weight Learning via Dual Representation Space Encoding
- Developing and evaluating a chatbot to support maternal health care
- Semantic Invariance in Agentic AI
- Purifying Generative LLMs from Backdoors without Prior Knowledge or Clean Reference
- Perceive What Matters: Relevance-Driven Scheduling for Multimodal Streaming Perception
- Clustering Astronomical Orbital Synthetic Data Using Advanced Feature Extraction and Dimensionality Reduction Techniques
- MXNorm: Reusing MXFP block scales for efficient tensor normalisation
- Diffusion-Based Feature Denoising and Using NNMF for Robust Brain Tumor Classification
- Towards Spatio-Temporal World Scene Graph Generation from Monocular Videos
- Learnability and Privacy Vulnerability are Entangled in a Few Critical Weights
