Papers
- ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents
- RIMRULE: Improving Tool-Using Language Agents via MDL-Guided Rule Learning
- A Survey of Automatic Prompt Optimization with Instruction-focused Heuristic-based Search Algorithm
- SEE: Strategic Exploration and Exploitation for Cohesive In-Context Prompt Optimization
- The Behavior Gap: Evaluating Zero-shot LLM Agents in Complex Task-Oriented Dialogs
- Transaction Categorization with Relational Deep Learning in QuickBooks
- Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation
- Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
- Learning to Search Effective Example Sequences for In-Context Learning
- Towards Statistical Factuality Guarantee for Large Vision-Language Models
- HyQE: Ranking Contexts with Hypothetical Query Embeddings
- Survival of the Safest: Towards Secure Prompt Optimization through Interleaved Multi-Objective Evolution
- Data-Driven Discovery of Conservation Laws from Trajectories via Neural Deflation
- SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
- SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
- Surgical-DINO: Adapter Learning of Foundation Models for Depth Estimation in Endoscopic Surgery
- Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision
