Definition
The degree to which an AI system's decisions can be understood and traced by humans.
Detailed Explanation
Interpretability focuses on making machine learning models understandable to humans at both the local level (explaining an individual prediction) and the global level (characterizing overall model behavior). It includes techniques such as feature importance analysis, partial dependence plots, and model-agnostic interpretation methods. The goal is to understand not just what a model predicts, but why it makes a specific prediction.
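One of the model-agnostic methods mentioned above, permutation feature importance, can be sketched in a few lines: shuffle one feature's values and measure how much the model's error grows, repeating for each feature. The toy model, data, and metric below are hypothetical stand-ins for illustration, not taken from any particular library.

```python
import random

def model(x):
    # Toy "trained" model: relies heavily on feature 0, weakly on
    # feature 1, and ignores feature 2 entirely.
    return 3.0 * x[0] + 0.5 * x[1]

def mse(X, y):
    # Mean squared error of the model on dataset (X, y).
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(y)

def permutation_importance(X, y, n_features, seed=0):
    # For each feature, shuffle its column and record the error increase.
    rng = random.Random(seed)
    baseline = mse(X, y)
    importances = []
    for j in range(n_features):
        shuffled = [row[j] for row in X]
        rng.shuffle(shuffled)
        X_perm = [row[:j] + [v] + row[j + 1:]
                  for row, v in zip(X, shuffled)]
        importances.append(mse(X_perm, y) - baseline)
    return importances

# Synthetic data generated by the same relationship the toy model encodes.
rng = random.Random(1)
X = [[rng.random() for _ in range(3)] for _ in range(200)]
y = [3.0 * x[0] + 0.5 * x[1] for x in X]

imp = permutation_importance(X, y, 3)
# Expect feature 0 to dominate, feature 1 to matter slightly,
# and feature 2 (unused by the model) to score near zero.
```

A large error increase means the model depends on that feature; a near-zero increase means the feature is ignored. This yields a global explanation, while methods like partial dependence plots or local surrogate models address per-prediction reasoning.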
Use Cases
Financial trading algorithms providing clear reasoning for investment decisions; healthcare systems showing which symptoms led to a diagnosis; security systems explaining threat detections.