Definition
The process of adding meaningful labels or tags to raw data for use in supervised machine learning.
Detailed Explanation
Similar to data annotation but specifically focused on assigning categorical or numerical labels to data points. This process creates the ground truth that machine learning models use to learn patterns and make predictions. It often involves quality control measures and consensus mechanisms to ensure label accuracy.
Use Cases
Classification of customer support tickets, Medical image diagnosis training, Sentiment analysis datasets