Definition
Datasets in which the classes of the target variable occur in markedly unequal proportions.
Detailed Explanation
Imbalanced data refers to classification problems in which the classes are not represented equally. A model trained on such data tends to be biased toward the majority class and to perform poorly on minority classes. Techniques such as oversampling, undersampling, and synthetic data generation can help address this issue.
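As a rough illustration of the simplest of these techniques, the sketch below performs random oversampling using only the Python standard library. The function name random_oversample, the toy "legit"/"fraud" labels, and the 8-to-2 class split are assumptions made up for this example; they do not come from any particular library or dataset.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    # Start from the original data and append duplicates to it.
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        rows = [x for x, y in zip(features, labels) if y == cls]
        # Sample with replacement; k is 0 for the majority class.
        extra = rng.choices(rows, k=target - n)
        out_x.extend(extra)
        out_y.extend([cls] * (target - n))
    return out_x, out_y


# Toy data (made up): 8 majority rows and 2 minority rows.
X = [[i] for i in range(10)]
y = ["legit"] * 8 + ["fraud"] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

After the call, both classes contain 8 rows. Resampling of this kind is normally applied to the training split only, so that evaluation data keeps the original class proportions.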
Use Cases
Fraud detection, disease diagnosis, anomaly detection