Class imbalance arises when classes contain unequal numbers of examples, which poses challenges for traditional machine-learning models. This paper proposes two ensemble learning techniques, Hard Example Mining (HEM) and Soft Example Mining (SEM), to handle class imbalance. HEM focuses on hard examples, i.e., those that are misclassified, while SEM focuses on soft examples, i.e., those predicted with low confidence. We incorporate HEM and SEM into the Balanced Cascade architecture with AdaBoost as the base learner. Experiments on benchmark and real-world datasets show that the proposed approaches improve balanced accuracy and F1 score over baseline AdaBoost.
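To make the distinction between hard and soft examples concrete, the following is a minimal sketch rather than the implementation used in this paper. It assumes binary labels in {0, 1}, a decision threshold of 0.5, and a hypothetical `margin` parameter that defines "low predictive confidence" as a predicted probability close to the threshold. The helper names and the toy probabilities are illustrative only; in practice the probabilities would come from the base learner (AdaBoost here). The sketch also shows how balanced accuracy and F1 score are computed.

```python
from typing import List, Sequence


def hard_example_indices(y_true: Sequence[int], proba_pos: Sequence[float],
                         threshold: float = 0.5) -> List[int]:
    """Indices of misclassified examples (assumed definition of 'hard')."""
    return [i for i, (y, p) in enumerate(zip(y_true, proba_pos))
            if int(p >= threshold) != y]


def soft_example_indices(proba_pos: Sequence[float], margin: float = 0.2,
                         threshold: float = 0.5) -> List[int]:
    """Indices of low-confidence examples (assumed definition of 'soft'):
    predicted probability lies within `margin` of the decision threshold."""
    return [i for i, p in enumerate(proba_pos) if abs(p - threshold) < margin]


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 score for the positive (here: minority) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(y == positive and p == positive for y, p in pairs)
    fp = sum(y != positive and p == positive for y, p in pairs)
    fn = sum(y == positive and p != positive for y, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy imbalanced data: six majority (0) and two minority (1) examples.
# The probabilities are made up for illustration.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
proba_pos = [0.10, 0.20, 0.45, 0.60, 0.05, 0.25, 0.55, 0.35]
y_pred = [int(p >= 0.5) for p in proba_pos]

hard = hard_example_indices(y_true, proba_pos)
soft = soft_example_indices(proba_pos)
bal_acc = balanced_accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
```

Under these assumptions an example can be both hard and soft: a misclassified example whose probability lies near the threshold falls into both sets. How the selected examples are then reweighted or resampled inside the ensemble is not specified here and is left out of the sketch.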
We proposed two gradient resampling techniques, HEM and SEM, to handle class imbalance. When integrated into the BCWF ensemble architecture, the proposed approaches achieve higher balanced accuracy and F1 score than baseline AdaBoost in our experiments, especially on highly imbalanced data. In future work, we will explore more advanced ensemble methods and apply our approaches to other real-world problems.