## References

[1] M. Kubat, R. C. Holte, and S. Matwin, “Machine Learning for the Detection of Oil Spills in Satellite Radar Images,” Machine Learning, vol. 30, no. 2–3, pp. 195–215, 1998.

[2] G. M. Weiss, “Mining with Rare Cases,” in The Data Mining and Knowledge Discovery Handbook (O. Maimon and L. Rokach, eds.), pp. 765–776, Springer, 2005.

[3] G. M. Weiss, “Mining with rarity: a unifying framework,” SIGKDD Explorations, vol. 6, no. 1, pp. 7–19, 2004.

[4] R. Prati, G. Batista, and M. Monard, “Class imbalances versus class overlapping: an analysis of a learning system behavior,” MICAI 2004: Advances in Artificial Intelligence, pp. 312–321, 2004.

[5] T. Jo and N. Japkowicz, “Class imbalances versus small disjuncts,” SIGKDD Explorations, vol. 6, no. 1, pp. 40–49, 2004.

[6] R. C. Prati, G. E. A. P. A. Batista, and M. C. Monard, “Learning with Class Skews and Small Disjuncts,” in SBIA (A. L. C. Bazzan and S. Labidi, eds.), vol. 3171 of Lecture Notes in Computer Science, pp. 296–306, Springer, 2004.

[7] J. R. Quinlan, “Induction of Decision Trees,” Machine Learning, vol. 1, no. 1, pp. 81–106, 1986.

[8] W.-H. Yang, D.-Q. Dai, and H. Yan, “Feature Extraction and Uncorrelated Discriminant Analysis for High-Dimensional Data,” IEEE Trans. Knowl. Data Eng., vol. 20, no. 5, pp. 601–614, 2008.

[9] H. He and E. A. Garcia, “Learning from Imbalanced Data,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284, 2009.

[10] H. He and Y. Ma, Imbalanced Learning: Foundations, Algorithms, and Applications. Wiley-IEEE Press, 1st ed., 2013.

[11] P. Branco, L. Torgo, and R. P. Ribeiro, “A Survey of Predictive Modeling on Imbalanced Domains,” ACM Comput. Surv., vol. 49, no. 2, pp. 31:1–31:50, 2016.

[12] D. Mease, A. Wyner, and A. Buja, “Boosted classification trees and class probability/quantile estimation,” The Journal of Machine Learning Research, vol. 8, pp. 409–439, 2007.

[13] C. Drummond and R. Holte, “C4.5, class imbalance, and cost sensitivity: why under-sampling beats over-sampling,” in Workshop on Learning from Imbalanced Datasets II, pp. 1–8, 2003.

[14] R. C. Holte, L. Acker, and B. W. Porter, “Concept Learning and the Problem of Small Disjuncts,” in IJCAI (N. S. Sridharan, ed.), pp. 813–818, Morgan Kaufmann, 1989.

[15] X.-Y. Liu and Z.-H. Zhou, “The Influence of Class Imbalance on Cost-Sensitive Learning: An Empirical Study,” in ICDM, pp. 970–974, IEEE Computer Society, 2006.

[16] J. Zhang and I. Mani, “KNN Approach to Unbalanced Data Distributions: A Case Study Involving Information Extraction,” in Proceedings of the ICML’2003 Workshop on Learning from Imbalanced Datasets, 2003.

[17] M. Kubat and S. Matwin, “Addressing the Curse of Imbalanced Training Sets: One-Sided Selection,” in Proceedings of the Fourteenth International Conference on Machine Learning, pp. 179–186, Morgan Kaufmann, 1997.

[18] N. Chawla, K. Bowyer, L. Hall, and W. Kegelmeyer, “SMOTE: Synthetic Minority Over-sampling Technique,” Journal of Artificial Intelligence Research, vol. 16, pp. 321–357, 2002.

[19] B. X. Wang and N. Japkowicz, “Imbalanced Data Set Learning with Synthetic Samples,” 2004.

[20] H. Han, W. Wang, and B. Mao, “Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning,” in ICIC (1) (D.-S. Huang, X.-P. Zhang, and G.-B. Huang, eds.), vol. 3644 of Lecture Notes in Computer Science, pp. 878–887, Springer, 2005.

[21] S. Barua, M. M. Islam, X. Yao, and K. Murase, “MWMOTE-Majority Weighted Minority Oversampling Technique for Imbalanced Data Set Learning,” IEEE Trans. Knowl. Data Eng., vol. 26, no. 2, pp. 405–425, 2014.

[22] A. Agrawal, H. L. Viktor, and E. Paquet, “SCUT: Multi-Class Imbalanced Data Classification using SMOTE and Cluster-based Undersampling,” in KDIR (A. L. N. Fred, J. L. G. Dietz, D. Aveiro, K. Liu, and J. Filipe, eds.), pp. 226–234, SciTePress, 2015.

[23] H. He, Y. Bai, E. A. Garcia, and S. Li, “ADASYN: Adaptive synthetic sampling approach for imbalanced learning,” in IJCNN, pp. 1322–1328, IEEE, 2008.

[24] S. Barua, M. M. Islam, and K. Murase, “ProWSyn: Proximity Weighted Synthetic Oversampling Technique for Imbalanced Data Set Learning,” in PAKDD (2) (J. Pei, V. S. Tseng, L. Cao, H. Motoda, and G. Xu, eds.), vol. 7819 of Lecture Notes in Computer Science, pp. 317–328, Springer, 2013.

[25] Y. Dong and X. Wang, “A New Over-Sampling Approach: Random-SMOTE for Learning from Imbalanced Data Sets,” in KSEM (H. Xiong and W. B. Lee, eds.), vol. 7091 of Lecture Notes in Computer Science, pp. 343–352, Springer, 2011.

[26] I. Tomek, “Two Modifications of CNN,” IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-6, no. 11, pp. 769–772, 1976.

[27] G. E. A. P. A. Batista, R. C. Prati, and M. C. Monard, “A Study of the Behavior of Several Methods for Balancing Machine Learning Training Data,” ACM SIGKDD Explorations Newsletter – Special issue on learning from imbalanced datasets, vol. 6, no. 1, pp. 20–29, 2004.

[28] J. Laurikkala, “Improving Identification of Difficult Small Classes by Balancing Class Distribution,” in AIME (S. Quaglini, P. Barahona, and S. Andreassen, eds.), vol. 2101 of Lecture Notes in Computer Science, pp. 63–66, Springer, 2001.

[29] N. V. Chawla, A. Lazarevic, L. O. Hall, and K. W. Bowyer, “SMOTEBoost: Improving Prediction of the Minority Class in Boosting,” in PKDD (N. Lavrac, D. Gamberger, H. Blockeel, and L. Todorovski, eds.), vol. 2838 of Lecture Notes in Computer Science, pp. 107–119, Springer, 2003.

[30] H. Guo and H. L. Viktor, “Learning from imbalanced data sets with boosting and data generation: the DataBoost-IM approach,” SIGKDD Explorations, vol. 6, no. 1, pp. 30–39, 2004.

[31] H. Guo and H. L. Viktor, “Boosting with Data Generation: Improving the Classification of Hard to Learn Examples,” in IEA/AIE (R. Orchard, C. Yang, and M. Ali, eds.), vol. 3029 of Lecture Notes in Computer Science, pp. 1082–1091, Springer, 2004.

[32] S. Chen, H. He, and E. A. Garcia, “RAMOBoost: Ranked Minority Oversampling in Boosting,” IEEE Trans. Neural Networks, vol. 21, no. 10, pp. 1624–1642, 2010.

[33] C. Seiffert, T. M. Khoshgoftaar, J. V. Hulse, and A. Napolitano, “RUSBoost: A Hybrid Approach to Alleviating Class Imbalance,” IEEE Trans. Systems, Man, and Cybernetics, Part A, vol. 40, no. 1, pp. 185–197, 2010.

[34] X. Zhang and B.-G. Hu, “A New Strategy of Cost-Free Learning in the Class Imbalance Problem,” IEEE Trans. Knowl. Data Eng., vol. 26, no. 12, pp. 2872–2885, 2014.

[35] Y. Peng, “Adaptive Sampling with Optimal Cost for Class-Imbalance Learning,” in AAAI (B. Bonet and S. Koenig, eds.), pp. 2921–2927, AAAI Press, 2015.

[36] W. Zong, G.-B. Huang, and Y. Chen, “Weighted extreme learning machine for imbalance learning,” Neurocomputing, vol. 101, pp. 229–242, 2013.

[37] X. Gao, Z. Chen, S. Tang, Y. Zhang, and J. Li, “Adaptive weighted imbalance learning with application to abnormal activity recognition,” Neurocomputing, vol. 173, pp. 1927–1935, 2016.

[38] I. Nekooeimehr and S. K. Lai-Yuen, “Adaptive semi-unsupervised weighted oversampling (A-SUWO) for imbalanced datasets,” Expert Syst. Appl., vol. 46, pp. 405–416, 2016.

[39] M. A. Tahir, J. Kittler, and F. Yan, “Inverse random under sampling for class imbalance problem and its application to multi-label classification,” Pattern Recognition, vol. 45, no. 10, pp. 3738–3750, 2012.

[40] C. Elkan, “The Foundations of Cost-Sensitive Learning,” in IJCAI, pp. 973–978, 2001.

[41] N. V. Chawla, N. Japkowicz, and A. Kotcz, “Editorial: special issue on learning from imbalanced data sets,” SIGKDD Explorations, vol. 6, no. 1, pp. 1–6, 2004.

[42] B. Zadrozny, J. Langford, and N. Abe, “Cost-Sensitive Learning by Cost-Proportionate Example Weighting,” in ICDM, pp. 435–442, IEEE Computer Society, 2003.

[43] P. M. Domingos, “MetaCost: A General Method for Making Classifiers Cost-Sensitive,” in KDD (U. M. Fayyad, S. Chaudhuri, and D. Madigan, eds.), pp. 155–164, ACM, 1999.

[44] Y. Freund and R. E. Schapire, “Experiments with a New Boosting Algorithm,” in International Conference on Machine Learning, pp. 148–156, 1996.

[45] Y. Sun, M. S. Kamel, A. K. C. Wong, and Y. Wang, “Cost-sensitive boosting for classification of imbalanced data,” Pattern Recognition, vol. 40, no. 12, pp. 3358–3378, 2007.

[46] W. Fan, S. J. Stolfo, J. Zhang, and P. K. Chan, “AdaCost: Misclassification Cost-Sensitive Boosting,” in ICML (I. Bratko and S. Dzeroski, eds.), pp. 97–105, Morgan Kaufmann, 1999.

[47] M. A. Maloof, “Learning When Data Sets are Imbalanced and When Costs are Unequal and Unknown,” in Proceedings of the ICML-2003 Workshop on Learning from Imbalanced Data Sets II, 2003.

[48] K. M. Ting, “An Instance-Weighting Method to Induce Cost-Sensitive Trees,” IEEE Trans. Knowl. Data Eng., vol. 14, no. 3, pp. 659–665, 2002.

[49] C. Drummond and R. C. Holte, “Exploiting the Cost (In)sensitivity of Decision Tree Splitting Criteria,” in ICML (P. Langley, ed.), pp. 239–246, Morgan Kaufmann, 2000.

[50] M. Kukar and I. Kononenko, “Cost-Sensitive Learning with Neural Networks,” in ECAI, pp. 445–449, 1998.

[51] B. Krawczyk and M. Wozniak, “Cost-Sensitive Neural Network with ROC-Based Moving Threshold for Imbalanced Classification,” in IDEAL (K. Jackowski, R. Burduk, K. Walkowiak, M. Wozniak, and H. Yin, eds.), vol. 9375 of Lecture Notes in Computer Science, pp. 45–52, Springer, 2015.

[52] R. Akbani, S. Kwek, and N. Japkowicz, “Applying Support Vector Machines to Imbalanced Datasets,” in ECML (J.-F. Boulicaut, F. Esposito, F. Giannotti, and D. Pedreschi, eds.), vol. 3201 of Lecture Notes in Computer Science, pp. 39–50, Springer, 2004.

[53] F. Vilariño, P. Spyridonos, J. Vitrià, and P. Radeva, “Experiments with SVM and Stratified Sampling with an Imbalanced Problem: Detection of Intestinal Contractions,” in ICAPR (2) (S. Singh, M. Singh, C. Apté, and P. Perner, eds.), vol. 3687 of Lecture Notes in Computer Science, pp. 783–791, Springer, 2005.

[54] P. Kang and S. Cho, “EUS SVMs: Ensemble of Under-Sampled SVMs for Data Imbalance Problems,” in ICONIP (1) (I. King, J. Wang, L. Chan, and D. L. Wang, eds.), vol. 4232 of Lecture Notes in Computer Science, pp. 837–846, Springer, 2006.

[55] Y. Liu, A. An, and X. Huang, “Boosting Prediction Accuracy on Imbalanced Datasets with SVM Ensembles,” in PAKDD (W. K. Ng, M. Kitsuregawa, J. Li, and K. Chang, eds.), vol. 3918 of Lecture Notes in Computer Science, pp. 107–118, Springer, 2006.

[56] B. X. Wang and N. Japkowicz, “Boosting support vector machines for imbalanced data sets,” Knowl. Inf. Syst., vol. 25, no. 1, pp. 1–20, 2010.

[57] Y. Tang and Y.-Q. Zhang, “Granular SVM with Repetitive Undersampling for Highly Imbalanced Protein Homology Prediction,” in GrC, pp. 457–460, IEEE, 2006.

[58] G. Wu and E. Y. Chang, “Aligning Boundary in Kernel Space for Learning Imbalanced Dataset,” in Proceedings of the Fourth IEEE International Conference on Data Mining, ICDM ’04, (Washington, DC, USA), pp. 265–272, IEEE Computer Society, 2004.

[59] X. Hong, S. Chen, and C. J. Harris, “A Kernel-Based Two-Class Classifier for Imbalanced Data Sets,” IEEE Transactions on Neural Networks, vol. 18, no. 1, pp. 28–41, 2007.

[60] Y. Xu, Y. Zhang, Z. Yang, X. Pan, and G. Li, “Imbalanced and semi-supervised classification for prognosis of ACLF,” Journal of Intelligent and Fuzzy Systems, vol. 28, no. 2, pp. 737–745, 2015.

[61] M. Wu and J. Ye, “A Small Sphere and Large Margin Approach for Novelty Detection Using Training Data with Outliers,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 11, pp. 2088–2092, 2009.

[62] Jayadeva, R. Khemchandani, and S. Chandra, “Twin Support Vector Machines for Pattern Classification,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 5, pp. 905–910, 2007.

[63] F. Li, C. Yu, N. Yang, F. Xia, G. Li, and F. Kaveh-Yazdy, “Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data,” The Scientific World Journal, vol. 2013, Article ID 875450, 2013.

[64] M. Lichman, “UCI Machine Learning Repository,” 2013.

[65] Y. LeCun and C. Cortes, “MNIST handwritten digit database,” 2010.

[66] A. E. Ghoul and H. Sahbi, “Semi-supervised learning using a graph-based phase field model for imbalanced data set classification,” in ICASSP, pp. 2942–2946, IEEE, 2014.

[67] M. Rochery, I. Jermyn, and J. Zerubia, “Phase Field Models and Higher-Order Active Contours,” in ICCV, pp. 970–976, IEEE Computer Society, 2005.

[68] A. Stanescu and D. Caragea, “Ensemble-based semi-supervised learning approaches for imbalanced splice site datasets,” in BIBM (H. J. Zheng, W. Dubitzky, X. Hu, J.-K. Hao, D. P. Berrar, K.-H. Cho, Y. Wang, and D. R. Gilbert, eds.), pp. 432–437, IEEE Computer Society, 2014.

[69] J. Tanha, M. van Someren, and H. Afsarmanesh, “Semi-supervised self-training for decision tree classifiers,” International Journal of Machine Learning and Cybernetics, 2015.

[70] J. Xie and T. Xiong, “Stochastic Semi-supervised Learning,” in Active Learning and Experimental Design @ AISTATS (I. Guyon, G. C. Cawley, G. Dror, V. Lemaire, and A. R. Statnikov, eds.), vol. 16 of JMLR Proceedings, pp. 85–98, JMLR.org, 2011.

[71] B. A. Almogahed and I. A. Kakadiaris, “Empowering Imbalanced Data in Supervised Learning: A Semi-supervised Learning Approach,” in ICANN (S. Wermter, C. Weber, W. Duch, T. Honkela, P. D. Koprinkova-Hristova, S. Magg, G. Palm, and A. E. P. Villa, eds.), vol. 8681 of Lecture Notes in Computer Science, pp. 523–530, Springer, 2014.

[72] S. Li, Z. Wang, G. Zhou, and S. Y. M. Lee, “Semi-Supervised Learning for Imbalanced Sentiment Classification,” in IJCAI (T. Walsh, ed.), pp. 1826–1831, IJCAI/AAAI, 2011.

[73] A. Estabrooks, T. Jo, and N. Japkowicz, “A Multiple Resampling Method for Learning from Imbalanced Data Sets,” Computational Intelligence, vol. 20, no. 1, pp. 18–36, 2004.

[74] F. J. Provost and G. M. Weiss, “Learning When Training Data are Costly: The Effect of Class Distribution on Tree Induction,” Journal of Artificial Intelligence Research, vol. 19, pp. 315–354, 2003.

[75] X. Zhu, “Semi-Supervised Learning Literature Survey,” Tech. Rep. 1530, Computer Sciences, University of Wisconsin-Madison, 2005.
