This project contains the implementation of KNN Undersampling method in several languages, as proposed in the original paper.
Abstract
In supervised learning, the imbalanced number of instances among the classes in a dataset can make the algorithms to classify one instance from the minority class as one from the majority class. With the aim to solve this problem, the KNN algorithm provides a basis to other balancing methods. These balancing methods are revisited in this work, and a new and simple approach of KNN undersampling is proposed. The experiments demonstrated that the KNN undersampling method outperformed other sampling methods. The proposed method also outperformed the results of other studies, and indicates that the simplicity of KNN can be used as a base for efficient algorithms in machine learning and knowledge discovery.For more details about the KNN Undersampling, please visit: Beckmann, M., Ebecken, N.F.F.,de Lima, B.S.L.B.P. 2015, A KNN Undersampling Approach for Data Balancing.