A Cost-Sensitive K-Nearest Neighbors (K-NN) Classification in Handling Multi-class Imbalance Problem

Authors

  • Adli A Nababan, Universitas Sumatera Utara, Indonesia
  • Erna B Nababan, Universitas Sumatera Utara, Indonesia
  • Muhammad Zarlis, Universitas Sumatera Utara, Indonesia
  • Sutarman Wage, Universitas Sumatera Utara, Indonesia

Keywords:

Cost-sensitive, K-nearest-neighbors, Multiclass, Classification, Gain ratio, PCA, Metacost

Abstract

Class imbalance in multiclass data strongly affects classification in machine learning and remains an active research topic. Several studies have shown that the minority class in a dataset is often treated as unimportant or less influential than the majority class. The imbalance becomes a problem when minority-class instances are misclassified, which lowers accuracy and degrades classifier performance. In this research, feature selection using Gain Ratio (GR) and Principal Component Analysis (PCA) was used to determine the most relevant features in the dataset. The classifier was built by remodeling K-Nearest Neighbors (K-NN) for minimal cost using the MetaCost method. The proposed method was then validated by calculating its accuracy and total cost on the multiclass imbalance problem. The test results show that feature selection improves the performance of the K-NN algorithm, especially feature selection with the Gain Ratio method, and that the MetaCost method also improves performance. The combination of MetaCost and GR performs better: across the two datasets, Accuracy increased by 0.0295, Precision by 0.1132, Recall by 0.0897, and F1-Score by 0.1021.
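
The abstract does not describe the implementation, so the following is only an illustrative sketch of the general MetaCost idea applied to K-NN, not the authors' code. It assumes the usual MetaCost procedure: estimate class probabilities for each training example by bagging the base classifier, relabel each example with the class that minimizes expected cost, then run plain K-NN on the relabeled data. The function names, the cost-matrix layout, and the values of k and n_bags are hypothetical choices made for this example, and feature selection (GR or PCA) is left out.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Plain k-NN: majority vote among the k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def min_expected_cost_class(probs, classes, cost):
    """Return the class with the lowest expected cost.

    Assumed layout: cost[a][j] is the cost of predicting classes[a]
    when the true class is classes[j]; probs[j] estimates P(classes[j] | x).
    """
    risks = [
        sum(p * cost[a][j] for j, p in enumerate(probs))
        for a in range(len(classes))
    ]
    return classes[risks.index(min(risks))]


def metacost_relabel(X, y, classes, cost, k=3, n_bags=10, seed=0):
    """MetaCost-style relabeling of the training set with a k-NN base learner."""
    rng = random.Random(seed)
    n = len(X)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_bags):
        # Bootstrap resample of the training set (sampling with replacement).
        idx = [rng.randrange(n) for _ in range(n)]
        bag_X = [X[i] for i in idx]
        bag_y = [y[i] for i in idx]
        for i in range(n):
            votes[i][knn_predict(bag_X, bag_y, X[i], k)] += 1
    relabeled = []
    for i in range(n):
        # Fraction of bags voting for each class approximates P(class | x).
        probs = [votes[i][c] / n_bags for c in classes]
        relabeled.append(min_expected_cost_class(probs, classes, cost))
    return relabeled
```

Under these assumptions, the cost-sensitive classifier is obtained by calling `knn_predict(X, metacost_relabel(X, y, classes, cost), x, k)` for a new point `x`. Raising the cost of misclassifying a minority class pushes borderline training examples toward that class.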

Published

2021-12-01

How to Cite

A Cost-Sensitive K-Nearest Neighbors (K-NN) Classification in Handling Multi-class Imbalance Problem. (2021). Internetworking Indonesia Journal, 13(2), 17-21. https://internetworkingindonesia.org/index.php/iij/article/view/60