Journal of Applied Sciences ›› 2020, Vol. 38 ›› Issue (5): 659-671.doi: 10.3969/j.issn.0255-8297.2020.05.001

• Novel Technologies for Intelligent Computing • Previous Articles    

An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels

QI Qing1, SHEN Zhengfei1, CAO Jian1, YING Jun2, ZHAO Long2   

  1. 1. Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;
    2. Shanghai Harbor e-Logistics Software Co., Ltd., Shanghai 200080, China
  • Received:2020-06-19 Published:2020-10-14

Abstract: Case-based decision-making is a method to directly classify or predict current cases based on past historical cases. The K-nearest neighbor method is a widely used casebased decision-making model. In the K-nearest neighbor method, historical cases need to be labeled. But in practical applications, the labels themselves have uncertainties. This article discusses the problem of label uncertainty which has been ignored in existing casebased decision-making methods in detail, and setups a label uncertainty model based on Dempster-Shafer evidence theory for improving prediction performance. In addition, in order to improve the operation efficiency, a new boundary tree algorithm by combining the traditional boundary tree algorithm and the label uncertainty is proposed. This paper introduces the function and principle of the boundary tree algorithm, and optimizes the node transfer strategy and decision process of the new boundary tree algorithm. Experimental demonstration shows that the proposed method not only takes the label uncertainty into consideration, but also improves the decision efficiency of the traditional K-nearest neighbor model.

Key words: K-nearest neighbor algorithm, uncertainties of labels, boundary tree algorithm, optimization of decision speed

CLC Number: