An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels

doi:10.3969/j.issn.0255-8297.2020.05.001

Abstract

Abstract: Case-based decision-making classifies or predicts a current case directly from historical cases. The K-nearest neighbor method is a widely used case-based decision-making model, and it requires the historical cases to be labeled. In practical applications, however, the labels themselves are uncertain. This article examines in detail the label uncertainty problem, which existing case-based decision-making methods have ignored, and sets up a label uncertainty model based on Dempster-Shafer evidence theory to improve prediction performance. In addition, to improve operating efficiency, a new boundary tree algorithm is proposed that combines the traditional boundary tree algorithm with label uncertainty. The paper introduces the function and principle of the boundary tree algorithm and optimizes the node transfer strategy and decision process of the new algorithm. Experiments show that the proposed method not only takes label uncertainty into account but also improves the decision efficiency of the traditional K-nearest neighbor model.

Key words: K-nearest neighbor algorithm, uncertainties of labels, boundary tree algorithm, optimization of decision speed
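
For orientation only, the kind of evidence-theoretic K-nearest neighbor decision described in the abstract can be sketched as follows. This is not the authors' model. It is a minimal illustration that assumes each historical case carries a mass function over single class labels, with any remaining mass assigned to the whole frame of discernment (total ignorance). The function names `discount`, `combine`, and `predict`, and the parameters `alpha` and `gamma`, are hypothetical. Neighbors are found by brute-force search, so the boundary tree speed-up is not shown.

```python
import math


def discount(mass, weight):
    """Scale each singleton mass by `weight`; the remainder implicitly goes to the frame."""
    return {label: weight * m for label, m in mass.items()}


def combine(m1, m2):
    """Dempster's rule for mass functions whose focal sets are singletons and the frame."""
    t1 = max(0.0, 1.0 - sum(m1.values()))  # mass on the whole frame (ignorance)
    t2 = max(0.0, 1.0 - sum(m2.values()))
    labels = set(m1) | set(m2)
    raw = {
        c: m1.get(c, 0.0) * m2.get(c, 0.0) + m1.get(c, 0.0) * t2 + t1 * m2.get(c, 0.0)
        for c in labels
    }
    norm = sum(raw.values()) + t1 * t2  # equals 1 minus the conflicting mass
    if norm <= 0.0:
        raise ValueError("sources are in total conflict")
    return {c: v / norm for c, v in raw.items()}


def predict(query, cases, k=3, alpha=0.95, gamma=1.0):
    """Return (label, combined masses) from the k nearest uncertain-labeled cases.

    `cases` is a list of (features, label_mass) pairs; alpha and gamma are
    assumed tuning parameters that control how fast evidence fades with distance.
    """
    nearest = sorted(cases, key=lambda case: math.dist(query, case[0]))[:k]
    combined = {}  # empty dict means all mass on the frame (vacuous belief)
    for features, label_mass in nearest:
        weight = alpha * math.exp(-gamma * math.dist(query, features) ** 2)
        combined = combine(combined, discount(label_mass, weight))
    return max(combined, key=combined.get), combined


cases = [
    ((0.0, 0.0), {"a": 0.9, "b": 0.1}),  # label believed to be "a", some doubt
    ((0.0, 1.0), {"a": 0.6}),            # 0.4 of the mass left as ignorance
    ((5.0, 5.0), {"b": 1.0}),
    ((5.0, 6.0), {"b": 0.8, "a": 0.2}),
]
label, masses = predict((0.2, 0.1), cases, k=3)
```

In this sketch a distant neighbor contributes almost nothing because its discounted mass moves to the frame, so uncertain or far-away labels weaken the decision instead of voting with full weight.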

CLC Number: P751.1

QI Qing, SHEN Zhengfei, CAO Jian, YING Jun, ZHAO Long. An Efficient K-Nearest Neighbor Decision Algorithm for Samples with Uncertain Labels[J]. Journal of Applied Sciences, 2020, 38(5): 659-671.
