In sample classification, the fuzzy k-nearest neighbor (FkNN) method and its improved variants ignore the uneven distribution of samples and the presence of noise samples. As a result, they cannot reflect the differences among the features of class samples, which leads to low classification accuracy. To overcome these limitations, this paper proposes a fuzzy weighted k-nearest neighbor data classification method based on affinity. First, the membership of each sample is calculated from the affinity among samples. Then, the feature weights of class samples are determined from fuzzy entropy values, and the k nearest neighbors are selected according to the weighted Euclidean distance. Finally, each sample is classified according to its fuzzy membership in each class. Experimental results on UCI datasets show that the proposed method is effective.
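The last two steps described above (neighbor selection by weighted Euclidean distance, then classification by fuzzy membership) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the affinity or fuzzy entropy formulas, so the training memberships `U_train` and the feature weights `feature_weights` are assumed to be precomputed inputs. The function name `fuzzy_weighted_knn`, the fuzzifier `m`, and the inverse-distance vote (taken from the classical FkNN rule of Keller et al.) are likewise assumptions made for illustration.

```python
import numpy as np


def fuzzy_weighted_knn(X_train, U_train, x, feature_weights, k=5, m=2.0):
    """Return the fuzzy membership of query x in each class.

    X_train: (n, d) training samples.
    U_train: (n, c) class memberships of the training samples
             (assumed precomputed, e.g. from affinity among samples).
    feature_weights: (d,) non-negative weights
             (assumed precomputed, e.g. from fuzzy entropy).
    m: fuzzifier of the classical FkNN vote, must be > 1.
    """
    X_train = np.asarray(X_train, dtype=float)
    U_train = np.asarray(U_train, dtype=float)
    w = np.asarray(feature_weights, dtype=float)

    # Weighted Euclidean distance from x to every training sample.
    diff = X_train - np.asarray(x, dtype=float)
    dist = np.sqrt((w * diff ** 2).sum(axis=1))

    # Indices of the k nearest neighbors under that distance.
    idx = np.argsort(dist)[:k]

    # Inverse-distance vote; eps guards against division by zero
    # when x coincides with a training sample.
    eps = 1e-12
    inv = 1.0 / (dist[idx] ** (2.0 / (m - 1.0)) + eps)

    # Membership of x in each class: vote-weighted mean of the
    # neighbors' memberships.
    return (U_train[idx] * inv[:, None]).sum(axis=0) / inv.sum()


# Hypothetical toy data: two well-separated classes with crisp memberships.
X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
U = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
u = fuzzy_weighted_knn(X, U, x=[0.2, 0.1], feature_weights=[1.0, 1.0], k=3)
predicted_class = int(np.argmax(u))
```

The predicted class is the one with the largest membership. Because each row of `U_train` here sums to 1, the returned memberships also sum to 1, so they can be read as graded confidence rather than a hard vote.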
LIU Cheng-cheng, JIANG Ying. Data Classification Method of Fuzzy Weighted k-Nearest Neighbor Based on Affinity[J]. Journal of Applied Sciences, 2018, 36(4): 679-688.
DOI: 10.3969/j.issn.0255-8297.2018.04.011