Computer Science and Applications

Data Classification Method of Fuzzy Weighted k-Nearest Neighbor Based on Affinity

  • 1. Yunnan Key Lab of Computer Technology Application, Kunming 650500, China;
    2. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China

Received date: 2017-08-23

Revised date: 2017-10-04

Online published: 2018-07-31

Funding

Supported by the National Natural Science Foundation of China (No.61462049, No.61063006, No.60703116) and the Key Project of the Applied Basic Research Program of Yunnan Province (No.2017FA033)

Cite this article

LIU Chengcheng, JIANG Ying. Data Classification Method of Fuzzy Weighted k-Nearest Neighbor Based on Affinity[J]. Journal of Applied Sciences (应用科学学报), 2018, 36(4): 679-688. DOI: 10.3969/j.issn.0255-8297.2018.04.011

Abstract

The fuzzy k-nearest neighbor (FkNN) method and its improved variants ignore the uneven distribution of samples and the presence of noise samples, and they cannot fully reflect how the features of each class's samples differ, which lowers classification accuracy. To address these limitations, this paper proposes a fuzzy weighted kNN data classification method based on affinity. First, the membership of each sample is calculated from the affinity among samples. Then, feature weights are computed separately for each class from the fuzzy entropy of the features, and the nearest training samples are selected using a weighted Euclidean distance. Finally, the sample to be classified is assigned a class according to its membership in each class. Experimental results on several UCI datasets show that the proposed method is effective.
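
The abstract does not give the formulas for the affinity-based membership or the fuzzy-entropy feature weights, so the following is only a minimal sketch of the last two steps (weighted Euclidean neighbor selection and membership-based class assignment), not the authors' implementation. It assumes the memberships and per-class feature weights have already been computed and are passed in. The function names, the choice to measure distance to a training sample with the weights of that sample's class, and the inverse-distance vote with fuzzifier `m` (in the style of classic fuzzy kNN) are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def weighted_euclidean(x, y, weights):
    """Weighted Euclidean distance: sqrt(sum_f w_f * (x_f - y_f)^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


def fuzzy_weighted_knn(query, samples, labels, memberships, class_weights,
                       k=3, m=2.0, eps=1e-12):
    """Return (predicted_label, per-class membership of the query).

    memberships[j]   : assumed precomputed membership of sample j in its own class
    class_weights[c] : assumed precomputed feature weights for class c
    m                : fuzzifier, must be greater than 1
    """
    # Assumption: the distance to a training sample uses the feature
    # weights of that sample's class.
    scored = []
    for x, c, mu in zip(samples, labels, memberships):
        d = weighted_euclidean(query, x, class_weights[c])
        scored.append((d, c, mu))
    neighbours = sorted(scored, key=lambda t: t[0])[:k]

    # Inverse-distance vote weights; eps avoids division by zero when d == 0.
    exponent = 2.0 / (m - 1.0)
    votes = defaultdict(float)
    total = 0.0
    for d, c, mu in neighbours:
        w = 1.0 / (d ** exponent + eps)
        votes[c] += mu * w
        total += w

    query_membership = {c: v / total for c, v in votes.items()}
    predicted = max(query_membership, key=query_membership.get)
    return predicted, query_membership


# Toy data (hypothetical values, not taken from the paper's experiments).
samples = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]
memberships = [1.0, 0.9, 1.0, 0.8]
class_weights = {"a": [0.5, 0.5], "b": [0.5, 0.5]}

label, u = fuzzy_weighted_knn((0.2, 0.1), samples, labels,
                              memberships, class_weights, k=3)
```

In this toy run the query lies close to the two class "a" samples, so `label` is `"a"`. A training sample with a low membership (for example, a suspected noise point) contributes proportionally less to its class's score, which is the effect the membership step is meant to achieve.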
