Journal of Applied Sciences ›› 2018, Vol. 36 ›› Issue (4): 679-688. doi: 10.3969/j.issn.0255-8297.2018.04.011

• Computer Science and Applications •

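Data Classification Method of Fuzzy Weighted k-Nearest Neighbor Based on Affinity

LIU Cheng-cheng1,2, JIANG Ying1,2

  1. Yunnan Key Lab of Computer Technology Application, Kunming 650500, China;
  2. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
  • Received: 2017-08-23 Revised: 2017-10-04 Online: 2018-07-31 Published: 2018-07-31
  • Corresponding author: JIANG Ying, Professor; research interests: cloud computing, big data analysis, software quality assurance and testing; E-mail: jy_910@163.com
  • Funding:
    Supported by the National Natural Science Foundation of China (No.61462049, No.61063006, No.60703116) and the Key Project of the Yunnan Applied Basic Research Program (No.2017FA033)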

Abstract: The fuzzy k-nearest neighbor (FkNN) method and its improved variants ignore unevenly distributed samples and noise samples, and they do not fully reflect the differences among the sample features of each class, which lowers classification accuracy. To overcome these limitations, this paper proposes a fuzzy weighted k-nearest neighbor (kNN) data classification method based on affinity. First, the membership of each sample is computed from the affinity among samples. Then, feature weights are computed separately for each class from the fuzzy entropy of the features, and the nearest training samples are selected using a weighted Euclidean distance. Finally, the sample to be classified is assigned a class according to its membership in each class. Experimental results on several UCI datasets show that the proposed method is effective.

Key words: weighted kNN, fuzzy membership, data classification, affinity, fuzzy entropy
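
The pipeline outlined in the abstract (per-class feature weights from fuzzy entropy, neighbor selection by weighted Euclidean distance, class assignment by membership) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not define the affinity-based membership or the exact entropy-to-weight rule, so the sketch takes training memberships (`train_u`) as input, uses the De Luca-Termini fuzzy entropy, assumes that lower entropy means a larger weight, and borrows the inverse-distance vote of the standard FkNN method. All function and parameter names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def fuzzy_entropy(memberships: Sequence[float]) -> float:
    """Normalized De Luca-Termini fuzzy entropy of a non-empty sequence in [0, 1]."""
    def term(mu: float) -> float:
        if mu <= 0.0 or mu >= 1.0:
            return 0.0  # crisp values carry no fuzziness
        return -(mu * math.log(mu) + (1.0 - mu) * math.log(1.0 - mu))
    return sum(term(mu) for mu in memberships) / (len(memberships) * math.log(2.0))


def entropy_weights(entropies: Sequence[float]) -> List[float]:
    """Assumed rule (not from the paper): lower entropy gives a larger weight."""
    scores = [max(0.0, 1.0 - h) for h in entropies]
    total = sum(scores)
    if total <= 0.0:
        return [1.0 / len(entropies)] * len(entropies)  # fall back to uniform weights
    return [s / total for s in scores]


def weighted_distance(x: Sequence[float], y: Sequence[float], w: Sequence[float]) -> float:
    """Weighted Euclidean distance with one weight per feature."""
    return math.sqrt(sum(wj * (xj - yj) ** 2 for xj, yj, wj in zip(x, y, w)))


def classify(query: Sequence[float],
             train_x: Sequence[Sequence[float]],
             train_y: Sequence[str],
             train_u: Sequence[Dict[str, float]],
             class_weights: Dict[str, Sequence[float]],
             k: int = 3,
             m: float = 2.0) -> Tuple[str, Dict[str, float]]:
    """Return the predicted class and the query's membership in every class."""
    # Each training sample is measured with the feature weights of its own class.
    dists = [(weighted_distance(query, x, class_weights[label]), u)
             for x, label, u in zip(train_x, train_y, train_u)]
    nearest = sorted(dists, key=lambda pair: pair[0])[:k]
    # Inverse-distance vote of standard FkNN; the small constant avoids division by zero.
    inv = [1.0 / (d ** (2.0 / (m - 1.0)) + 1e-12) for d, _ in nearest]
    total = sum(inv)
    scores = {c: sum(u.get(c, 0.0) * w for (_, u), w in zip(nearest, inv)) / total
              for c in sorted(class_weights)}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
    ys = ["a", "a", "a", "b", "b", "b"]
    us = [{"a": 1.0, "b": 0.0}] * 3 + [{"a": 0.0, "b": 1.0}] * 3  # crisp placeholders
    weights = {"a": entropy_weights([0.2, 0.6]), "b": entropy_weights([0.5, 0.5])}
    print(classify([0.2, 0.1], xs, ys, us, weights, k=3))
```

In this sketch the memberships in `us` are crisp placeholders. A faithful reproduction would replace them with the affinity-based memberships defined in the full paper, and would derive the per-class entropies from the data rather than supplying them by hand.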

CLC number: