应用科学学报 ›› 2015, Vol. 33 ›› Issue (5): 550-558.doi: 10.3969/j.issn.0255-8297.2015.05.009

• 信号与信息处理 • 上一篇    下一篇

一种基于CUDA的局部敏感哈希算法

张一凡1,2, 余小清1,2, 安炫东1,2, 万旺根1,2   

  1. 1. 上海大学通信与工程学院, 上海 200444;
    2. 上海大学智慧城市研究院, 上海 200444
  • 收稿日期:2015-02-01 修回日期:2015-03-29 出版日期:2015-09-30 发布日期:2015-09-30
  • 通信作者: 余小清,副教授,研究方向:语音信号处理、数据挖掘、GPU加速、嵌入式系统、点云数据处理,E-mail:yxq@staff.shu.edu.cn E-mail:yxq@staff.shu.edu.cn
  • 基金资助:

    国家"863"高技术研究发展计划基金(No.2013AA01A603);上海市教育委员会科研创新项目基金(No.14YZ011)资助

A Locality Sensitive Hashing Algorithm Based on CUDA

ZHANG Yi-fan1,2, YU Xiao-qing1,2, AN Xuan-dong1,2, WAN Wang-gen1,2   

  1. 1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China;
    2. Institute of Smart City, Shanghai University, Shanghai 200444, China
  • Received:2015-02-01 Revised:2015-03-29 Online:2015-09-30 Published:2015-09-30

摘要: 传统的局部敏感哈希算法建立哈希表时往往需要较大的内存空间以及较长的建立时间. 在查询阶段,查询样本K个最近邻数据项的所需时间超过整个运行时间的95%. 针对这些问题,运用计算设备架构将局部敏感哈希算法移植至图形处理器,并用多线程并行计算数据项的哈希值来建立哈希表. 查询阶段在全局内存中引入基于工作队列的多样本查询,以提高算法的运行效率. 实验结果表明,所提出的算法与传统的局部敏感哈希算法相比,能在不降低运算精度的情况下将运算速度提高近12倍.

关键词: 计算设备架构, 图形处理器, 局部敏感哈希, K最近邻

Abstract: The traditional locality sensitive hashing (LSH) algorithm often takes a large memory space and long settling time to build a hash table. In the query phase, it takes more than 95% of the overall running time to search K nearest neighbor data items of samples. To solve these problems, this paper uses the compute unified device architecture (CUDA) to transplant LSH algorithm into a GPU, using parallel multi-threads to calculate the values of data items to build the hash table. In the query phase, we introduce multisample query based on work queue to global memory to improve efficiency of the algorithm. Experimental results show that computing speed of the proposed LSH algorithm is 12 times faster than the traditional LSH algorithm.

Key words: compute unified device architecture (CUDA), graphics processing unit (GPU), locality sensitive hashing, K-nearest neighbor

中图分类号: