一种基于CUDA的局部敏感哈希算法

doi:10.3969/j.issn.0255-8297.2015.05.009

应用科学学报 ›› 2015, Vol. 33 ›› Issue (5): 550-558.doi: 10.3969/j.issn.0255-8297.2015.05.009

一种基于CUDA的局部敏感哈希算法

张一凡^1,2, 余小清^1,2, 安炫东^1,2, 万旺根^1,2

1. 上海大学通信与工程学院, 上海 200444;
2. 上海大学智慧城市研究院, 上海 200444

收稿日期:2015-02-01 修回日期:2015-03-29 出版日期:2015-09-30 发布日期:2015-09-30
通信作者: 余小清,副教授,研究方向:语音信号处理、数据挖掘、GPU加速、嵌入式系统、点云数据处理,E-mail:yxq@staff.shu.edu.cn E-mail:yxq@staff.shu.edu.cn
基金资助:
国家"863"高技术研究发展计划基金(No.2013AA01A603);上海市教育委员会科研创新项目基金(No.14YZ011)资助

A Locality Sensitive Hashing Algorithm Based on CUDA

ZHANG Yi-fan^1,2, YU Xiao-qing^1,2, AN Xuan-dong^1,2, WAN Wang-gen^1,2

1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China;
2. Institute of Smart City, Shanghai University, Shanghai 200444, China

Received:2015-02-01 Revised:2015-03-29 Online:2015-09-30 Published:2015-09-30

摘要/Abstract

摘要： 传统的局部敏感哈希算法建立哈希表时往往需要较大的内存空间以及较长的建立时间. 在查询阶段,查询样本K个最近邻数据项的所需时间超过整个运行时间的95%. 针对这些问题,运用计算设备架构将局部敏感哈希算法移植至图形处理器,并用多线程并行计算数据项的哈希值来建立哈希表. 查询阶段在全局内存中引入基于工作队列的多样本查询,以提高算法的运行效率. 实验结果表明,所提出的算法与传统的局部敏感哈希算法相比,能在不降低运算精度的情况下将运算速度提高近12倍.

关键词: 计算设备架构, 图形处理器, 局部敏感哈希, K最近邻

Abstract: The traditional locality sensitive hashing (LSH) algorithm often takes a large memory space and long settling time to build a hash table. In the query phase, it takes more than 95% of the overall running time to search K nearest neighbor data items of samples. To solve these problems, this paper uses the compute unified device architecture (CUDA) to transplant LSH algorithm into a GPU, using parallel multi-threads to calculate the values of data items to build the hash table. In the query phase, we introduce multisample query based on work queue to global memory to improve efficiency of the algorithm. Experimental results show that computing speed of the proposed LSH algorithm is 12 times faster than the traditional LSH algorithm.

Key words: compute unified device architecture (CUDA), graphics processing unit (GPU), locality sensitive hashing, K-nearest neighbor

中图分类号:

TP332

张一凡, 余小清, 安炫东, 万旺根. 一种基于CUDA的局部敏感哈希算法[J]. 应用科学学报, 2015, 33(5): 550-558.

ZHANG Yi-fan, YU Xiao-qing, AN Xuan-dong, WAN Wang-gen. A Locality Sensitive Hashing Algorithm Based on CUDA[J]. Journal of Applied Sciences, 2015, 33(5): 550-558.

参考文献

[1] 凌康. 基于局部敏感哈希的相似性搜索技术研究[D]. 南京:南京大学,2012.

[2] Donkin D. Multidimensional search problem [J]. Siam Journal on Computing, 1976, 5(2): 181-186.

[3] Yeh T, Lee J, Darrell T. Adaptive vocabulary forests br dynamic indexing and category learning [C]//Proceedings of the 11th IEEE International Conference on Computer Vision Rio de Janeiro, 2007. Washington, DC, USA: IEEE Computer Society, 2007: 1-8.

[4] Indyk P, Motwani R. Approximate nearest neighbors: towards removing the curse of dimensionality[C]//Proceedings of 30th ACM Symposium on Theory of Computing. New York, USA, 1998: 604-613.

[5] Gionis A, Indyk P. Similarity search in high dimensions via hashing [C]//Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, UK, 1999: 518-529.

[6] Wu Y, Shou L D, Chen G. CB-LSH: an efficient LSH indexing algorithm based on compressed bitmap [J]. Journal of Zhejiang University, 2012, 3(46): 376-385.

[7] Yin S Y, Badr M, Vodislav D. Dynamic multi-probe LSH: an I/O efficient index structure for approximate nearest neighbor search [J]. Lecture Notes in Computer Science, 2013, 1(8055): 48-62.

[8] Gu X G, Zhang Y D, Zhang L. An improved method of locality sensitive hashing for indexing large-scale and high-dimensional features [J]. Signal Processing, 2013, 93(8): 2244-2255.

[9] Jiang K, Que Q C, Kulis B. Revisiting kernelized locality-sensitive hashing for improved largescale image retrieval [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2014.

[10] 雷德川. 基于CUDA的GPU加速迭代重建算法研究[D]. 中国工程物理研究院,2012.

[11] O'donnell R, Wu Y, Zhou Y. Optimal lower bounds for locality-sensitive hashing [C]// Proceedings of Innovations in Computer Science, 2011: 275-283.

[12] Andoni A, Indyk P, Nguyen H L. Beyond locality-sensitive hashing [C]//Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, 2014: 1018-1028.

[13] 李淼,孙荣坤. 基于p稳定分布局部敏感哈希地址的鲁棒音频检索方法[J]. 信号处理,2012, 28(3): 367-375. Li M, Sun R K. Robust audio retrieval method using local-sensitive hashing address based on p-stable distribution [J]. Signal Processing, 2012, 28(3): 367-375. (in Chinese)

[14] Pagh R, Rodler F F. Cuckoo hashing [J]. Journal of Algorithms, 2004, 51(18): 122-144.

[15] Alcantara D A, Sharf A. Real-time parallel hashing on the GPU [J]. ACM Transactions on Graphics, 2009, 28(154): 1-9.

一种基于CUDA的局部敏感哈希算法

A Locality Sensitive Hashing Algorithm Based on CUDA

PDF

可视化

被引次数

摘要/Abstract

引用本文

使用本文

参考文献

相关文章 0

编辑推荐

Metrics

本文评价