Signal and Information Processing

A Locality Sensitive Hashing Algorithm Based on CUDA

Expand
  • 1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China;
    2. Institute of Smart City, Shanghai University, Shanghai 200444, China

Received date: 2015-02-01

  Revised date: 2015-03-29

  Online published: 2015-09-30

Abstract

The traditional locality sensitive hashing (LSH) algorithm often takes a large memory space and long settling time to build a hash table. In the query phase, it takes more than 95% of the overall running time to search K nearest neighbor data items of samples. To solve these problems, this paper uses the compute unified device architecture (CUDA) to transplant LSH algorithm into a GPU, using parallel multi-threads to calculate the values of data items to build the hash table. In the query phase, we introduce multisample query based on work queue to global memory to improve efficiency of the algorithm. Experimental results show that computing speed of the proposed LSH algorithm is 12 times faster than the traditional LSH algorithm.

Cite this article

ZHANG Yi-fan, YU Xiao-qing, AN Xuan-dong, WAN Wang-gen . A Locality Sensitive Hashing Algorithm Based on CUDA[J]. Journal of Applied Sciences, 2015 , 33(5) : 550 -558 . DOI: 10.3969/j.issn.0255-8297.2015.05.009

References

[1] 凌康. 基于局部敏感哈希的相似性搜索技术研究[D]. 南京:南京大学,2012.

[2] Donkin D. Multidimensional search problem [J]. Siam Journal on Computing, 1976, 5(2): 181-186.

[3] Yeh T, Lee J, Darrell T. Adaptive vocabulary forests br dynamic indexing and category learning [C]//Proceedings of the 11th IEEE International Conference on Computer Vision Rio de Janeiro, 2007. Washington, DC, USA: IEEE Computer Society, 2007: 1-8.

[4] Indyk P, Motwani R. Approximate nearest neighbors: towards removing the curse of dimensionality[C]//Proceedings of 30th ACM Symposium on Theory of Computing. New York, USA, 1998: 604-613.

[5] Gionis A, Indyk P. Similarity search in high dimensions via hashing [C]//Proceedings of the 25th International Conference on Very Large Data Bases, Edinburgh, UK, 1999: 518-529.

[6] Wu Y, Shou L D, Chen G. CB-LSH: an efficient LSH indexing algorithm based on compressed bitmap [J]. Journal of Zhejiang University, 2012, 3(46): 376-385.

[7] Yin S Y, Badr M, Vodislav D. Dynamic multi-probe LSH: an I/O efficient index structure for approximate nearest neighbor search [J]. Lecture Notes in Computer Science, 2013, 1(8055): 48-62.

[8] Gu X G, Zhang Y D, Zhang L. An improved method of locality sensitive hashing for indexing large-scale and high-dimensional features [J]. Signal Processing, 2013, 93(8): 2244-2255.

[9] Jiang K, Que Q C, Kulis B. Revisiting kernelized locality-sensitive hashing for improved largescale image retrieval [C]//IEEE Conference on Computer Vision and Pattern Recognition, 2014.

[10] 雷德川. 基于CUDA的GPU加速迭代重建算法研究[D]. 中国工程物理研究院,2012.

[11] O'donnell R, Wu Y, Zhou Y. Optimal lower bounds for locality-sensitive hashing [C]// Proceedings of Innovations in Computer Science, 2011: 275-283.

[12] Andoni A, Indyk P, Nguyen H L. Beyond locality-sensitive hashing [C]//Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, 2014: 1018-1028.

[13] 李淼,孙荣坤. 基于p稳定分布局部敏感哈希地址的鲁棒音频检索方法[J]. 信号处理,2012, 28(3): 367-375. Li M, Sun R K. Robust audio retrieval method using local-sensitive hashing address based on p-stable distribution [J]. Signal Processing, 2012, 28(3): 367-375. (in Chinese)

[14] Pagh R, Rodler F F. Cuckoo hashing [J]. Journal of Algorithms, 2004, 51(18): 122-144.

[15] Alcantara D A, Sharf A. Real-time parallel hashing on the GPU [J]. ACM Transactions on Graphics, 2009, 28(154): 1-9.
Outlines

/