[1] GUO Zhimao, ZHOU Aoying. Research on data quality and data cleaning: a survey [J]. Journal of Software, 2002, 13(11): 2076-2082.
[2] FAN Wenfei. Extending dependencies with conditions for data cleaning [J]. 2008 8th IEEE International Conference on Computer and Information Technology, 2008: 185-190.
[3] NEELY M P. Data quality tools for data warehousing: a small sample survey [C]//Proceedings of MIT Conference on Information Quality, Center for Technology in Government, University at Albany/SUNY: Germany, 1998.
[4] QIU Yuefeng, TIAN Zongping, JI Wenyun. An efficient approach for detecting approximately duplicate database records [J]. Chinese Journal of Computers, 2001, 24(1): 69-77.
[5] EKTEFA M, SIDI F, IBRAHIM H, JABAR M A, MEMAR S, RAMLI A. A threshold-based similarity measure for duplicate detection [C]//2011 IEEE Conference on Open Systems (ICOS), 2011: 37-41.
[6] TREVOR C K. Automated detection of duplicate free-form English bug reports [J]. MS Computer Science Thesis, Department of Computer Science, West Virginia University, Morgantown, West Virginia, USA, 2009.
[7] RUNESON P, ALEXANDERSSON M, NYHOLM O. Detection of duplicate defect reports using natural language processing [C]//ICSE 2007: Proceedings of the 29th International Conference on Software Engineering, Washington, DC, USA, 2007: 499-510.
[8] NARAYANA V A, PREMCHAND P, GOVARDHAN A. A novel and efficient approach for near duplicate page detection in Web crawling [C]//2009 IEEE International Advance Computing Conference (IACC), 2009: 1492-1496.
[9] WANG Bin, LI Zhiwei, LI Mingjing, MA Weiying. Large-scale duplicate detection for Web image search [C]//2006 IEEE International Conference on Multimedia and Expo, 2006: 353-356.
[10] MASEK W, PATERSON M A. Fast algorithm computing string edit distance [J]. Journal of Computer and System Sciences, 1980, 20(1): 18-31.
[11] LIAO Hsienyu, MENG Laiyin, YI Cheng. A parallel implementation of the Smith-Waterman algorithm for massive sequences searching [C]//IEMBS '04: 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2004, 2: 2817-2820.
[12] WINKLER W E. The state of record linkage and current research problems [J]. Technical report, Statistics of Income Division, Internal Revenue Service Publication, 1999.
[13] LI Guohui, DU Xiaokun, HU Fangxiao, YANG Bing, TANG Xiaohong. Structure matching method based on functional dependencies [J]. Journal of Software, 2009, 20(10): 2667-2678.