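Software reuse can effectively reduce the time, labor, and cost of developing new software products. In software reuse, given only a basic description and simple requirements of the software to be developed, how to measure the reusability of existing software and how to assess a large number of existing software projects quickly and automatically has become the first problem to solve. Many existing studies evaluate the similarity of software products or software projects, but similarity is not the same as reusability. Therefore, drawing on a survey of research on software product reusability, this paper defines a set of reusability evaluation indexes suited to software projects in open source software repositories, designs an algorithm that quickly queries reusable software projects from the basic requirements of the software to be developed, and implements a retrieval system for reusable software projects. Experiments and expert evaluation of the retrieval results verify the efficiency and usability of the proposed description method.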
Software reuse, that is, reusing existing software components and modules, can effectively reduce the time, labor, and cost of new software product development. In software reuse, how to measure and evaluate the reusability of existing software is the first problem to be solved. Although many studies assess the similarity of software products or projects, similarity is not equivalent to reusability. Therefore, this paper defines a set of assessment indexes applicable to the reusability of software projects in open source software repositories, designs an algorithm to quickly query reusable software projects based on the basic requirements of the software to be developed, and implements a retrieval system for reusable software projects. Experimental results and expert evaluation of the retrieval results verify the efficiency and usability of the method.