Software similarity measurements using UML diagrams: A systematic literature review
DOI:
https://doi.org/10.26594/register.v8i1.2248
Keywords:
software similarity, similarity measurement, UML diagram similarity, semantic similarity, structural similarity
Abstract
Every piece of software uses a model to derive its operational, auxiliary, and functional procedures. The Unified Modeling Language (UML) is a standard modeling language for specifying, documenting, and constructing a software product. Researchers have used several algorithms to measure similarities between UML artifacts. However, no literature study has yet examined measurements of UML diagram similarity. This paper presents the results of a systematic literature review of similarity measurements between the UML diagrams of different software products. The study identifies and reviews similarity measurements of UML artifacts; class, sequence, statechart, and use case diagrams are the UML diagrams widely used as research objects for measuring similarity. Measuring similarity helps address the problem domains of software reuse, similarity measurement, and clone detection. The instruments used to measure similarity are semantic and structural similarity. The findings indicate opportunities for future research: measuring similarity for other UML diagrams, compiling calculation information for each diagram, adapting semantic and structural similarity calculation methods, determining the best weight for each item in the diagram, testing newly proposed methods, and building or finding good datasets for use as testing material.
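To make the two instruments named in the abstract concrete, the following is a minimal sketch of how a semantic score and a structural score might be combined with a weight. It is not a method taken from any of the reviewed studies. The `ClassDiagram` structure, the function names, and the default weight of 0.5 are all assumptions made for illustration, and the "semantic" part is only a lexical stand-in (overlap of lower-cased class names) rather than a lexical-database measure.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassDiagram:
    """Toy stand-in for a UML class diagram (hypothetical structure)."""
    classes: frozenset      # class names
    relations: frozenset    # (source, relationship kind, target) triples


def label_similarity(d1: ClassDiagram, d2: ClassDiagram) -> float:
    """Lexical stand-in for semantic similarity: Jaccard overlap of class names."""
    a = {name.lower() for name in d1.classes}
    b = {name.lower() for name in d2.classes}
    if not a and not b:
        return 1.0  # two empty diagrams are treated as identical
    return len(a & b) / len(a | b)


def structural_similarity(d1: ClassDiagram, d2: ClassDiagram) -> float:
    """Multiset Jaccard over relationship kinds, ignoring class names."""
    k1 = Counter(kind for _, kind, _ in d1.relations)
    k2 = Counter(kind for _, kind, _ in d2.relations)
    kinds = set(k1) | set(k2)
    if not kinds:
        return 1.0  # no relationships on either side
    shared = sum(min(k1[k], k2[k]) for k in kinds)
    total = sum(max(k1[k], k2[k]) for k in kinds)
    return shared / total


def diagram_similarity(d1: ClassDiagram, d2: ClassDiagram,
                       w_semantic: float = 0.5) -> float:
    """Weighted sum of both scores; w_semantic = 0.5 is an arbitrary assumption."""
    if not 0.0 <= w_semantic <= 1.0:
        raise ValueError("w_semantic must lie in [0, 1]")
    return (w_semantic * label_similarity(d1, d2)
            + (1.0 - w_semantic) * structural_similarity(d1, d2))


# Two small, made-up diagrams used only to exercise the functions.
shop_a = ClassDiagram(
    classes=frozenset({"Order", "Customer", "Item"}),
    relations=frozenset({("Order", "association", "Customer"),
                         ("Order", "composition", "Item")}),
)
shop_b = ClassDiagram(
    classes=frozenset({"order", "customer", "Invoice"}),
    relations=frozenset({("order", "association", "customer"),
                         ("Invoice", "association", "order")}),
)

score = diagram_similarity(shop_a, shop_b)
```

Choosing `w_semantic` is exactly the open question the abstract lists under "determining the best weight for each item in the diagram"; the sketch leaves it as a parameter rather than suggesting a value.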