SURVEY OF EXISTING METHODS FOR DETECTING SOURCE CODE DUPLICATES
Abstract and keywords
Abstract (English):
This work addresses the problem of identifying duplicates in software source code. It begins with a short survey of existing search methods: textual, lexical, syntactic, metric, and semantic. The methods are then compared by the following criteria: accuracy, completeness, speed, resource efficiency, and scope of implementation; the comparison results are given in tabular form. Promising approaches to duplicate search are also considered: machine learning, graph analysis, syntax tree analysis, dynamic characteristics analysis, spatial characteristics analysis, and abstract syntax analysis. Directions for further research are indicated.
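
To make the difference between the first two method families concrete, here is a minimal sketch that is not taken from the paper. It assumes that a textual method compares normalized source lines and that a lexical method compares token streams in which identifiers and literals are replaced by placeholders. The function names `textual_fingerprint` and `lexical_fingerprint` and the two sample fragments are hypothetical, chosen only for illustration.

```python
import io
import keyword
import tokenize


def textual_fingerprint(source: str) -> list[str]:
    """Textual view (assumed): non-empty lines with outer whitespace stripped."""
    return [line.strip() for line in source.splitlines() if line.strip()]


def lexical_fingerprint(source: str) -> list[str]:
    """Lexical view (assumed): token stream with names and literals abstracted."""
    # Layout-only tokens and comments carry no lexical content for this sketch.
    skip = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
            tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
    result = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            result.append("ID")    # any identifier
        elif tok.type in (tokenize.NUMBER, tokenize.STRING):
            result.append("LIT")   # any numeric or string literal
        else:
            result.append(tok.string)  # keywords and operators kept verbatim
    return result


# Hypothetical fragments: the second is the first with every name changed.
original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_all(items):\n    acc = 0\n    for it in items:\n        acc += it\n    return acc\n"

print(textual_fingerprint(original) == textual_fingerprint(renamed))  # False
print(lexical_fingerprint(original) == lexical_fingerprint(renamed))  # True
```

Under these assumptions the renamed copy escapes the textual comparison but is caught by the lexical one. Syntactic, metric, and semantic methods would need richer representations than this sketch uses.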

Keywords:
information security, duplicate search, source code, information protection, copyright