SURVEY OF EXISTING METHODS FOR DETECTING SOURCE CODE DUPLICATES
Abstract and keywords
Abstract (English):
This work addresses the problem of identifying duplicates in software source code. It gives a short survey of existing search methods: textual, lexical, syntactic, metric, and semantic. The methods are then compared according to the following criteria: accuracy, completeness, speed, resource efficiency, and scope of implementation; the comparison results are presented in tabular form. Promising approaches to duplicate search are also considered, namely machine learning, graph analysis, syntax tree analysis, dynamic characteristics analysis, spatial characteristics analysis, and abstract syntax analysis. Directions for further research are indicated.

Keywords:
information security, duplicate search, source code, information protection, copyright
