Russian Federation
The purpose of this work is to study popular machine learning methods used to protect information systems and their users from phishing. The article discusses the techniques attackers currently use to carry out social engineering attacks, the measures that protect corporate users, and a classification of methods for detecting illegitimate Internet resources with machine learning. As examples of existing machine learning algorithms that can identify dangerous resources, the article presents Bayes' theorem, the classifier principle, the k-nearest neighbor algorithm, and logistic regression, together with statistics on how often popular indicators of phishing and malicious resources are detected. The article concludes that infrastructure protection requires an integrated approach that takes multi-vector analysis into account.
classification, phishing, information security, machine learning
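As an illustration of how Bayes' theorem can be turned into a classifier for suspicious URLs, the following is a minimal sketch and not the method used in the article. The three binary features (a Punycode label in the host name, a raw IPv4 host, an "@" in the URL), the function names, and the toy training URLs are all assumptions made for this example; the classifier is a Bernoulli naive Bayes with Laplace smoothing written in plain Python.

```python
import math
import re
from urllib.parse import urlparse


def url_features(url):
    """Return three binary indicators (an illustrative, assumed feature set)."""
    host = urlparse(url).hostname or ""
    return (
        int(any(label.startswith("xn--") for label in host.split("."))),  # Punycode label
        int(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None),    # raw IPv4 host
        int("@" in url),                                                  # "@" in the URL
    )


def train(samples):
    """Fit a Bernoulli naive Bayes model.

    samples: list of (features, label), label 1 = phishing, 0 = legitimate.
    Both classes must be present in the training data.
    """
    n_feat = len(samples[0][0])
    counts = {0: 0, 1: 0}
    ones = {0: [0] * n_feat, 1: [0] * n_feat}
    for feats, label in samples:
        counts[label] += 1
        for i, value in enumerate(feats):
            ones[label][i] += value
    priors = {c: counts[c] / len(samples) for c in counts}
    # Laplace smoothing keeps every likelihood strictly between 0 and 1.
    likelihoods = {
        c: [(ones[c][i] + 1) / (counts[c] + 2) for i in range(n_feat)]
        for c in counts
    }
    return priors, likelihoods


def phishing_probability(model, feats):
    """Posterior P(phishing | features) by Bayes' theorem, computed in log space."""
    priors, likelihoods = model
    log_score = {}
    for c in priors:
        score = math.log(priors[c])
        for p, value in zip(likelihoods[c], feats):
            score += math.log(p if value else 1.0 - p)
        log_score[c] = score
    top = max(log_score.values())
    weights = {c: math.exp(s - top) for c, s in log_score.items()}
    return weights[1] / (weights[0] + weights[1])


# Toy, hand-made training set (reserved example domains and documentation IPs).
TRAINING = [
    ("http://xn--e1awd7f.example/login", 1),
    ("http://192.0.2.10/secure", 1),
    ("http://bank.example@198.51.100.7/", 1),
    ("https://example.com/", 0),
    ("https://www.example.org/about", 0),
    ("https://docs.example.net/help", 0),
]
model = train([(url_features(url), label) for url, label in TRAINING])
```

Calling `phishing_probability(model, url_features(some_url))` returns a value between 0 and 1; on this toy data a URL with a raw IP host scores above 0.5 and a plain HTTPS URL scores below it. A real detector would need far more features and labeled data than this sketch assumes.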