USING MACHINE LEARNING ALGORITHMS TO RECOGNIZE PHISHING RESOURCES
Abstract and keywords
Abstract (English):
The purpose of this work is to study popular machine learning methods used to protect information systems and their users from phishing. The article discusses the techniques attackers currently use to carry out social engineering attacks, security measures for protecting corporate users, and the classification of methods for detecting illegitimate Internet resources using machine learning. As examples of existing machine learning algorithms that can identify dangerous resources, the article presents Bayes' theorem, the classifier principle, the k-nearest neighbor algorithm, and logistic regression, along with statistics on how often popular signs of phishing and malicious resources are detected. The article concludes that infrastructure protection calls for an integrated approach that takes multi-vector analysis into account.

Keywords:
classification, phishing, information security, machine learning