Support Vector Machine Based Malware and Phishing Website Detection


Internet use exposes users to a range of security threats, including spam, phishing, and malware. A phishing attack steals sensitive information such as bank account numbers or email passwords. Most phishing attacks rely on a malicious URL that is presented to the user as if it were legitimate. Malware is widely used to disrupt computer operation, gain access to users' computer systems, or gather sensitive information, and it is now a serious threat on the Internet. Detecting malicious URLs is therefore an essential task in network security intelligence. In this paper we categorize phishing and malware URLs using a Support Vector Machine (SVM), a widely used kernel-based method for binary classification. SVM is theoretically well founded and has already been applied to many practical problems. Our method uses a variety of discriminative features, including textual properties, link structures, webpage contents, DNS information, and network traffic. The results show that the proposed method detects phishing and malware sites well, correctly labeling approximately 95% of them. Compared with other approaches, it achieves high true positive and true negative rates, sensitivity, precision, F-measure, and overall accuracy. We conclude that SVM is a robust and efficient method for classifying websites as normal or phishing.
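The evaluation measures named in the abstract (sensitivity, precision, F-measure, overall accuracy) all follow from the four confusion-matrix counts. The sketch below shows the standard definitions. The function name `classification_metrics` and the example counts are illustrative assumptions and are not taken from the paper's experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: phishing/malware sites correctly flagged
    tn: normal sites correctly passed
    fp: normal sites wrongly flagged
    fn: phishing/malware sites missed
    """
    total = tp + tn + fp + fn
    # Sensitivity (recall, true positive rate): share of malicious sites detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity (true negative rate): share of normal sites correctly passed.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Precision: share of flagged sites that are actually malicious.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F-measure: harmonic mean of precision and sensitivity.
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = classification_metrics(tp=95, tn=90, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and precision together matters for this task: a detector can reach high accuracy on a dataset dominated by normal sites while still missing many phishing pages.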

Published In : IJCAT Journal Volume 3, Issue 5

Date of Publication : May 2016

Pages : 295-300

Figures : 02

Tables : --

Publication Link : Support Vector Machine Based Malware and Phishing Website Detection

Rashmi Karnik : Department of Computer Engg., JSPM's Bhivarabai Sawant Institute of Technology & Research, Savitribai Phule University, Pune.

Dr. Gayatri M. Bhandari : Department of Computer Engg., JSPM's Bhivarabai Sawant Institute of Technology & Research, Savitribai Phule University, Pune.

Kernel-based Approach, Malware, Phishing, Support Vector Machine

Detecting malicious URLs is one of the crucial problems on the Internet. This paper investigates website categorization, i.e., labeling a site as normal or phishing. It presents a supervised machine learning approach in which an SVM is used to categorize phishing and malware sites, based on a number of features extracted from the URL. The support vector machine algorithm achieved high classification accuracy when analyzing data similar to that handled by rule-based heuristic techniques. The proposed method is good at detecting phishing and malware sites, correctly labeling approximately 95% of them.
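As an illustration of what extracting features from a URL can look like, the sketch below computes a few lexical features using only the Python standard library. The feature set and the name `url_lexical_features` are assumptions chosen for illustration; the paper does not list its exact features here, and the content, DNS, and network-traffic features it mentions would need additional data sources.

```python
import re
from urllib.parse import urlparse, parse_qs

# Matches a dotted-quad host such as 192.168.0.1 (no range check on octets).
IPV4_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def url_lexical_features(url):
    """Return a dict of simple numeric features derived from the URL string."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        # An '@' in a URL can hide the real host behind a fake user-info part.
        "has_at_symbol": int("@" in url),
        # Raw IP addresses in place of domain names are a common phishing signal.
        "host_is_ip": int(bool(IPV4_RE.match(host))),
        "uses_https": int(parsed.scheme == "https"),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
        "num_query_params": len(parse_qs(parsed.query)),
    }


# Hypothetical URL used only to show the output shape.
features = url_lexical_features("http://192.168.0.1/login/verify?acct=1")
print(features)
```

Each dictionary can be flattened into a fixed-order numeric vector and passed to an SVM implementation of choice for training and prediction.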
