- Ram B. Basnet
Colorado Mesa University, Grand Junction, Colorado, USA
rbasnet@coloradomesa.edu
- Andrew H. Sung
University of Southern Mississippi, Hattiesburg, Mississippi, USA
andrew.sung@usm.edu
ISSN: 2182-2069 (printed) / ISSN: 2182-2077 (online)
Learning to Detect Phishing Webpages
Phishing has become a lucrative business for cyber criminals, whose victims range from end users to large corporations and government organizations. Although Internet users are generally becoming more aware of phishing websites, scammers keep devising novel schemes that circumvent phishing filters and often succeed in fooling even savvy users. Recent studies that detect phishing and malicious webpages using features from URLs alone show promise. That approach, however, may not be reliable and robust enough to detect evolving, sophisticated phishing webpages. For example, phishers can use URL shortening services to disguise their phishing URLs, or use compromised legitimate websites to host their phishing campaigns. Along with features from URLs, we propose a number of novel content-based features and apply cutting-edge machine learning techniques to demonstrate that our approach can detect phishing webpages with error rates of 0.04-0.44%, false positive rates of 0.0-0.30%, and false negative rates of 0.06-0.73% on real-world data sets using a Random Forests classifier, thereby improving on previous results for the important problem of phishing detection.
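To make the three reported metrics concrete, the following is a minimal sketch of how error rate, false positive rate, and false negative rate are conventionally computed from true and predicted labels. It assumes that phishing is the positive class (label 1) and legitimate is the negative class (label 0). The function name `detection_rates` and the toy labels are hypothetical and are not taken from the paper's data or code.

```python
from typing import Dict, Sequence


def detection_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute error rate, FPR, and FNR for a binary phishing detector.

    Assumption: label 1 = phishing (positive), label 0 = legitimate (negative).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    # Tally the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Fraction of all pages that were misclassified.
        "error_rate": (fp + fn) / total,
        # Fraction of legitimate pages flagged as phishing.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        # Fraction of phishing pages that were missed.
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }


if __name__ == "__main__":
    # Toy labels for illustration only: 4 phishing pages, 6 legitimate pages.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
    for name, value in detection_rates(truth, preds).items():
        print(f"{name}: {value:.4f}")
```

Reporting the false positive and false negative rates separately matters here because the two kinds of mistake have different costs: a false positive blocks a legitimate page, while a false negative lets a phishing page through.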