Intelligent URL Classification System for Phishing Detection Using Machine Learning Techniques

GORLA GIRIDHAR, V. SARALA

doi:10.64751/

Abstract

The rapid expansion of internet usage has significantly increased cyber threats,
particularly phishing attacks that deceive users into revealing sensitive information.
Detecting malicious URLs has therefore become a critical aspect of cybersecurity. This
project presents an intelligent URL classification system that classifies URLs as
phishing, malware, defacement, or benign using machine learning techniques. The system
is implemented using the Django web framework, providing an interactive interface for
both administrators and users.

The proposed system uses text-based feature extraction, specifically Count
Vectorization, to transform URLs into numerical representations suitable for machine
learning models. Multiple classification algorithms, including Naive Bayes, Support
Vector Machine (SVM), Logistic Regression, and Stochastic Gradient Descent (SGD), are
trained and compared to evaluate performance and ensure robustness. Additionally, an
ensemble approach using a Voting Classifier is implemented to improve prediction
accuracy.

The dataset consists of labeled URLs categorized into four classes: benign, phishing,
defacement, and malware. During training, the dataset is preprocessed and vectorized,
and the models are trained using a train-test split strategy. The models are evaluated
using accuracy, confusion matrices, and classification reports.
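
The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn, which the abstract does not name, and it is not the authors' implementation. The toy URLs, the 75/25 split, the hard-voting setting, the use of LinearSVC as the SVM, and all hyperparameters are hypothetical choices made only for illustration.

```python
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

LABELS = ["benign", "defacement", "malware", "phishing"]

# Made-up URLs on reserved example domains/addresses; NOT the paper's dataset.
SAMPLES = {
    "benign": [
        "http://example.com/docs/index.html",
        "http://example.org/news/today",
        "http://example.net/about/team",
        "http://example.com/blog/post/welcome",
    ],
    "defacement": [
        "http://example.org/index.php?option=com_content&view=hacked",
        "http://example.net/index.php?option=com_user&view=defaced",
        "http://example.com/index.php?option=com_content&id=owned",
        "http://example.org/index.php?option=com_mailto&view=hacked",
    ],
    "malware": [
        "http://203.0.113.5/download/setup.exe",
        "http://198.51.100.7/files/update.exe",
        "http://192.0.2.44/download/patch.exe",
        "http://203.0.113.9/files/installer.exe",
    ],
    "phishing": [
        "http://secure-login.example.com/account/verify",
        "http://account-verify.example.net/login/confirm",
        "http://login-update.example.org/secure/verify",
        "http://verify-account.example.com/login/update",
    ],
}

urls = [url for label in LABELS for url in SAMPLES[label]]
labels = [label for label in LABELS for _ in SAMPLES[label]]

# Train-test split; stratified so every class appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    urls, labels, test_size=0.25, stratify=labels, random_state=42
)

# Hard voting: each base model casts one vote and the majority label wins.
ensemble = VotingClassifier(
    estimators=[
        ("nb", MultinomialNB()),
        ("svm", LinearSVC()),
        ("lr", LogisticRegression(max_iter=1000)),
        ("sgd", SGDClassifier(random_state=42)),
    ],
    voting="hard",
)

# CountVectorizer turns each URL into a vector of token counts.
model = make_pipeline(CountVectorizer(), ensemble)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

accuracy = accuracy_score(y_test, predictions)
matrix = confusion_matrix(y_test, predictions, labels=LABELS)
report = classification_report(y_test, predictions, labels=LABELS, zero_division=0)

print(f"accuracy: {accuracy:.2f}")
print(matrix)
print(report)
```

With only sixteen toy URLs the printed scores say nothing about real-world performance; the sketch only shows how vectorization, the four base classifiers, the voting ensemble, and the evaluation metrics fit together.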