Classification of Amazon Product Reviews Using Hybrid Embedding Techniques for Sentiment Analysis

Authors

  • G Dinesh
  • D Nagaraj

DOI:

https://doi.org/10.64751/

Keywords:

BERT, classification, feature extraction, machine learning, sentiment analysis, word embeddings, Word2Vec

Abstract

Sentiment Analysis (SA) is a core task in Natural Language Processing (NLP): it extracts opinions and feelings from text so that the text can be categorized correctly and decisions can be based on the result. Prediction performance depends heavily on how features are represented. This study uses Kaggle's Amazon review dataset, which contains 4,000,000 records; to keep computation manageable, it was reduced to 50,000 reviews. The dataset covers a variety of product reviews labeled with different sentiment categories. Preprocessing comprises normalization, HTML/URL removal, tokenization, label encoding, and n-gram creation. Sentiment distributions and word clouds are used to explore the features. Feature vectors are built with several embedding methods: Bag-of-Words (BoW), TF-IDF, Word2Vec, BERT, and a hybrid that combines BERT with TF-IDF. Machine learning models (Naive Bayes, KNN, Decision Tree, Random Forest, and Linear SVC) are trained on these features and evaluated with accuracy, precision, recall, F1-score, and confusion matrices. In addition, advanced models, XGBoost and ElectraModel, are applied with hybrid embeddings that combine BERT, TF-IDF, and BoW. Explainable AI methods, namely LIME and SHAP, are used to estimate feature importance. A Flask-based interface supports real-time sentiment prediction and topic modeling. Electra embeddings used with ElectraModel outperform the baseline and conventional embeddings, reaching 91.1% accuracy.
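
The abstract does not say how the BERT and TF-IDF representations are combined. One common reading is simple concatenation of a sparse TF-IDF vector with a dense sentence embedding, and the sketch below illustrates only that idea under stated assumptions. The dense part is a hypothetical stand-in (hashed token counts), not a real BERT embedding, and all function names, the sample reviews, and the dimension of 8 are illustrative choices rather than details from the study.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a simplification of real preprocessing)."""
    return text.lower().split()


def l2_normalize(vec):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fit_tfidf(docs):
    """Learn a sorted vocabulary and smoothed IDF weights from the documents."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    vocab = sorted(doc_freq)
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0 for t in vocab}
    return vocab, idf


def tfidf_vector(doc, vocab, idf):
    """L2-normalized TF-IDF vector over the fitted vocabulary."""
    counts = Counter(tokenize(doc))
    return l2_normalize([counts[t] * idf[t] for t in vocab])


def placeholder_dense_embedding(doc, dim=8):
    """ASSUMPTION: stand-in for a BERT sentence embedding.

    Uses deterministic hashed token counts so the example runs without
    any model download. It carries no contextual information.
    """
    vec = [0.0] * dim
    for tok in tokenize(doc):
        vec[sum(ord(c) for c in tok) % dim] += 1.0
    return l2_normalize(vec)


def hybrid_vector(doc, vocab, idf, dim=8):
    """Concatenate the sparse TF-IDF part with the dense part."""
    return tfidf_vector(doc, vocab, idf) + placeholder_dense_embedding(doc, dim)


# Made-up sample reviews, not taken from the Amazon dataset.
reviews = ["great product works well", "terrible product broke fast"]
vocab, idf = fit_tfidf(reviews)
features = [hybrid_vector(r, vocab, idf) for r in reviews]
print(len(vocab), len(features[0]))
```

Normalizing each part before concatenation keeps either representation from dominating by scale alone. In a real pipeline the placeholder would be replaced by an actual sentence embedding, and the resulting vectors would be passed to a classifier such as Linear SVC or XGBoost.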

Published

2026-04-08

How to Cite

G Dinesh, & D Nagaraj. (2026). Classification of Amazon Product Reviews Using Hybrid Embedding Techniques for Sentiment Analysis. International Journal of Data Science and IoT Management System, 5(2(1), 80-88. https://doi.org/10.64751/
