Machine-Generated Tweet Detection Using Deep Learning Techniques and FastText Representations

K. Pavani, K. Baby Ramya, Y. Meghana

doi:10.64751/

Abstract

Recent advances in natural language generation have opened a new avenue for influencing public opinion on social media. Progress in language modeling has greatly improved the generative capabilities of deep neural models, making them far more capable of producing content. As a result, text-generative models have become powerful enough for adversaries to exploit: they can be used to build social bots that produce realistic deepfake posts and sway public discourse. Reliable methods for identifying deepfake messages on social media are therefore needed, and the detection of machine-generated content on platforms such as Twitter is an active area of research. Using the open-source TweepFake dataset, this work applies a simple deep learning model with word embeddings to distinguish bot-generated tweets from human-written ones. A standard Convolutional Neural Network (CNN) architecture built on FastText word embeddings is used to detect deepfake tweets. To show that the proposed method outperforms baseline approaches, several machine learning models were evaluated as baselines, using features such as Term Frequency (TF), Term Frequency-Inverse Document Frequency (TF-IDF), FastText, and FastText subword embeddings. The proposed method is also compared with other deep learning models, including Long Short-Term Memory (LSTM) and CNN-LSTM. Experimental results show that the CNN architecture with FastText embeddings is well suited to effective and efficient classification of Twitter data, achieving 93% accuracy.
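
The FastText subword features mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `char_ngrams` and `subword_vector` and the toy lookup table are hypothetical. The n-gram range of 3 to 6 follows the original FastText formulation, since the range used in this work is not stated, and averaging the n-gram vectors is a simplifying assumption.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word, FastText style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    produce n-grams distinct from those in the middle of a word.
    """
    marked = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for start in range(len(marked) - n + 1):
            grams.append(marked[start:start + n])
    return grams


def subword_vector(word, ngram_vectors, dim, min_n=3, max_n=6):
    """Average the vectors of the word's n-grams found in `ngram_vectors`.

    `ngram_vectors` is a hypothetical dict mapping an n-gram to a list of
    floats of length `dim`. Words with no known n-gram get a zero vector.
    """
    found = [ngram_vectors[g]
             for g in char_ngrams(word, min_n, max_n)
             if g in ngram_vectors]
    if not found:
        return [0.0] * dim
    return [sum(vec[i] for vec in found) / len(found) for i in range(dim)]


# Toy usage: an unseen word still gets a vector from its known n-grams.
toy_table = {"<bo": [1.0, 0.0], "ot>": [0.0, 1.0]}
trigrams = char_ngrams("where", 3, 3)
bot_vec = subword_vector("bot", toy_table, dim=2)
```

Because a vector is assembled from character n-grams, misspellings, hashtags, and rare tokens common in tweets can still receive a representation even when the full word never appeared in the embedding vocabulary.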