VISION TRANSFORMERS OF AI-GENERATED VISUAL CONTENT CLASSIFICATION

N. JAGADEESH, JAMPANA SIRISHA, AKUMALLA SIVA JYOTHSNA, YARAGANI NAGA HARSHA, SURATHU ROHIT SAI KUMAR

doi:10.5281/zenodo.19145351

Abstract

The rapid development of generative artificial intelligence (AI) models has significantly transformed the creation of digital visual content. Modern generative models such as diffusion models and generative adversarial networks (GANs) can produce highly realistic images that are often indistinguishable from genuine photographs. While these technologies have expanded opportunities in creative design, media production, and digital automation, they have also introduced serious challenges related to misinformation, deepfake dissemination, digital forgery, and copyright violations. Consequently, the ability to accurately classify AI-generated images has become a critical requirement for maintaining trust in digital media ecosystems. Traditional image classification approaches largely rely on convolutional neural networks (CNNs), which focus on local spatial features. Although CNNs have achieved strong performance in many vision tasks, they struggle to capture the long-range dependencies and global contextual relationships that are important for detecting the subtle artifacts present in AI-generated images. Vision Transformers (ViTs), which use self-attention mechanisms to model global image relationships, have emerged as a powerful alternative architecture for advanced visual understanding. This study proposes a Vision Transformer-based framework for detecting and classifying AI-generated visual content. The proposed system uses transformer encoders to extract global contextual representations from images and improves classification performance compared to conventional CNN approaches. Experimental evaluation demonstrates that transformer-based architectures provide superior detection capability for synthetic images generated by modern AI models. The results highlight the potential of Vision Transformers for enhancing image authenticity verification systems and combating the growing threat of synthetic visual misinformation on digital platforms.
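
The global-context idea described in the abstract can be illustrated with a minimal sketch: an image is cut into patches, each patch becomes a token, and a single self-attention layer lets every patch attend to every other patch before a pooled representation is scored as real or AI-generated. This is not the authors' implementation. The function names, the patch size, the embedding dimension, and the random untrained weights are all assumptions chosen for illustration, and components of a full Vision Transformer (position embeddings, class token, multiple heads, MLP blocks, layer normalization) are omitted.

```python
import math
import random

def split_into_patches(image, patch):
    """Cut an H x W grayscale image (list of rows) into flattened patch vectors."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            patches.append([image[top + i][left + j]
                            for i in range(patch) for j in range(patch)])
    return patches

def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def self_attention(tokens, wq, wk, wv):
    """Single-head scaled dot-product attention; returns outputs and weights."""
    queries = [matvec(wq, t) for t in tokens]
    keys = [matvec(wk, t) for t in tokens]
    values = [matvec(wv, t) for t in tokens]
    scale = math.sqrt(len(queries[0]))
    outputs, weights = [], []
    for q in queries:
        # Every patch scores every other patch: this is the global context.
        row = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        weights.append(row)
        outputs.append([sum(w * v[d] for w, v in zip(row, values))
                        for d in range(len(values[0]))])
    return outputs, weights

def classify(image, patch=4, dim=8, seed=0):
    """Return (probability of 'AI-generated', attention weights) for one image."""
    rng = random.Random(seed)  # untrained, illustrative weights only
    patches = split_into_patches(image, patch)
    embed = random_matrix(dim, patch * patch, rng)
    tokens = [matvec(embed, p) for p in patches]
    wq, wk, wv = (random_matrix(dim, dim, rng) for _ in range(3))
    attended, weights = self_attention(tokens, wq, wk, wv)
    # Mean-pool the attended tokens, then apply a logistic classification head.
    pooled = [sum(t[d] for t in attended) / len(attended) for d in range(dim)]
    head = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    logit = sum(h * p for h, p in zip(head, pooled))
    return 1.0 / (1.0 + math.exp(-logit)), weights

# Toy 8 x 8 image; with patch=4 this yields four patch tokens.
image = [[((r * 8 + c) % 17) / 16.0 for c in range(8)] for r in range(8)]
prob, attn = classify(image)
```

Because the weights are random, the returned probability carries no meaning; the point of the sketch is the attention matrix `attn`, in which each row is a distribution over all patches. A CNN layer, by contrast, would only combine pixels within a local receptive field at this depth.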
