A Review of Scalable Big Data Processing in Cloud-Native Environments

Mr. Swapnil Joshi

doi:10.64751/ijdim.2026.v5.n2(2).pp628-635

Keywords:

Big Data Processing, Cloud-Native Computing, Scalability, Apache Spark, Stream Processing, Big Data Frameworks

Abstract

The vast amount of data generated by social media, Internet of Things (IoT) devices, e-commerce platforms, and enterprise applications makes it difficult to store, process, and analyze large data sets efficiently. Traditional big data systems face limitations in scalability, flexibility, and processing speed when handling today's data-intensive workloads. Cloud-native technologies have proven to be effective solutions because they enable distributed computing, elastic scalability, container-based deployment, and automated resource management. This paper reviews scalable big data processing in cloud-native environments. It traces how big data architectures have evolved, shifting from Hadoop-based systems to microservices, containerization, orchestration, and event-driven architectures. It discusses the key big data processing methods, namely batch, real-time, and stream processing, and popular frameworks such as Apache Hadoop, Apache Spark, Apache Flink, and Apache Kafka. The paper also examines the application of Docker, Kubernetes, and serverless computing to scalable data processing. Finally, it discusses the challenges of security, interoperability, latency, and resource optimization, along with future research directions for efficient cloud-native big data systems.