A Document Retrieval System Using Retrieval-Augmented Generation

Rayudu Bharath Kumar, Mr. Ch. Bhupathiraju

doi:10.64751/

Abstract

In an age of exponential information growth, effective document retrieval has become essential for knowledge-based systems. This project presents a document retrieval system built on Retrieval-Augmented Generation (RAG), an architecture that combines traditional retrieval mechanisms with the generative power of transformer-based language models. RAG integrates external documents into the response-generation pipeline at query time, allowing the system to produce more accurate and contextually relevant answers. The system first retrieves relevant passages from a pre-indexed document corpus using vector-similarity search (FAISS). The retrieved passages are then passed to a generative language model, which synthesises the final output from both the input query and the retrieved content. Conditioning generation on retrieved context grounds the output in source material and reduces hallucinations, while preserving the flexibility of generative models. The solution is implemented in Python using LangChain, FAISS, and transformer-based language models, and is particularly suited to question answering, summarisation, and knowledge-base augmentation. Testing and evaluation covered document processing, semantic retrieval, response generation, scalability, and reliability, and confirmed that the system retrieves relevant information accurately and generates context-aware responses. The approach demonstrates improved performance over retrieval-only and generation-only models, offering a scalable document-access system suitable for enterprise, academic, and personal use.
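The retrieve-then-generate flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: to stay self-contained it replaces the transformer embeddings with a toy bag-of-words vector and replaces the FAISS index with a brute-force cosine-similarity scan. The function names (`embed`, `build_index`, `retrieve`, `build_prompt`), the sample corpus, and the prompt wording are all hypothetical, and the final call to a generative model is omitted.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a transformer embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(passages):
    """Pre-index the corpus: store each passage next to its vector.
    A brute-force list is used here in place of a FAISS index."""
    return [(p, embed(p)) for p in passages]


def retrieve(index, query, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [passage for passage, _ in ranked[:k]]


def build_prompt(query, passages):
    """Combine the query and retrieved passages into one prompt, so that
    generation is conditioned on the retrieved context."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical three-passage corpus, for illustration only.
corpus = [
    "FAISS performs fast similarity search over dense vectors.",
    "The cafeteria opens at nine in the morning.",
    "Retrieval-augmented generation conditions a language model on retrieved passages.",
]
index = build_index(corpus)
query = "How does retrieval-augmented generation use retrieved passages?"
top = retrieve(index, query, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

In a system like the one described, `prompt` would be sent to the generative language model; swapping `embed` for a dense embedding model and `build_index`/`retrieve` for a FAISS index would change the retrieval quality and speed but not the overall shape of the pipeline.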
