---
title: RAG Project
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8000
python_version: 3.10
---
# RAG System with LangChain and FastAPI
Welcome to this repository! This project demonstrates how to build a Retrieval-Augmented Generation (RAG) system with LangChain and FastAPI, producing contextually relevant and accurate responses by integrating external data into the generative process.
## Project Overview
The RAG system combines retrieval and generation to provide smarter AI-driven responses. Using LangChain for document handling and embeddings, and FastAPI for deploying a fast, scalable API, this project includes the following components (a minimal end-to-end sketch follows the list):
- Document Loading: Load data from various sources (text, PDFs, etc.).
- Text Splitting: Break large documents into manageable chunks.
- Embeddings: Generate vector embeddings for efficient search and retrieval.
- Vector Stores: Store embeddings in a vector store for fast similarity searches.
- Retrieval: Retrieve the most relevant document chunks based on user queries.
- Generative Response: Use retrieved data with large language models (LLMs) to generate accurate, context-aware answers.
- FastAPI: Deploy the RAG system as a scalable API for easy interaction.
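To make the flow concrete, here is a minimal sketch of how these components might be wired together with LangChain. It is illustrative only: the chunk sizes, prompt, and the choice of calling `llama3` through Ollama are assumptions, not necessarily how this repository's `vector_rag.py` is implemented.

```python
# Minimal RAG pipeline sketch (illustrative parameters, not the repo's actual code).
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.llms import Ollama

# 1. Document loading
docs = PyPDFLoader("data/sample.pdf").load()

# 2. Text splitting into overlapping chunks
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 3-4. Embeddings + FAISS vector store
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
store = FAISS.from_documents(chunks, embeddings)

# 5. Retrieval of the most relevant chunks
retriever = store.as_retriever(search_kwargs={"k": 4})

# 6. Generative response grounded in the retrieved context
llm = Ollama(model="llama3")
question = "What is this document about?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
print(llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {question}"))
```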
## Setup and Installation

### Prerequisites

Make sure you have the following installed:

- Python 3.10+
- Docker (optional, for deployment)
- PostgreSQL or FAISS (for vector storage)

### Installation Steps
1. Clone the repository:

   ```bash
   git clone https://github.com/yadavkapil23/RAG_Project.git
   ```

2. Set up a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate   # For Linux/Mac
   venv\Scripts\activate      # For Windows
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Run the FastAPI server:

   ```bash
   uvicorn main:app --reload
   ```

Your FastAPI app will now be running at http://127.0.0.1:8000.
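To give a feel for the API surface, here is a heavily simplified sketch of what a `main.py` could look like; the `/query` route, request model, and `answer_query` helper are hypothetical placeholders, so check the repository's actual `main.py` for the real endpoints.

```python
# Hypothetical, simplified main.py; the real app's routes and helpers may differ.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="RAG Project")

class Query(BaseModel):
    question: str

def answer_query(question: str) -> str:
    # Placeholder for the retrieval + generation logic (see the pipeline sketch above).
    return f"You asked: {question}"

@app.post("/query")
def query(payload: Query) -> dict:
    return {"answer": answer_query(payload.question)}
```

With the server running, such an endpoint could be exercised with `curl -X POST http://127.0.0.1:8000/query -H 'Content-Type: application/json' -d '{"question": "What is RAG?"}'` (again assuming the hypothetical `/query` route).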
## Set up Ollama

This project uses Ollama to run local large language models.

1. Install Ollama: Follow the instructions on the Ollama website to download and install Ollama.
2. Pull a model: Pull a model to use with the application. This project uses llama3:

   ```bash
   ollama pull llama3
   ```
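Once the model is pulled and the local Ollama server is running, application code can reach it, for example through LangChain's Ollama wrapper. This is a generic usage sketch, not necessarily how this repository calls the model:

```python
# Quick sanity check that the locally pulled llama3 model responds (generic sketch).
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")  # talks to the local Ollama server, by default at http://localhost:11434
print(llm.invoke("Reply with one short sentence confirming you are running."))
```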
## Features

- Retrieval-Augmented Generation: Combines the best of both worlds, retrieving relevant data and generating insightful responses.
- Scalable API: FastAPI makes it easy to deploy and scale the RAG system.
- Document Handling: Supports multiple document types for loading and processing.
- Vector Embeddings: Efficient search with FAISS or other vector stores.
## Security

- OAuth2 and API Key authentication support for secure API access (an illustrative API-key check is sketched below).
- TLS/SSL for encrypting data in transit.
- Data encryption for sensitive document storage.
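As one illustration of the API-key option, FastAPI's security utilities support a simple header-based check like the one below. This is a generic pattern with an assumed `RAG_API_KEY` environment variable and `X-API-Key` header name, not code taken from this repository:

```python
# Generic header-based API-key guard using FastAPI's security utilities (illustrative only).
import os

from fastapi import Depends, FastAPI, HTTPException, Security
from fastapi.security import APIKeyHeader

app = FastAPI()
api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)

def require_api_key(api_key: str | None = Security(api_key_header)) -> str:
    expected = os.environ.get("RAG_API_KEY")  # assumed env var holding the server-side key
    if not expected or api_key != expected:
        raise HTTPException(status_code=401, detail="Invalid or missing API key")
    return api_key

@app.get("/secure-status")
def secure_status(_: str = Depends(require_api_key)) -> dict:
    return {"status": "authorized"}
```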
## Deployment

### Hugging Face Spaces (Docker) Deployment

This project is configured for a Hugging Face Space using the Docker runtime.

- Push this repository to GitHub (or connect a local clone).
- Create a new Space on Hugging Face and choose the "Docker" SDK.
- Point it to this repo. Spaces will build using the `Dockerfile` and run `uvicorn`, binding to the provided `PORT`.
- Ensure the file `data/sample.pdf` exists (or replace it) to allow FAISS index creation on startup.
Notes:
- Models `Qwen/Qwen2-0.5B-Instruct` and `all-MiniLM-L6-v2` will be downloaded on first run; the initial cold start may take several minutes.
- Dependencies are CPU-friendly; no GPU is required.
- If you see OOM (out-of-memory) errors, consider reducing `max_new_tokens` in `vector_rag.py` or swapping to an even smaller instruct model.
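For orientation, a Docker-SDK Space typically builds from a Dockerfile shaped roughly like the sketch below; the base image, paths, and the `PORT` fallback are assumptions, not the contents of this repository's actual `Dockerfile`:

```dockerfile
# Illustrative Dockerfile sketch for a Docker-SDK Space (not the repository's actual file).
FROM python:3.10-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Spaces provides PORT; fall back to 8000 for local runs.
CMD ["sh", "-c", "uvicorn main:app --host 0.0.0.0 --port ${PORT:-8000}"]
```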
### Docker Deployment (Local)
If you want to deploy your RAG system using Docker, simply build the Docker image and run the container:
```bash
docker build -t rag-system .
docker run -p 8000:8000 rag-system
```
### Cloud Deployment
Deploy your RAG system to the cloud using platforms like AWS, Azure, or Google Cloud with minimal setup.
## Future Enhancements

- Real-time Data Integration: Add real-time data sources for dynamic responses.
- Advanced Retrieval Techniques: Implement deep learning-based retrievers for better query understanding.
- Monitoring Tools: Add monitoring with tools like Prometheus or Grafana for performance insights.
## Contributing

Want to contribute? Feel free to fork this repository, submit a pull request, or open an issue. We welcome all contributions!
## License
This project is licensed under the MIT License.
Thank you for checking out the RAG System with LangChain and FastAPI! If you have any questions or suggestions, feel free to reach out or open an issue. Let's build something amazing!