---
title: Document RAG Chatbot
emoji: 🤖
colorFrom: indigo
colorTo: purple
sdk: docker
app_file: flask_app.py
pinned: false
short_description: An intelligent chatbot that understands your documents
---
# Document RAG Chatbot
An intelligent, context-aware chatbot that understands your documents.
Upload a PDF or text file, and it will answer questions using only the information inside — no hallucinations, no fluff.
Built with Flask, LangChain, and Google Gemini, this project demonstrates a clean, modular approach to Retrieval-Augmented Generation (RAG).
## Highlights
- Upload and analyze PDF or TXT files
- Uses FAISS for fast semantic search
- Embeddings powered by the HuggingFace Sentence Transformers model `all-MiniLM-L6-v2`
- Answers generated by Gemini 2.5 Flash
- Works with a frontend-provided API key, with no server-side storage (see the sketch after this list)
- Clean, responsive interface for smooth chat interaction
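
A sketch of how that per-request key flow can look in Flask. The route name, JSON fields, and `run_rag_query` helper are hypothetical illustrations, not the app's actual API:

```python
# Hypothetical Flask route showing a frontend-provided key used per request.
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_rag_query(question: str, api_key: str) -> str:
    """Placeholder for the RAG pipeline (see "How It Works" below)."""
    return f"(would answer {question!r} with the session-provided key)"

@app.route("/ask", methods=["POST"])
def ask():
    payload = request.get_json(force=True)
    api_key = payload.get("api_key")    # key sent by the browser each request
    question = payload.get("question")
    if not api_key or not question:
        return jsonify({"error": "api_key and question are required"}), 400
    # The key lives only in this request's scope; nothing is persisted.
    return jsonify({"answer": run_rag_query(question, api_key)})
```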
## Tech Stack
| Component | Technology |
|---|---|
| Backend | Flask |
| Language Model | Google Gemini |
| Vector Store | FAISS |
| Embeddings | HuggingFace all-MiniLM-L6-v2 |
| Frontend | HTML, CSS, JavaScript |
| Framework | LangChain |
## Getting Started
### 1️⃣ Clone the Repository

```bash
git clone https://huggingface.co/spaces/<your-username>/Document-RAG-System
cd Document-RAG-System
```
### 2️⃣ Set Up Your Environment

```bash
python -m venv venv

# Activate
venv\Scripts\activate       # Windows
source venv/bin/activate    # macOS/Linux
```
### 3️⃣ Install Dependencies

```bash
pip install -r requirements.txt
```
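
The exact dependency list lives in the repository's `requirements.txt`; based on the stack above, it will look roughly like this (a sketch, not the pinned file):

```text
flask
langchain
langchain-community
langchain-huggingface
langchain-google-genai
sentence-transformers
faiss-cpu
pypdf
```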
### 4️⃣ Run the App

```bash
python flask_app.py
```
Once running, open your browser and go to: 👉 http://127.0.0.1:5000/
## Get Your Gemini API Key
- Visit Google AI Studio
- Generate a Gemini API Key
- Paste it in the “API Key” field on the webpage
- Save and start chatting!
Your key is never stored — it’s used only in your current session.
## How It Works
Here’s what happens behind the scenes:
- You upload your document.
- The file is split into small chunks (for efficient retrieval).
- Each chunk is embedded into a vector using HuggingFace embeddings.
- FAISS indexes these vectors for quick similarity search.
- When you ask a question, relevant chunks are retrieved and sent to Gemini.
- Gemini generates a focused, contextual answer — grounded in your document.
That’s Retrieval-Augmented Generation (RAG) in a nutshell.
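
The same flow as a minimal Python sketch, using the LangChain pieces named in the Tech Stack. Function names and parameter values here are illustrative, not the actual code in `chatbot.py`:

```python
# Minimal RAG sketch -- illustrative, not the project's actual chatbot.py.
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_google_genai import ChatGoogleGenerativeAI

def build_retriever(pdf_path: str):
    # 1-2. Load the document and split it into small, overlapping chunks
    docs = PyPDFLoader(pdf_path).load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200
    ).split_documents(docs)

    # 3-4. Embed each chunk and index the vectors in FAISS
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    return FAISS.from_documents(chunks, embeddings).as_retriever(
        search_kwargs={"k": 4}
    )

def answer(question: str, retriever, api_key: str) -> str:
    # 5. Retrieve the chunks most similar to the question
    context = "\n\n".join(d.page_content for d in retriever.invoke(question))

    # 6. Ask Gemini to answer using only that context
    llm = ChatGoogleGenerativeAI(model="gemini-2.5-flash", google_api_key=api_key)
    prompt = (
        "Answer using only the context below. "
        "If the answer is not in the context, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm.invoke(prompt).content
```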
## Example Use Cases
- Summarize long reports
- Extract key information from research papers
- Study assistant for textbooks
- Legal, medical, or technical document Q&A
- Analyze and interpret crypto project whitepapers — understand tokenomics, roadmap, and team details before investing
## Customization
You can tweak:
- `chunk_size` and `chunk_overlap` in `chatbot.py`
- The system message, for tone or depth
- The Gemini model (e.g. `gemini-2.5-flash`, `gemini-1.5-pro`)
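
For instance, a sketch of what those tweaks look like, assuming the LangChain classes from the Tech Stack (variable names and values are examples, not the defaults shipped in `chatbot.py`):

```python
# Example tweaks -- illustrative values, not the project's shipped defaults.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_google_genai import ChatGoogleGenerativeAI

# Smaller chunks retrieve more precisely; overlap keeps context across cuts.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)

# Swap the Gemini model; a lower temperature keeps answers grounded.
llm = ChatGoogleGenerativeAI(
    model="gemini-1.5-pro",
    temperature=0.2,
    google_api_key="YOUR_GEMINI_KEY",  # or set GOOGLE_API_KEY in the env
)
```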
## Author
**Williams Odunayo**
Machine Learning Engineer | Builder of useful AI systems 😉
🔗 GitHub • LinkedIn
## License
Released under the MIT License. Free to use, modify, and build upon - attribution is appreciated.