---
title: CogniChat - Chat with Your Documents
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
license: mit
app_port: 7860
---
# 🤖 CogniChat - Intelligent Document Chat System

*Transform your documents into interactive conversations powered by advanced RAG technology*

Features • Quick Start • Architecture • Deployment • API
## 📋 Table of Contents
- Overview
- Features
- Architecture
- Technology Stack
- Quick Start
- Deployment
- Configuration
- API Reference
- Troubleshooting
- Contributing
- License
## 🎯 Overview
CogniChat is a production-ready, intelligent document chat application that leverages Retrieval Augmented Generation (RAG) to enable natural conversations with your documents. Built with enterprise-grade technologies, it provides accurate, context-aware responses from your document corpus.
### Why CogniChat?
- 🔊 Audio Overview of Your Document: Simply ask a question and listen to the audio response. Now your documents can speak to you.
- 🎯 Accurate Retrieval: Hybrid search combining BM25 and FAISS for optimal results
- 💬 Conversational Memory: Maintains context across multiple interactions
- 📄 Multi-Format Support: Handles PDF, DOCX, TXT, and image files
- 🚀 Production Ready: Docker support, comprehensive error handling, and security best practices
- 🎨 Modern UI: Responsive design with dark mode and real-time streaming
## ✨ Features

### Core Capabilities
| Feature | Description | 
|---|---|
| Multi-Format Processing | Upload and process PDF, DOCX, TXT, and image files | 
| Hybrid Search | Combines BM25 (keyword) and FAISS (semantic) for superior retrieval | 
| Conversational AI | Powered by Groq's Llama 3.1 for intelligent responses | 
| Memory Management | Maintains chat history for contextual conversations | 
| Text-to-Speech | Built-in TTS for audio playback of responses | 
| Streaming Responses | Real-time token streaming for better UX | 
| Document Chunking | Intelligent text splitting for optimal context windows | 
### Advanced Features
- Semantic Embeddings: HuggingFace all-MiniLM-L6-v2 for accurate vector representations
- Reranking: Contextual compression for improved relevance (see the sketch after this list)
- Error Handling: Comprehensive fallback mechanisms and error recovery
- Security: Non-root Docker execution and environment-based secrets
- Scalability: Optimized for both local and cloud deployments
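
To make the reranking step concrete, the sketch below wires LangChain's contextual compression around a toy FAISS index. It is a minimal illustration, not the code in app.py; the example text and similarity threshold are assumptions.

```python
# Minimal reranking sketch using LangChain's contextual compression.
# The example text and similarity threshold are illustrative assumptions,
# not values taken from app.py.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# A toy vector store standing in for the real document index.
vector_store = FAISS.from_texts(
    ["CogniChat lets you chat with uploaded documents."], embeddings
)

# Drop retrieved chunks whose similarity to the query falls below the threshold.
compressor = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.5)
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=vector_store.as_retriever(),
)

docs = retriever.invoke("What is CogniChat?")
```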
## 🏗️ Architecture

### RAG Pipeline Overview

```mermaid
graph TB
    A[Document Upload] --> B[Document Processing]
    B --> C[Text Extraction]
    C --> D[Chunking Strategy]
    D --> E[Embedding Generation]
    E --> F[Vector Store FAISS]
    
    G[User Query] --> H[Query Embedding]
    H --> I[Hybrid Retrieval]
    
    F --> I
    J[BM25 Index] --> I
    
    I --> K[Reranking]
    K --> L[Context Assembly]
    L --> M[LLM Groq Llama 3.1]
    M --> N[Response Generation]
    N --> O[Streaming Output]
    
    P[Chat History] --> M
    N --> P
    
    style A fill:#e1f5ff
    style G fill:#e1f5ff
    style F fill:#ffe1f5
    style J fill:#ffe1f5
    style M fill:#f5e1ff
    style O fill:#e1ffe1
```
### System Architecture

```mermaid
graph LR
    A[Client Browser] -->|HTTP/WebSocket| B[Flask Server]
    B --> C[Document Processor]
    B --> D[RAG Engine]
    B --> E[TTS Service]
    
    C --> F[(File Storage)]
    D --> G[(FAISS Vector DB)]
    D --> H[(BM25 Index)]
    D --> I[Groq API]
    
    J[HuggingFace Models] --> D
    
    style B fill:#4a90e2
    style D fill:#e24a90
    style I fill:#90e24a
```
### Data Flow

1. Document Ingestion: Files are uploaded and validated
2. Processing Pipeline: Text extraction → Chunking → Embedding
3. Indexing: Dual indexing (FAISS + BM25) for hybrid search
4. Query Processing: User queries are embedded and searched
5. Retrieval: Top-k relevant chunks retrieved using the hybrid approach
6. Generation: LLM generates contextual responses with citations
7. Streaming: Responses streamed back to the client in real time
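
As a concrete illustration of steps 1-3, here is a minimal sketch of the indexing half of the pipeline built from the LangChain components named above. The chunking parameters and retriever weights mirror the configuration documented below, but app.py's exact wiring may differ.

```python
# Indexing sketch: chunk text, embed it, and build the dual FAISS + BM25
# index behind the hybrid retriever. Values mirror the documented defaults
# and are not guaranteed to match app.py.
from langchain.retrievers import EnsembleRetriever
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

raw_text = "CogniChat indexes documents for hybrid retrieval. " * 100  # stand-in for extracted text

# Step 2: split the extracted text into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(raw_text)

# Step 3: dual indexing - dense (FAISS) and sparse (BM25).
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
faiss_retriever = FAISS.from_texts(chunks, embeddings).as_retriever(search_kwargs={"k": 5})
bm25_retriever = BM25Retriever.from_texts(chunks)
bm25_retriever.k = 5

# Hybrid retrieval: weighted blend of semantic and keyword rankings.
hybrid_retriever = EnsembleRetriever(
    retrievers=[faiss_retriever, bm25_retriever],
    weights=[0.6, 0.4],
)
```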
## 🛠️ Technology Stack

### Backend
| Component | Technology | Purpose | 
|---|---|---|
| Framework | Flask 2.3+ | Web application framework | 
| RAG | LangChain | RAG pipeline orchestration | 
| Vector DB | FAISS | Fast similarity search | 
| Keyword Search | BM25 | Sparse retrieval | 
| LLM | Groq Llama 3.1 | Response generation | 
| Embeddings | HuggingFace Transformers | Semantic embeddings | 
| Doc Processing | Unstructured, PyPDF, python-docx | Multi-format parsing | 
### Frontend
| Component | Technology | 
|---|---|
| UI Framework | TailwindCSS | 
| JavaScript | Vanilla ES6+ | 
| Icons | Font Awesome | 
| Markdown | Marked.js | 
### Infrastructure
- Containerization: Docker + Docker Compose
- Deployment: HuggingFace Spaces, local, cloud-agnostic
- Security: Environment-based secrets, non-root execution
## 🚀 Quick Start

### Prerequisites
- Python 3.9+
- Docker (optional, recommended)
- Groq API Key (available from the Groq Console)
### Installation Methods

#### 🐳 Method 1: Docker (Recommended)

```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd cognichat
# Create environment file
cp .env.example .env
# Add your Groq API key to .env
echo "GROQ_API_KEY=your_actual_api_key_here" >> .env
# Build and run with Docker Compose
docker-compose up -d
# Or build manually
docker build -t cognichat .
docker run -p 7860:7860 --env-file .env cognichat
```
#### 🐍 Method 2: Local Python Environment

```bash
# Clone the repository
git clone https://github.com/RautRitesh/Chat-with-docs
cd cognichat
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set environment variables
export GROQ_API_KEY=your_actual_api_key_here
# Run the application
python app.py
```
#### 🤗 Method 3: HuggingFace Spaces

1. Fork this repository
2. Create a new Space on HuggingFace
3. Link your forked repository
4. Add `GROQ_API_KEY` in Settings → Repository Secrets
5. The Space will auto-deploy!
### First Steps

1. Open http://localhost:7860 in your browser
2. Upload a document (PDF, DOCX, TXT, or image)
3. Wait for processing (a progress indicator will show status)
4. Start chatting with your document!
5. Use the 🔊 button to hear responses via TTS
## 📦 Deployment

### Environment Variables

Create a `.env` file with the following variables:

```bash
# Required
GROQ_API_KEY=your_groq_api_key_here
# Optional
PORT=7860
HF_HOME=/tmp/huggingface_cache  # For HF Spaces
FLASK_DEBUG=0  # Set to 1 for development
MAX_UPLOAD_SIZE=10485760  # 10MB default
```
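
For local development, the application would typically read these variables along the following lines; this is a hypothetical helper for illustration, not code from app.py.

```python
# Illustrative settings loader (hypothetical, not from app.py): reads the
# variables documented above with safe defaults.
import os

GROQ_API_KEY = os.environ["GROQ_API_KEY"]  # required; raises KeyError if unset
PORT = int(os.environ.get("PORT", 7860))
FLASK_DEBUG = os.environ.get("FLASK_DEBUG", "0") == "1"
MAX_UPLOAD_SIZE = int(os.environ.get("MAX_UPLOAD_SIZE", 10 * 1024 * 1024))  # 10 MB
```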
### Docker Deployment

```bash
# Production build
docker build -t cognichat:latest .
# Run with resource limits
docker run -d \
  --name cognichat \
  -p 7860:7860 \
  --env-file .env \
  --memory="2g" \
  --cpus="1.5" \
  cognichat:latest
```
### Docker Compose

```yaml
version: '3.8'
services:
  cognichat:
    build: .
    ports:
      - "7860:7860"
    environment:
      - GROQ_API_KEY=${GROQ_API_KEY}
    volumes:
      - ./data:/app/data
    restart: unless-stopped
```
### HuggingFace Spaces Configuration

Set `app_port: 7860` in the README front matter, then add the Repository Secret:

- `GROQ_API_KEY`: Your Groq API key
The application automatically detects HF Spaces environment and adjusts paths accordingly.
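
One plausible way to implement that detection is to check for the `SPACE_ID` variable that HF Spaces sets; the exact logic in app.py may differ.

```python
# Hedged sketch of Spaces detection; app.py's actual check may differ.
import os

IS_HF_SPACE = "SPACE_ID" in os.environ  # set automatically on HF Spaces
UPLOAD_DIR = "/tmp/uploads" if IS_HF_SPACE else "./uploads"
os.makedirs(UPLOAD_DIR, exist_ok=True)
```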
## ⚙️ Configuration

### Document Processing Settings

```python
# In app.py - Customize these settings
CHUNK_SIZE = 1000  # Characters per chunk
CHUNK_OVERLAP = 200  # Overlap between chunks
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
RETRIEVER_K = 5  # Number of chunks to retrieve
```
### Model Configuration

```python
# LLM Settings
LLM_PROVIDER = "groq"
MODEL_NAME = "llama-3.1-70b-versatile"
TEMPERATURE = 0.7
MAX_TOKENS = 2048
```
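
These settings map onto LangChain's Groq integration roughly as follows. This is a sketch assuming the langchain-groq package, not necessarily app.py's exact wiring.

```python
# Hedged sketch: the Groq-hosted Llama 3.1 chat model with the settings above.
import os

from langchain_groq import ChatGroq

llm = ChatGroq(
    api_key=os.environ["GROQ_API_KEY"],
    model="llama-3.1-70b-versatile",
    temperature=0.7,
    max_tokens=2048,
)

print(llm.invoke("Summarize RAG in one sentence.").content)
```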
### Search Configuration

```python
# Hybrid Search Weights
FAISS_WEIGHT = 0.6  # Semantic search weight
BM25_WEIGHT = 0.4   # Keyword search weight
```
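
LangChain's EnsembleRetriever blends the two ranked lists with weighted Reciprocal Rank Fusion. The toy example below shows what those weights do; the document IDs are made up and the constant c=60 is the common RRF default.

```python
# Toy weighted Reciprocal Rank Fusion, the scheme EnsembleRetriever uses to
# merge FAISS and BM25 rankings. Document IDs are illustrative.
from collections import defaultdict

def weighted_rrf(rankings, weights, c=60):
    """Merge ranked doc-id lists; a higher fused score ranks earlier."""
    scores = defaultdict(float)
    for ranked, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] += weight / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)

faiss_hits = ["chunk_3", "chunk_1", "chunk_7"]  # semantic ranking
bm25_hits = ["chunk_1", "chunk_9", "chunk_3"]   # keyword ranking
print(weighted_rrf([faiss_hits, bm25_hits], weights=[0.6, 0.4]))
```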
## 📚 API Reference

### Endpoints

#### Upload Document

```http
POST /upload
Content-Type: multipart/form-data

{
  "file": <binary>
}
```
Response:

```json
{
  "status": "success",
  "message": "Document processed successfully",
  "filename": "example.pdf",
  "chunks": 45
}
```
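
A quick way to exercise the endpoint from Python; the base URL and file name are illustrative.

```python
# Hypothetical client call for /upload; the file name is illustrative.
import requests

with open("example.pdf", "rb") as f:
    resp = requests.post("http://localhost:7860/upload", files={"file": f})
print(resp.json())  # e.g. {"status": "success", ..., "chunks": 45}
```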
#### Chat

```http
POST /chat
Content-Type: application/json

{
  "message": "What is the main topic?",
  "stream": true
}
```
Response (Streaming):

```text
data: {"token": "The", "done": false}
data: {"token": " main", "done": false}
data: {"token": " topic", "done": false}
data: {"done": true}
```
#### Clear Session

```http
POST /clear
```

Response:

```json
{
  "status": "success",
  "message": "Session cleared"
}
```
## 🔧 Troubleshooting

### Common Issues

#### 1. Permission Errors in Docker
Problem: Permission denied when writing to cache directories

Solution:

```bash
# Rebuild with proper permissions
docker build --no-cache -t cognichat .
# Or run with volume permissions
docker run -v $(pwd)/cache:/tmp/huggingface_cache \
  --user $(id -u):$(id -g) \
  cognichat
```
#### 2. Model Loading Fails

Problem: Cannot download HuggingFace models

Solution:

```bash
# Pre-download models
python test_embeddings.py
# Or use HF_HOME environment variable
export HF_HOME=/path/to/writable/directory
```
#### 3. Chat Returns 400 Error

Problem: Upload directory not writable (common in HF Spaces)

Solution: The application automatically uses /tmp/uploads in the HF Spaces environment. Ensure the latest version is deployed.
#### 4. API Key Invalid
Problem: Groq API returns authentication error
Solution:
- Verify the key at the Groq Console
- Check that the `.env` file has the correct format: `GROQ_API_KEY=gsk_...`
- Restart application after updating key
### Debug Mode

Enable detailed logging:

```bash
export FLASK_DEBUG=1
export LANGCHAIN_VERBOSE=true
python app.py
```
## 🧪 Testing

```bash
# Run test suite
pytest tests/
# Test embedding model
python test_embeddings.py
# Test document processing
pytest tests/test_document_processor.py
# Integration tests
pytest tests/test_integration.py
```
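
For reference, a test in that suite might look like the following; the app import path and expected status codes are assumptions, not actual repository code.

```python
# Hypothetical test sketch; assumes app.py exposes a Flask `app` object.
import io

from app import app

def test_upload_txt_document():
    client = app.test_client()
    data = {"file": (io.BytesIO(b"hello world"), "hello.txt")}
    resp = client.post("/upload", data=data, content_type="multipart/form-data")
    assert resp.status_code == 200

def test_upload_rejects_empty_request():
    client = app.test_client()
    resp = client.post("/upload", data={})
    assert resp.status_code in (400, 422)
```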
## 🤝 Contributing
We welcome contributions! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
### Development Guidelines
- Follow PEP 8 style guide
- Add tests for new features
- Update documentation
- Ensure Docker build succeeds
## 📝 Changelog

### Version 2.0 (October 2025)

✅ Major Improvements:
- Fixed Docker permission issues
- HuggingFace Spaces compatibility
- Enhanced error handling
- Multiple model loading fallbacks
- Improved security (non-root execution)
✅ Bug Fixes:
- Upload directory write permissions
- Cache directory access
- Model initialization reliability
### Version 1.0 (Initial Release)
- Basic RAG functionality
- PDF and DOCX support
- FAISS vector store
- Conversational memory
## 📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments
- LangChain for RAG framework
- Groq for high-speed LLM inference
- HuggingFace for embeddings and hosting
- FAISS for efficient vector search
## 📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: riteshraut123321@gmail.com
Made with ❤️ by the CogniChat Team

⭐ Star us on GitHub • 🐛 Report Bug • ✨ Request Feature