gprMax RAG Database System

Overview

This is a production-ready Retrieval-Augmented Generation (RAG) system for the gprMax documentation. It provides efficient vector search over the documentation, enabling intelligent context retrieval for the chatbot.

Architecture

Components

  1. Document Processor: Extracts and chunks documentation from gprMax GitHub repository
  2. Embedding Model: Qwen2.5-0.5B (will upgrade to Qwen3-Embedding-0.6B when available); note that the automatic first-run build uses ChromaDB's default all-MiniLM-L6-v2 instead (see Installation)
  3. Vector Database: ChromaDB with persistent storage
  4. Retriever: Search and context retrieval utilities

Key Features

  • Automatic documentation extraction from gprMax GitHub repository
  • Intelligent chunking with configurable size and overlap
  • Persistent vector database using ChromaDB
  • Efficient similarity search with score thresholding
  • Metadata tracking for reproducibility

Installation

The database is automatically generated on first startup of the application. No manual installation required!

Automatic Generation

When the app starts:

  1. Checks if database exists at rag-db/chroma_db/
  2. If not found, automatically runs generate_db.py
  3. Clones gprMax repository and processes documentation
  4. Creates ChromaDB with default embeddings (all-MiniLM-L6-v2)
  5. Ready to use - this only happens once!
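A minimal sketch of this startup check, assuming a database path of rag-db/chroma_db and a helper named ensure_database (both illustrative; the actual names in app.py may differ):

import subprocess
import sys
from pathlib import Path

DB_PATH = Path("rag-db/chroma_db")

def ensure_database():
    """Build the vector database on first startup if it is missing."""
    if DB_PATH.exists():
        return  # already built; generation happens only once
    # generate_db.py clones the gprMax repository and builds ChromaDB
    subprocess.run([sys.executable, "generate_db.py"], cwd="rag-db", check=True)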

Manual Generation (Optional)

If you need to manually regenerate the database:

cd rag-db
python generate_db.py --recreate

Custom settings:

python generate_db.py \
    --db-path ./custom_db \
    --temp-dir ./temp \
    --device cuda \
    --recreate
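These flags suggest an argparse interface along the following lines (a sketch only, not the exact code in generate_db.py; the defaults shown are assumptions):

import argparse

parser = argparse.ArgumentParser(description="Build the gprMax RAG database")
parser.add_argument("--db-path", default="./chroma_db", help="where to store the ChromaDB files")
parser.add_argument("--temp-dir", default="./temp", help="scratch directory for the gprMax clone")
parser.add_argument("--device", default="cpu", choices=["cpu", "cuda"], help="device used for embedding")
parser.add_argument("--recreate", action="store_true", help="drop and rebuild the collection")
args = parser.parse_args()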

Use the Retriever in the Application

from rag_db.retriever import create_retriever

# Initialize retriever
retriever = create_retriever(db_path="./rag-db/chroma_db")

# Search for relevant documents
results = retriever.search("How to create a source?", k=5)

# Get formatted context for LLM
context = retriever.get_context("antenna patterns", k=3)

# Get relevant source files
files = retriever.get_relevant_files("boundary conditions")

# Get database statistics
stats = retriever.get_stats()

Test the Retriever

# Test with default query
python retriever.py

# Test with custom query
python retriever.py "How to model soil layers?"

Database Schema

Document Structure

{
    "id": "unique_hash",
    "text": "document_chunk_text",
    "metadata": {
        "source": "docs/relative/path.rst",
        "file_type": ".rst",
        "chunk_index": 0,
        "char_start": 0,
        "char_end": 1000
    }
}
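For reference, storing a chunk with this schema in ChromaDB looks roughly like the following (a sketch; the id and field values are the placeholders from the schema above):

import chromadb

client = chromadb.PersistentClient(path="./rag-db/chroma_db")
collection = client.get_or_create_collection(
    name="gprmax_docs_v1",
    metadata={"hnsw:space": "cosine"},  # cosine distance, per Database Settings below
)
collection.add(
    ids=["unique_hash"],
    documents=["document_chunk_text"],
    metadatas=[{
        "source": "docs/relative/path.rst",
        "file_type": ".rst",
        "chunk_index": 0,
        "char_start": 0,
        "char_end": 1000,
    }],
)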

Metadata File

Generated metadata.json contains:

{
    "created_at": "2024-01-01T00:00:00",
    "embedding_model": "Qwen/Qwen2.5-0.5B",
    "collection_name": "gprmax_docs_v1",
    "chunk_size": 1000,
    "chunk_overlap": 200,
    "total_documents": 1234
}

Configuration

Chunking Parameters

  • CHUNK_SIZE: 1000 characters (a comfortable fit for typical LLM context windows)
  • CHUNK_OVERLAP: 200 characters (preserves continuity across chunk boundaries)
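A minimal sliding-window chunker implementing these parameters (a sketch; generate_db.py may split on sentence or section boundaries instead):

def chunk_text(text, size=1000, overlap=200):
    """Split text into overlapping character windows."""
    step = size - overlap  # advance 800 characters per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last window already covers the tail
    return chunks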

Embedding Model

  • Current: Qwen/Qwen2.5-0.5B (the automatic first-run build instead uses ChromaDB's default all-MiniLM-L6-v2; see Installation)
  • Future: Qwen/Qwen3-Embedding-0.6B (when available)
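Qwen2.5-0.5B is a general-purpose language model rather than a dedicated embedding model, so one plausible way to derive sentence embeddings from it is mean pooling over the last hidden states, as sketched below (an assumption about the approach, not a description of what generate_db.py actually does):

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
model = AutoModel.from_pretrained("Qwen/Qwen2.5-0.5B")

def embed(texts):
    """Mean-pool last hidden states into one unit vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state         # (batch, tokens, hidden)
    mask = batch["attention_mask"].unsqueeze(-1)           # ignore padding tokens
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # masked mean over tokens
    return torch.nn.functional.normalize(pooled, dim=-1)   # unit norm for cosine search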

Database Settings

  • Storage: ChromaDB persistent client
  • Collection: gprmax_docs_v1 (versioned for updates)
  • Distance Metric: Cosine similarity
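With cosine distance, ChromaDB returns distances where smaller means more similar, so the score thresholding mentioned under Key Features can be sketched as a post-filter on query results (the 0.5 cutoff is an illustrative value):

import chromadb

client = chromadb.PersistentClient(path="./rag-db/chroma_db")
collection = client.get_or_create_collection("gprmax_docs_v1", metadata={"hnsw:space": "cosine"})

results = collection.query(query_texts=["How to create a source?"], n_results=5)

THRESHOLD = 0.5  # illustrative cutoff; tune against real queries
hits = [
    (doc, meta, dist)
    for doc, meta, dist in zip(
        results["documents"][0], results["metadatas"][0], results["distances"][0]
    )
    if dist <= THRESHOLD  # keep only sufficiently similar chunks
]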

Maintenance

Regular Updates

Run monthly, or whenever the gprMax documentation is updated:

# Pull the latest docs and update the database
python generate_db.py

Database Backup

# Backup database
cp -r chroma_db chroma_db_backup_$(date +%Y%m%d)

Performance Tuning

  • Adjust CHUNK_SIZE and CHUNK_OVERLAP in generate_db.py
  • Modify batch sizes for large datasets
  • Use GPU acceleration with --device cuda

Integration with Main App

The RAG system integrates with the main Gradio app:

  1. Import the retriever in app.py
  2. Use the retriever to augment prompts with context
  3. Display source references in the UI

Example integration:

# In app.py
from rag_db.retriever import create_retriever

retriever = create_retriever()

def augment_with_context(user_query):
    context = retriever.get_context(user_query, k=3)
    augmented_prompt = f"""
    Context from documentation:
    {context}
    
    User question: {user_query}
    """
    return augmented_prompt

Troubleshooting

Common Issues

  1. Database not found

    • Run python generate_db.py first
    • Check the --db-path parameter
  2. Out of memory

    • Use smaller batch sizes
    • Use the CPU instead of the GPU
    • Reduce the chunk size
  3. Slow generation

    • Use a GPU with --device cuda
    • Use a shallow clone of the repository (see the example after this list)
    • Use a pre-generated database
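For the shallow-clone tip, a depth-1 clone fetches only the latest commit, which is all the documentation build needs (the target directory here is illustrative):

git clone --depth 1 https://github.com/gprMax/gprMax.git ./temp/gprMax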

Logs

Check generation logs for detailed information:

python generate_db.py 2>&1 | tee generation.log

Future Enhancements

  1. Model Upgrade: Migrate to Qwen3-Embedding-0.6B when available
  2. Incremental Updates: Add documents without full regeneration
  3. Multi-modal Support: Include images and diagrams from docs
  4. Query Expansion: Automatic query reformulation for better retrieval
  5. Caching Layer: Redis cache for frequent queries
  6. Fine-tuned Embeddings: Domain-specific embedding model for gprMax

License

Same as the parent project.