gprMax RAG Database System
Overview
This is a production-ready Retrieval-Augmented Generation (RAG) system for the gprMax documentation. It provides efficient vector search over the docs, enabling intelligent context retrieval for the chatbot.
Architecture
Components
- Document Processor: Extracts and chunks documentation from gprMax GitHub repository
- Embedding Model: Qwen2.5-0.5B (will upgrade to Qwen3-Embedding-0.6B when available)
- Vector Database: ChromaDB with persistent storage
- Retriever: Search and context retrieval utilities
Key Features
- Automatic documentation extraction from gprMax GitHub repository
- Intelligent chunking with configurable size and overlap
- Persistent vector database using ChromaDB
- Efficient similarity search with score thresholding
- Metadata tracking for reproducibility
Installation
The database is automatically generated on first startup of the application. No manual installation required!
Automatic Generation
When the app starts:
- Checks if the database exists at rag-db/chroma_db/
- If not found, automatically runs generate_db.py
- Clones the gprMax repository and processes the documentation
- Creates the ChromaDB with default embeddings (all-MiniLM-L6-v2)
- Ready to use: this only happens once!
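A minimal sketch of this startup check (the path and invocation mirror the steps above; the ensure_database function name is hypothetical):

# Hypothetical startup check mirroring the behaviour described above.
from pathlib import Path
import subprocess
import sys

DB_PATH = Path("rag-db/chroma_db")

def ensure_database() -> None:
    """Build the vector database on first startup if it is missing."""
    if DB_PATH.exists():
        return  # Database already built; nothing to do.
    # generate_db.py clones gprMax, chunks the docs, and builds the ChromaDB.
    subprocess.run([sys.executable, "rag-db/generate_db.py"], check=True)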
Manual Generation (Optional)
If you need to manually regenerate the database:
cd rag-db
python generate_db.py --recreate
Custom settings:
python generate_db.py \
--db-path ./custom_db \
--temp-dir ./temp \
--device cuda \
--recreate
Use Retriever in Application
from rag_db.retriever import create_retriever
# Initialize retriever
retriever = create_retriever(db_path="./rag-db/chroma_db")
# Search for relevant documents
results = retriever.search("How to create a source?", k=5)
# Get formatted context for LLM
context = retriever.get_context("antenna patterns", k=3)
# Get relevant source files
files = retriever.get_relevant_files("boundary conditions")
# Get database statistics
stats = retriever.get_stats()
Test Retriever
# Test with default query
python retriever.py
# Test with custom query
python retriever.py "How to model soil layers?"
Database Schema
Document Structure
{
  "id": "unique_hash",
  "text": "document_chunk_text",
  "metadata": {
    "source": "docs/relative/path.rst",
    "file_type": ".rst",
    "chunk_index": 0,
    "char_start": 0,
    "char_end": 1000
  }
}
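A sketch of assembling such a record; the SHA-256 ID scheme shown here is an assumption, not necessarily what generate_db.py actually uses:

import hashlib

def make_document(text: str, source: str, chunk_index: int,
                  char_start: int, char_end: int) -> dict:
    """Build a document record matching the schema above (ID scheme assumed)."""
    doc_id = hashlib.sha256(f"{source}:{chunk_index}:{text}".encode()).hexdigest()
    return {
        "id": doc_id,
        "text": text,
        "metadata": {
            "source": source,
            "file_type": ".rst",
            "chunk_index": chunk_index,
            "char_start": char_start,
            "char_end": char_end,
        },
    }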
Metadata File
Generated metadata.json contains:
{
  "created_at": "2024-01-01T00:00:00",
  "embedding_model": "Qwen/Qwen2.5-0.5B",
  "collection_name": "gprmax_docs_v1",
  "chunk_size": 1000,
  "chunk_overlap": 200,
  "total_documents": 1234
}
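For illustration, a minimal sketch of writing this file at generation time (the write_metadata helper is hypothetical; field values mirror the example above):

import json
from datetime import datetime, timezone

def write_metadata(total_documents: int, path: str = "rag-db/metadata.json") -> None:
    """Record generation settings for reproducibility."""
    metadata = {
        "created_at": datetime.now(timezone.utc).isoformat(),
        "embedding_model": "Qwen/Qwen2.5-0.5B",
        "collection_name": "gprmax_docs_v1",
        "chunk_size": 1000,
        "chunk_overlap": 200,
        "total_documents": total_documents,
    }
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2)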
Configuration
Chunking Parameters
- CHUNK_SIZE: 1000 characters (optimal for context windows)
- CHUNK_OVERLAP: 200 characters (ensures continuity)
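A minimal sketch of fixed-size chunking with this overlap; the real generate_db.py may split on sentence or section boundaries instead:

CHUNK_SIZE = 1000     # characters per chunk
CHUNK_OVERLAP = 200   # characters shared between consecutive chunks

def chunk_text(text: str) -> list[str]:
    """Split text into fixed-size chunks that overlap by CHUNK_OVERLAP characters."""
    step = CHUNK_SIZE - CHUNK_OVERLAP
    return [text[i:i + CHUNK_SIZE]
            for i in range(0, max(len(text) - CHUNK_OVERLAP, 1), step)]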
Embedding Model
- Current: Qwen/Qwen2.5-0.5B (512-dim embeddings)
- Future: Qwen/Qwen3-Embedding-0.6B (when available)
Database Settings
- Storage: ChromaDB persistent client
- Collection: gprmax_docs_v1 (versioned for updates)
- Distance Metric: cosine similarity
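For reference, a minimal sketch of opening such a collection with ChromaDB's persistent client; the "hnsw:space" metadata key is ChromaDB's documented way to select cosine distance:

import chromadb

# Persistent client keeps the index on disk under the given path.
client = chromadb.PersistentClient(path="./rag-db/chroma_db")

# Versioned collection name; "hnsw:space" selects the cosine distance metric.
collection = client.get_or_create_collection(
    name="gprmax_docs_v1",
    metadata={"hnsw:space": "cosine"},
)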
Maintenance
Regular Updates
Run monthly, or whenever the gprMax documentation is updated:
# This will pull latest docs and update database
python generate_db.py
Database Backup
# Backup database
cp -r chroma_db chroma_db_backup_$(date +%Y%m%d)
Performance Tuning
- Adjust CHUNK_SIZE and CHUNK_OVERLAP in generate_db.py
- Modify batch sizes for large datasets
- Use GPU acceleration with --device cuda
Integration with Main App
The RAG system integrates with the main Gradio app:
- Import the retriever in app.py
- Use the retriever to augment prompts with context
- Display source references in the UI
Example integration:
# In app.py
from rag_db.retriever import create_retriever

retriever = create_retriever()

def augment_with_context(user_query):
    context = retriever.get_context(user_query, k=3)
    augmented_prompt = f"""
Context from documentation:
{context}

User question: {user_query}
"""
    return augmented_prompt
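The returned prompt can then be passed to the app's model call, and retriever.get_relevant_files() (shown earlier) can supply the source references displayed in the UI.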
Troubleshooting
Common Issues
Database not found
- Run python generate_db.py first
- Check the --db-path parameter
Out of memory
- Use smaller batch sizes
- Use CPU instead of GPU
- Reduce chunk size
Slow generation
- Use GPU with --device cuda
- Reduce repository depth with a shallow clone
- Use a pre-generated database
Logs
Check generation logs for detailed information:
python generate_db.py 2>&1 | tee generation.log
Future Enhancements
- Model Upgrade: Migrate to Qwen3-Embedding-0.6B when available
- Incremental Updates: Add documents without full regeneration
- Multi-modal Support: Include images and diagrams from docs
- Query Expansion: Automatic query reformulation for better retrieval
- Caching Layer: Redis cache for frequent queries
- Fine-tuned Embeddings: Domain-specific embedding model for gprMax
License
Same as the parent project.