
ppsingh committed · Commit 08f161e · 1 Parent(s): 7231f12
Files changed (3)
  1. README.md +77 -7
  2. app.py +500 -31
  3. requirements.txt +163 -0
README.md CHANGED
@@ -1,14 +1,84 @@
  ---
- title: Gina
- emoji: 🐢
- colorFrom: gray
  colorTo: gray
  sdk: gradio
- sdk_version: 5.42.0
  app_file: app.py
  pinned: false
- license: apache-2.0
- short_description: A Chatbot to search for information on Circular Economy.
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
  ---
+ title: Gina Assistant
+ emoji: 🧐
+ colorFrom: yellow
  colorTo: gray
  sdk: gradio
+ sdk_version: 4.44.1
  app_file: app.py
+ fullWidth: true
  pinned: false
+ startup_duration_timeout: 1h
+ models:
+ - BAAI/bge-m3
+ - BAAI/bge-reranker-v2-m3
+ - meta-llama/Llama-3.1-8B-Instruct
  ---
+ short_description: AI-powered conversational assistant for Circular Economy
+ license: apache-2.0
+
+
+ ## Technical Documentation of the System in accordance with the EU AI Act
+
+ **System Name:** Gina ChatBot
+
+ **Provider / Supplier:** GIZ Data Service Center
+
+ **As of:** August 2025
+
+ ## 1. General Description of the System
+
+ Gina Bot is an AI-powered conversational assistant designed to help you understand and analyze the Circular Economy. The tool leverages advanced language models to provide clear, structured answers about Circular Economy requirements, compliance procedures, and regulatory guidance.
+
+ It combines a generative language assistant with a knowledge base implemented via Retrieval-Augmented Generation (RAG). The scope and functionality of the tool are focused on Circular Economy compliance and related documentation.
+
+ ## 2. Models Used
+
+ ### Generative LLM
+ - **Model Name:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
+ - **Model Source API:** [Nebius AI](https://studio.nebius.com/)
+
+ ### Retriever/Embedding
+ - **Model Name:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
+ - **Model Source:** Local Instance
+
+ ### Re-ranker
+ - **Model Name:** [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)
+ - **Model Source:** Local Instance
+
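To make the model setup above concrete, the following is a minimal, hypothetical sketch (not the Space's actual `utils/` code) of how the local re-ranker could be loaded with `sentence-transformers` and how the generative model could be called through an OpenAI-compatible endpoint such as the one Nebius AI exposes. The endpoint URL, the `NEBIUS_API_KEY` environment variable, and the prompt format are assumptions for illustration only; the bge-m3 embedder is shown separately in the retrieval sketch under Section 4.

```python
# Hypothetical sketch only -- not the Space's actual utils/ code.
import os

from huggingface_hub import InferenceClient
from sentence_transformers import CrossEncoder

# Local re-ranker: scores (query, passage) pairs for relevance.
reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")

# Hosted generative model via an OpenAI-compatible API.
# The base URL and the NEBIUS_API_KEY variable are assumptions for illustration.
llm = InferenceClient(
    base_url="https://api.studio.nebius.com/v1/",
    api_key=os.environ["NEBIUS_API_KEY"],
)

query = "What is a circular economy?"
passages = [
    "A circular economy keeps products and materials in use for as long as possible.",
    "The linear model follows a take-make-dispose pattern.",
]

# Keep the most relevant passage and let the LLM answer from it.
scores = reranker.predict([(query, p) for p in passages])
best_passage = passages[int(scores.argmax())]

reply = llm.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": f"Context: {best_passage}\n\nQuestion: {query}"}],
    max_tokens=256,
)
print(reply.choices[0].message.content)
```

In the deployed Space, the equivalent generation logic lives in `utils/generator.py` and is invoked from `app.py` as `await generate(query=..., context=...)`.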
+ ## 3. Model Training Data
+
+ All of the models listed above are used as-is: the Gina Bot development team performed no fine-tuning or additional training, and therefore no training data was used by the development team.
+
+ ## 4. Knowledge Base (Retrieval Component)
+
+ - **Data Sources:** Public Circular Economy documentation, regulatory guidance, and compliance materials
+ - **Embedding Model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
+ - **Embedding Dimension:** 1024
+ - **Vector Database:** Qdrant (via API)
+ - **Framework:** LangChain (custom RAG pipeline)
+ - **Top-k:** 5 relevant text segments per query
+
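As an illustration of the retrieval component described above, here is a minimal, hypothetical sketch of a top-5 semantic search against a Qdrant collection using bge-m3 embeddings via `qdrant-client` and `sentence-transformers` (both pinned in `requirements.txt`). The collection name, connection environment variables, and payload field names are assumptions; the Space's actual implementation lives in `utils/retriever.py` and, as `app.py` shows, returns a string that is parsed with `ast.literal_eval`.

```python
# Hypothetical retrieval sketch only -- the real pipeline lives in utils/retriever.py.
import os

from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

# bge-m3 produces 1024-dimensional embeddings, matching the collection described above.
embedder = SentenceTransformer("BAAI/bge-m3")

# Qdrant is reached over its API; the env-var names are assumptions for illustration.
qdrant = QdrantClient(
    url=os.environ["QDRANT_URL"],
    api_key=os.environ.get("QDRANT_API_KEY"),
)

def retrieve_top_paragraphs(query: str, collection: str = "circular_economy", top_k: int = 5):
    """Return the top_k most similar text segments with their metadata."""
    vector = embedder.encode(query, normalize_embeddings=True).tolist()
    hits = qdrant.search(collection_name=collection, query_vector=vector, limit=top_k)
    # Payload fields ("answer", "filename", "page") mirror what app.py expects,
    # but the actual schema is defined by the Space's ingestion pipeline.
    return [
        {
            "answer": hit.payload.get("answer", ""),
            "answer_metadata": {
                "filename": hit.payload.get("filename", ""),
                "page": hit.payload.get("page", 0),
            },
        }
        for hit in hits
    ]

print(retrieve_top_paragraphs("¿Qué es la economía circular?"))
```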
+ ## 5. System Limitations and Non-Purposes
+
+ - The system does not make autonomous decisions.
+ - No personal data is processed except for the usage statistics described in the Disclaimer.
+ - Results are intended for orientation only, not as legal or regulatory compliance advice.
+ - Users should consult official EU documentation and legal experts for definitive compliance guidance.
+
+ ## 6. Transparency Towards Users
+
+ - The user interface clearly indicates the use of a generative AI model.
+ - An explanation of the RAG method is included.
+ - Usage statistics are collected as detailed in the Disclaimer tab of the app and are explicitly disclosed in the user interface.
+ - A feedback mechanism is available (via https://huggingface.co/spaces/GIZ/gina_dev/discussions/new).
+
+ ## 7. Monitoring, Feedback, and Incident Reporting
+
+ - Users can provide feedback via the UI by giving a thumbs-up or thumbs-down to an AI-generated answer. For more detailed feedback, or to report an issue, please use https://huggingface.co/spaces/GIZ/gina_dev/discussions/new.
+ - Technical development is carried out by the GIZ Data Service Center.
+ - There is no automated bias detection, but the risk is low due to content restrictions.
+
+ ## 8. Contact
+
+ For any questions, please contact us via https://huggingface.co/spaces/GIZ/gina_dev/discussions/new or send an email to dataservicecenter@giz.de.
app.py CHANGED
@@ -1,33 +1,502 @@
  import gradio as gr

- # Define the HTML and CSS for the banner
- # We use a markdown component and inject a simple div with inline styling
- banner_content = """
- <div style='
-     background-color: #f0f0f0;
-     padding: 15px;
-     border-radius: 10px;
-     text-align: center;
-     margin: 20px auto;
-     width: fit-content;
-     border: 1px solid #ddd;
-     box-shadow: 0 4px 8px rgba(0,0,0,0.1);
- '>
-     <h2 style='
-         margin: 0;
-         color: #555;
-         font-family: sans-serif;
-     '>
-         🚧 Work in progress 🚧
-     </h2>
- </div>
- """
-
- # Create the Gradio interface using gr.Blocks
- # Blocks gives us more control over the layout
- with gr.Blocks() as demo:
-     # Use gr.Markdown to display the HTML content
-     gr.Markdown(banner_content)
-
- # Launch the demo
- demo.launch()
  import gradio as gr
+ import time
+ import pandas as pd
+ import asyncio
+ from uuid import uuid4
+ from gradio_client import Client, handle_file
+ from utils.retriever import retrieve_paragraphs
+ from utils.generator import generate
+ from utils.logger import ChatLogger
+ from huggingface_hub import CommitScheduler
+ import json
+ import ast
+ import os
+ from pathlib import Path

+ # Set up dataset directory and HuggingFace integration
+ # JSON_DATASET_DIR = Path("json_dataset")
+
+ # JSON_DATASET_DIR.mkdir(parents=True, exist_ok=True)
+ # JSON_DATASET_PATH = JSON_DATASET_DIR / f"logs-{uuid4()}.jsonl"
+
+
+ # Set up dataset directory and HuggingFace integration
+ JSON_DATASET_DIR = Path("json_dataset")
+
+ # Check if directory exists and create if needed
+ if not JSON_DATASET_DIR.exists():
+     try:
+         JSON_DATASET_DIR.mkdir(parents=True, exist_ok=True)
+         print(f"Created dataset directory at {JSON_DATASET_DIR}")
+     except Exception as e:
+         print(f"Error creating dataset directory: {str(e)}")
+         raise
+ else:
+     print(f"Using existing dataset directory at {JSON_DATASET_DIR}")
+
+ # Get HuggingFace token from environment
+ SPACES_LOG = os.environ.get("GINA_SPACES_LOG")
+ if not SPACES_LOG:
+     print("Warning: GINA_SPACES_LOG not found in environment, using local storage only")
+
+ # Initialize scheduler with proper dataset configuration
+ scheduler = CommitScheduler(
+     repo_id="GIZ/spaces_logs",
+     repo_type="dataset",
+     folder_path=JSON_DATASET_DIR,
+     path_in_repo="gina_chatbot",
+     token=SPACES_LOG if SPACES_LOG else None,
+     every=60  # Sync every 60 seconds
+ )
+
+ # Initialize logger with configured scheduler
+ chat_logger = ChatLogger(scheduler=scheduler)
+
+ # Sample questions for examples
+ SAMPLE_QUESTIONS = {
+     "Fundamentos y tendencias internacionales de EC": [
+         "¿Cómo se diferencia el modelo de economía circular del modelo lineal tradicional de 'tomar, hacer, desechar'?",
+         "¿Cuáles son algunos de los principios clave de la economía circular y cómo se aplican en la práctica?",
+         "¿Podrías dar ejemplos de industrias o empresas que estén implementando con éxito prácticas de economía circular?"
+     ],
+     "EC en Colombia": [
+         "¿Qué políticas y normativas vigentes en Colombia impulsan la adopción de la economía circular?",
+         "¿Cómo pueden las regulaciones colombianas incentivar la innovación en el ecodiseño y la gestión de residuos?",
+         "¿Qué papel tienen los instrumentos económicos y fiscales en la promoción de la circularidad en el sector productivo de Colombia?"
+     ]
+ }
+
+ # Global variable to cache API results and prevent double calls
+ geojson_analysis_cache = {}
+
+ # Initialize Chat
+ def start_chat(query, history):
+     """Start a new chat interaction"""
+     history = history + [(query, None)]
+     return gr.update(interactive=False), gr.update(selected=1), history
+
+ def finish_chat():
+     """Finish chat and reset input"""
+     return gr.update(interactive=True, value="")
+
+ def make_html_source(source, i):
+     """
+     Takes the retrieved text and converts it into HTML for display in the "source" side tab.
+     """
+     meta = source['answer_metadata']
+     content = source['answer'].strip()
+
+     name = meta['filename']
+     card = f"""
+     <div class="card" id="doc{i}">
+         <div class="card-content">
+             <h2>Doc {i} - {meta['filename']} - Page {int(meta['page'])}</h2>
+             <p>{content}</p>
+         </div>
+         <div class="card-footer">
+             <span>{name}</span>
+             <a href="{meta['filename']}#page={int(meta['page'])}" target="_blank" class="pdf-link">
+                 <span role="img" aria-label="Open PDF">🔗</span>
+             </a>
+         </div>
+     </div>
+     """
+
+     return card
+
+ async def chat_response(query, history, category, request=None):
+     """Generate chat response based on method and inputs"""
+
+     try:
+         retrieved_paragraphs = retrieve_paragraphs(query, category)
+         context_retrieved = ast.literal_eval(retrieved_paragraphs)
+
+         # Build list of only content, no metadata
+         context_retrieved_formatted = "||".join(doc['answer'] for doc in context_retrieved)
+         context_retrieved_lst = [doc['answer'] for doc in context_retrieved]
+
+         # Prepare HTML for displaying source documents
+         docs_html = []
+         for i, d in enumerate(context_retrieved, 1):
+             docs_html.append(make_html_source(d, i))
+         docs_html = "".join(docs_html)
+
+         # Generate response
+         response = await generate(query=query, context=context_retrieved_lst)
+
+         # Log the interaction
+         try:
+             chat_logger.log(
+                 query=query,
+                 answer=response,
+                 retrieved_content=context_retrieved_lst,
+                 request=request
+             )
+         except Exception as e:
+             print(f"Logging error: {str(e)}")
+
+         # Stream response character by character
+         displayed_response = ""
+         for i, char in enumerate(response):
+             displayed_response += char
+             history[-1] = (query, displayed_response)
+             yield history, docs_html
+             # Only add delay every few characters to avoid being too slow
+             if i % 3 == 0:
+                 await asyncio.sleep(0.02)
+
+     except Exception as e:
+         error_message = f"Error processing request: {str(e)}"
+         history[-1] = (query, error_message)
+         yield history, ""
+
+
+ # # Stream response word by word into the chat
+ # words = response.split()
+
+ # for i in range(len(words)):
+ #     history[-1] = (query, " ".join(words[:i+1]))
+ #     yield history, "**Sources:** Sample source documents would appear here..."
+ #     await asyncio.sleep(0.05)
+
+ # def auto_analyze_file(file, history):
+ #     """Automatically analyze uploaded GeoJSON file and add results to chat"""
+ #     if file is not None:
+ #         try:
+ #             # Call API immediately and cache results
+ #             file_key = f"{file.name}_{file.size if hasattr(file, 'size') else 'unknown'}"
+
+ #             if file_key not in geojson_analysis_cache:
+ #                 formatted_stats = "This is to be removed"
+ #                 geojson_analysis_cache[file_key] = formatted_stats
+
+ #             # Add analysis results directly to chat (no intermediate message)
+ #             analysis_query = "📄 Análisis del GeoJSON cargado"
+ #             cached_result = geojson_analysis_cache[file_key]
+
+ #             # Add both query and response to history
+ #             history = history + [(analysis_query, cached_result)]
+ #             return history, "**Sources:** WhispAPI Analysis Results"
+
+ #         except Exception as e:
+ #             error_msg = f"❌ Error processing GeoJSON file: {str(e)}"
+ #             history = history + [("📄 Error en análisis GeoJSON", error_msg)]
+ #             return history, ""
+
+ #     return history, ""
+
+ def toggle_search_method(method):
+     """Toggle between GeoJSON upload and country selection"""
+     # if method == "Subir GeoJson":
+     #     return (
+     #         gr.update(visible=True),   # geojson_section
+     #         gr.update(visible=False),  # reports_section
+     #         gr.update(value=None),     # dropdown_country
+     #     )
+     # else:  # "Talk to Reports"
+     return (
+         # gr.update(visible=False),  # geojson_section
+         gr.update(visible=True),  # reports_section
+         gr.update(),              # dropdown_country
+     )
+
+ def change_sample_questions(key):
+     """Update visible examples based on selected category"""
+     keys = list(SAMPLE_QUESTIONS.keys())
+     index = keys.index(key)
+     visible_bools = [False] * len(keys)
+     visible_bools[index] = True
+     return [gr.update(visible=visible_bools[i]) for i in range(len(keys))]
+
+ # Set up Gradio Theme
+ theme = gr.themes.Base(
+     primary_hue="green",
+     secondary_hue="blue",
+     font=[gr.themes.GoogleFont("Poppins"), "ui-sans-serif", "system-ui", "sans-serif"],
+     text_size=gr.themes.utils.sizes.text_sm,
+ )
+
+
+ init_prompt = """
+ Hola, soy Gina, una asistente conversacional con IA diseñada para ayudarte a comprender conceptos y ayudarte con el tema de la Economía Circular. Responderé a tus preguntas usando la base de datos de documentos sobre economía circular.
+ 💡 **Cómo usarla (pestañas a la derecha)**
+
+ **Enfoque:** Selecciona la sección de informes/documentos.
+ **Ejemplos:** Selecciona entre ejemplos de preguntas de diferentes categorías.
+ **Fuentes:** Consulta las fuentes de contenido utilizadas para generar las respuestas para la verificación de datos.
+ ⚠️ Para conocer las limitaciones e información sobre la recopilación de datos, consulta la pestaña "Aviso legal"
+ """
+
+ with gr.Blocks(title="Gina Bot", theme=theme, css="style.css") as demo:
+
+     # Main Chat Interface
+     with gr.Tab("Gina Bot"):
+         with gr.Row():
+             # Left column - Chat interface (2/3 width)
+             with gr.Column(scale=2):
+                 chatbot = gr.Chatbot(
+                     value=[(None, init_prompt)],
+                     show_copy_button=True,
+                     show_label=False,
+                     layout="panel",
+                     avatar_images=(None, "chatbot_icon_2.png"),
+                     height="auto"
+                 )
+
+                 # Feedback UI
+                 with gr.Column():
+                     with gr.Row(visible=False) as feedback_row:
+                         gr.Markdown("¿Te ha sido útil esta respuesta?")
+                         with gr.Row():
+                             okay_btn = gr.Button("👍 De acuerdo", size="sm")
+                             not_okay_btn = gr.Button("👎 No según lo esperado", size="sm")
+                     feedback_thanks = gr.Markdown("Gracias por los comentarios.", visible=False)
+
+                 # Input textbox
+                 with gr.Row():
+                     textbox = gr.Textbox(
+                         placeholder="Pregúntame cualquier cosa sobre Economía Circular",
+                         show_label=False,
+                         scale=7,
+                         lines=1,
+                         interactive=True
+                     )
+
+             # Right column - Controls and tabs (1/3 width)
+             with gr.Column(scale=1, variant="panel"):
+                 with gr.Tabs() as tabs:
+
+                     # Data Sources Tab
+                     with gr.Tab("Fuentes de datos", id=2):
+                         with gr.Group(visible=True) as reports_section:
+                             dropdown_category = gr.Dropdown(
+                                 ["Fundamentos y tendencias internacionales de EC", "Financiamiento en EC", "EC en Colombia"],
+                                 # label="Selecciona país",
+                                 label="Especifica tu área de interés",
+                                 multiselect=True,
+                                 value=["Fundamentos y tendencias internacionales de EC", "Financiamiento en EC", "EC en Colombia"],
+                                 interactive=True,
+                             )
+
+                         # # GeoJSON Upload Section
+                         # with gr.Group(visible=True) as geojson_section:
+                         #     uploaded_file = gr.File(
+                         #         label="Subir GeoJson",
+                         #         file_types=[".geojson", ".json"],
+                         #         file_count="single"
+                         #     )
+                         #     upload_status = gr.Markdown("", visible=False)
+
+                         #     # Results table for WHISP API response
+                         #     results_table = gr.DataFrame(
+                         #         label="Resultados del análisis",
+                         #         visible=False,
+                         #         interactive=False,
+                         #         wrap=True,
+                         #         elem_classes="dataframe"
+                         #     )
+
+                     # Talk to Reports Section
+
+
+                     # Examples Tab
+                     with gr.Tab("Ejemplos", id=0):
+                         examples_hidden = gr.Textbox(visible=False)
+
+                         first_key = list(SAMPLE_QUESTIONS.keys())[0]
+                         dropdown_samples = gr.Dropdown(
+                             SAMPLE_QUESTIONS.keys(),
+                             value=first_key,
+                             interactive=True,
+                             show_label=True,
+                             label="Seleccione una categoría de preguntas de muestra."
+                         )
+
+                         # Create example sections
+                         sample_groups = []
+                         for i, (key, questions) in enumerate(SAMPLE_QUESTIONS.items()):
+                             examples_visible = True if i == 0 else False
+                             with gr.Row(visible=examples_visible) as group_examples:
+                                 gr.Examples(
+                                     questions,
+                                     [examples_hidden],
+                                     examples_per_page=8,
+                                     run_on_click=False,
+                                 )
+                             sample_groups.append(group_examples)
+
+                     # Sources Tab
+                     with gr.Tab("Fuentes", id=1, elem_id="sources-textbox"):
+                         sources_textbox = gr.HTML(
+                             show_label=False,
+                             value="Los documentos originales aparecerán aquí después de que hagas una pregunta..."
+                         )
+
+     # Guidelines Tab
+     with gr.Tab("Orientación"):
+         gr.Markdown("""
+ #### Welcome to Gina Q&A!
+
+ This AI-powered assistant helps you understand the Circular Economy.
+
+ ## 💬 How to Ask Effective Questions
+
+ | ❌ Less Effective | ✅ More Effective |
+ |------------------|-------------------|
+ | "What is economy?" | "What is the impact of the circular economy on businesses?" |
+ | "Tell me about compliance" | "What are the country guidelines on the circular economy?" |
+ | "Show me data" | "What is the trend in waste, and how is the circular economy helping to address it?" |
+
+ ## 🔍 Using Data Sources
+
+ **Talk to Reports:** Select the report sections "Trend and fundamentals", "Financing Mechanisms", or "Country Resource".
+
+ ## ⭐ Best Practices
+
+ - Be specific about regions, commodities, or time periods
+ - Ask one question at a time for clearer answers
+ - Use follow-up questions to explore topics deeper
+ - Provide context when possible
+ """)
+
+     # About Tab
+     with gr.Tab("Sobre Gina"):
+         gr.Markdown("""
+ ## About Gina Q&A
+
+ The **Circular Economy** places obligations on manufacturers and businesses.
+
+ This AI-powered tool helps stakeholders:
+ - Understand Circular Economy concepts and regulations
+ - Assess supply chain issues
+ - Navigate complex regulatory landscapes
+
+ **Developed by GIZ** for a project in Colombia to enhance accessibility and understanding of Circular Economy requirements
+ through advanced AI and geographic data processing capabilities.
+
+ ### Key Features:
+ - Country-specific compliance guidance
+ - Real-time question answering with source citations
+ - User-friendly interface for complex regulatory information
+ """)
+
+     # Disclaimer Tab
+     with gr.Tab("Disclaimer"):
+         gr.Markdown("""
+ ## Important Disclaimers
+
+ ⚠️ **Scope & Limitations:**
+ - This tool is designed for Circular Economy assistance and geographic data analysis
+ - Responses should not be considered official legal or compliance advice
+ - Always consult qualified professionals for official compliance decisions
+
+ ⚠️ **Data & Privacy:**
+ - We collect usage statistics to improve the tool
+ - Files are processed temporarily and not permanently stored
+
+ ⚠️ **AI Limitations:**
+ - Responses are AI-generated and may contain inaccuracies
+ - The tool is a prototype under continuous development
+ - Always verify important information with authoritative sources
+
+ **Data Collection:** We collect questions, answers, feedback, and anonymized usage statistics
+ to improve tool performance, based on a legitimate interest in service enhancement.
+
+ By using this tool, you acknowledge these limitations and agree to use responses responsibly.
+ """)
+
+     # Event Handlers
+
+     # Toggle search method
+     # search_method.change(
+     #     fn=toggle_search_method,
+     #     inputs=[search_method],
+     #     outputs=[reports_section, dropdown_category]
+     # )
+
+     # File upload - automatically analyze and display in chat (SIMPLIFIED)
+     # uploaded_file.change(
+     #     fn=auto_analyze_file,
+     #     inputs=[uploaded_file, chatbot],
+     #     outputs=[chatbot, sources_textbox],
+     #     queue=False
+     # )
+
+     # Chat functionality
+     textbox.submit(
+         start_chat,
+         [textbox, chatbot],
+         [textbox, tabs, chatbot],
+         queue=False
+     ).then(
+         chat_response,
+         [textbox, chatbot, dropdown_category],
+         [chatbot, sources_textbox]
+     ).then(
+         lambda: gr.update(visible=True),
+         outputs=[feedback_row]
+     ).then(
+         finish_chat,
+         outputs=[textbox]
+     )
+
+     # Examples functionality
+     examples_hidden.change(
+         start_chat,
+         [examples_hidden, chatbot],
+         [textbox, tabs, chatbot],
+         queue=False
+     ).then(
+         chat_response,
+         [examples_hidden, chatbot, dropdown_category],
+         [chatbot, sources_textbox]
+     ).then(
+         lambda: gr.update(visible=True),
+         outputs=[feedback_row]
+     ).then(
+         finish_chat,
+         outputs=[textbox]
+     )
+
+     # Sample questions dropdown
+     dropdown_samples.change(
+         change_sample_questions,
+         [dropdown_samples],
+         sample_groups
+     )
+
+     # Feedback buttons
+     # Feedback handlers with logging
+     def handle_feedback(feedback):
+         try:
+             # Get the last interaction from history
+             if chatbot.value:
+                 last_query = chatbot.value[-1][0]
+                 last_response = chatbot.value[-1][1]
+
+                 # Log the feedback
+                 chat_logger.log(
+                     query=last_query,
+                     answer=last_response,
+                     retrieved_content=[],  # Empty since this is feedback
+                     feedback=feedback
+                 )
+         except Exception as e:
+             print(f"Feedback logging error: {str(e)}")
+         return gr.update(visible=False), gr.update(visible=True)
+
+     okay_btn.click(
+         lambda: handle_feedback("positive"),
+         outputs=[feedback_row, feedback_thanks]
+     )
+
+     not_okay_btn.click(
+         lambda: handle_feedback("negative"),
+         outputs=[feedback_row, feedback_thanks]
+     )
+
+ # Launch the app
+ if __name__ == "__main__":
+     demo.launch()
requirements.txt ADDED
@@ -0,0 +1,163 @@
+ #git+https://github.com/huggingface/huggingface_hub.git@main
+ aiofiles==23.2.1
+ aiohappyeyeballs==2.3.5
+ aiohttp==3.10.3
+ aiosignal==1.3.1
+ annotated-types==0.7.0
+ anyio==4.4.0
+ asttokens==2.4.1
+ async-timeout==4.0.3
+ attrs==24.2.0
+ authlib==1.3.1
+ certifi==2024.7.4
+ cffi==1.17.0
+ charset-normalizer==3.3.2
+ click==8.0.4
+ comm==0.2.2
+ contourpy==1.2.1
+ cryptography==43.0.0
+ cycler==0.12.1
+ dataclasses-json==0.6.7
+ datasets==2.20.0
+ debugpy==1.8.5
+ decorator==5.1.1
+ dill==0.3.8
+ exceptiongroup==1.2.2
+ executing==2.0.1
+ fastapi==0.112.0
+ ffmpy==0.4.0
+ filelock==3.15.4
+ fonttools==4.53.1
+ frozenlist==1.4.1
+ fsspec==2024.5.0
+ gradio-client==1.3.0
+ gradio==4.44.1
+ greenlet==3.0.3
+ grpcio-tools==1.65.4
+ grpcio==1.65.4
+ h11==0.14.0
+ h2==4.1.0
+ hf-transfer==0.1.8
+ hpack==4.0.0
+ httpcore==1.0.5
+ httpx==0.27.0
+ huggingface-hub==0.34.0
+ hyperframe==6.0.1
+ idna==3.7
+ importlib-resources==6.4.0
+ ipykernel==6.29.5
+ ipython==8.26.0
+ itsdangerous==2.2.0
+ jedi==0.19.1
+ jinja2==3.1.4
+ joblib==1.4.2
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jupyter-client==8.6.2
+ jupyter-core==5.7.2
+ kiwisolver==1.4.5
+ langchain==0.3.26
+ langchain-community==0.3.27
+ langchain-core==0.3.70
+ langchain-huggingface==0.3.0
+ langchain-text-splitters==0.3.8
+ langchain-together==0.3.0
+ langsmith==0.4.8
+ markdown-it-py==3.0.0
+ markupsafe==2.1.5
+ marshmallow==3.21.3
+ matplotlib-inline==0.1.7
+ matplotlib==3.9.2
+ mdurl==0.1.2
+ mpmath==1.3.0
+ multidict==6.0.5
+ multiprocess==0.70.16
+ mypy-extensions==1.0.0
+ nest-asyncio==1.6.0
+ networkx==3.3
+ numpy==1.26.4
+ nvidia-cublas-cu12==12.1.3.1
+ nvidia-cuda-cupti-cu12==12.1.105
+ nvidia-cuda-nvrtc-cu12==12.1.105
+ nvidia-cuda-runtime-cu12==12.1.105
+ nvidia-cudnn-cu12==9.1.0.70
+ nvidia-cufft-cu12==11.0.2.54
+ nvidia-curand-cu12==10.3.2.106
+ nvidia-cusolver-cu12==11.4.5.107
+ nvidia-cusparse-cu12==12.1.0.106
+ nvidia-nccl-cu12==2.20.5
+ nvidia-nvjitlink-cu12==12.6.20
+ nvidia-nvtx-cu12==12.1.105
+ orjson==3.10.7
+ openpyxl
+ packaging==23.2
+ pandas==2.2.2
+ parso==0.8.4
+ pexpect==4.9.0
+ pillow==10.4.0
+ pip==22.3.1
+ platformdirs==4.2.2
+ portalocker==2.10.1
+ prompt-toolkit==3.0.47
+ protobuf==5.27.3
+ psutil==5.9.8
+ ptyprocess==0.7.0
+ pure-eval==0.2.3
+ pyarrow-hotfix==0.6
+ pyarrow==17.0.0
+ pycparser==2.22
+ pydantic-core==2.20.1
+ pydantic==2.8.2
+ pydub==0.25.1
+ pygments==2.18.0
+ pymupdf==1.23.26
+ pymupdfb==1.23.22
+ pyparsing==3.1.2
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.1
+ python-multipart==0.0.9
+ pyyaml==6.0.2
+ pyzmq==26.2.0
+ pytz==2024.1
+ #returns>=0.26.0
+ qdrant-client==1.10.1
+ regex==2024.7.24
+ requests==2.32.3
+ rich==13.7.1
+ ruff==0.5.7
+ safetensors==0.4.4
+ scikit-learn==1.5.1
+ scipy==1.14.0
+ semantic-version==2.10.0
+ sentence-transformers==3.0.1
+ sentencepiece==0.2.0
+ setuptools==65.5.0
+ shellingham==1.5.4
+ six==1.16.0
+ sniffio==1.3.1
+ spaces==0.29.3
+ sqlalchemy==2.0.32
+ stack-data==0.6.3
+ starlette==0.37.2
+ sympy==1.13.2
+ tenacity==8.5.0
+ threadpoolctl==3.5.0
+ tokenizers==0.19.1
+ tomlkit==0.12.0
+ torch==2.4.0
+ tornado==6.4.1
+ tqdm==4.66.5
+ traitlets==5.14.3
+ transformers==4.44.0
+ triton==3.0.0
+ typer==0.12.3
+ typing-extensions==4.12.2
+ typing-inspect==0.9.0
+ tzdata==2024.1
+ urllib3==2.2.2
+ uvicorn==0.30.6
+ wcwidth==0.2.13
+ websockets==11.0.3
+ wheel==0.44.0
+ xxhash==3.4.1
+ yarl==1.9.4