
ppsingh committed · Commit 08f161e · 1 Parent(s): 7231f12
Files changed (3)
  1. README.md +77 -7
  2. app.py +500 -31
  3. requirements.txt +163 -0
README.md CHANGED
@@ -1,14 +1,84 @@
  ---
- title: Gina
- emoji: 🐢
- colorFrom: gray
  colorTo: gray
  sdk: gradio
- sdk_version: 5.42.0
  app_file: app.py
  pinned: false
- license: apache-2.0
- short_description: A Chatbot to search for information on Circular Economy.
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
  ---
+ title: Gina Assistant
+ emoji: 🧐
+ colorFrom: yellow
  colorTo: gray
  sdk: gradio
+ sdk_version: 4.44.1
  app_file: app.py
+ fullWidth: true
  pinned: false
+ startup_duration_timeout: 1h
+ models:
+ - BAAI/bge-m3
+ - BAAI/bge-reranker-v2-m3
+ - meta-llama/Llama-3.1-8B-Instruct
  ---
+ short_description: AI-powered conversational assistant for Circular Economy
+ license: apache-2.0
+
+
+ ## Technical Documentation of the System in accordance with the EU AI Act
+
+ **System Name:** Gina ChatBot
+
+ **Provider / Supplier:** GIZ Data Service Center
+
+ **As of:** August 2025
+
+ ## 1. General Description of the System
+
+ Gina Bot is an AI-powered conversational assistant designed to help you understand and analyze the Circular Economy. The tool leverages advanced language models to provide clear, structured answers about Circular Economy requirements, compliance procedures, and regulatory guidance.
+
+ It combines a generative language assistant with a knowledge base implemented via Retrieval-Augmented Generation (RAG). The scope and functionality of the tool are focused on Circular Economy compliance and related documentation.
+
+ ## 2. Models Used
+
+ ### Generative LLM
+ - **Model Name:** [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct)
+ - **Model Source API:** [Nebius AI](https://studio.nebius.com/)
+
+ ### Retriever/Embedding
+ - **Model Name:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
+ - **Model Source:** Local Instance
+
+ ### Re-ranker
+ - **Model Name:** [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)
+ - **Model Source:** Local Instance
+
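To make the model setup above concrete, the following is a minimal, hypothetical sketch (not the Space's actual `utils/` code) of how the local re-ranker could be loaded with `sentence-transformers` and how the generative model could be called through an OpenAI-compatible endpoint such as the one Nebius AI exposes. The endpoint URL, the `NEBIUS_API_KEY` environment variable, and the prompt format are assumptions for illustration only; the bge-m3 embedder is shown separately in the retrieval sketch under Section 4.

```python
# Hypothetical sketch only -- not the Space's actual utils/ code.
import os

from huggingface_hub import InferenceClient
from sentence_transformers import CrossEncoder

# Local re-ranker: scores (query, passage) pairs for relevance.
reranker = CrossEncoder("BAAI/bge-reranker-v2-m3")

# Hosted generative model via an OpenAI-compatible API.
# The base URL and the NEBIUS_API_KEY variable are assumptions for illustration.
llm = InferenceClient(
    base_url="https://api.studio.nebius.com/v1/",
    api_key=os.environ["NEBIUS_API_KEY"],
)

query = "What is a circular economy?"
passages = [
    "A circular economy keeps products and materials in use for as long as possible.",
    "The linear model follows a take-make-dispose pattern.",
]

# Keep the most relevant passage and let the LLM answer from it.
scores = reranker.predict([(query, p) for p in passages])
best_passage = passages[int(scores.argmax())]

reply = llm.chat_completion(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": f"Context: {best_passage}\n\nQuestion: {query}"}],
    max_tokens=256,
)
print(reply.choices[0].message.content)
```

In the deployed Space, the equivalent generation logic lives in `utils/generator.py` and is invoked from `app.py` as `await generate(query=..., context=...)`.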
+ ## 3. Model Training Data
+
+ All of the models listed above are used as-is: the Gina Bot development team performed no fine-tuning or additional training, and therefore no training data was used by the development team.
+
+ ## 4. Knowledge Base (Retrieval Component)
+
+ - **Data Sources:** Public Circular Economy documentation, regulatory guidance, and compliance materials
+ - **Embedding Model:** [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3)
+ - **Embedding Dimension:** 1024
+ - **Vector Database:** Qdrant (via API)
+ - **Framework:** LangChain (custom RAG pipeline)
+ - **Top-k:** 5 relevant text segments per query
+
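As an illustration of the retrieval component described above, here is a minimal, hypothetical sketch of a top-5 semantic search against a Qdrant collection using bge-m3 embeddings via `qdrant-client` and `sentence-transformers` (both pinned in `requirements.txt`). The collection name, connection environment variables, and payload field names are assumptions; the Space's actual implementation lives in `utils/retriever.py` and, as `app.py` shows, returns a string that is parsed with `ast.literal_eval`.

```python
# Hypothetical retrieval sketch only -- the real pipeline lives in utils/retriever.py.
import os

from qdrant_client import QdrantClient
from sentence_transformers import SentenceTransformer

# bge-m3 produces 1024-dimensional embeddings, matching the collection described above.
embedder = SentenceTransformer("BAAI/bge-m3")

# Qdrant is reached over its API; the env-var names are assumptions for illustration.
qdrant = QdrantClient(
    url=os.environ["QDRANT_URL"],
    api_key=os.environ.get("QDRANT_API_KEY"),
)

def retrieve_top_paragraphs(query: str, collection: str = "circular_economy", top_k: int = 5):
    """Return the top_k most similar text segments with their metadata."""
    vector = embedder.encode(query, normalize_embeddings=True).tolist()
    hits = qdrant.search(collection_name=collection, query_vector=vector, limit=top_k)
    # Payload fields ("answer", "filename", "page") mirror what app.py expects,
    # but the actual schema is defined by the Space's ingestion pipeline.
    return [
        {
            "answer": hit.payload.get("answer", ""),
            "answer_metadata": {
                "filename": hit.payload.get("filename", ""),
                "page": hit.payload.get("page", 0),
            },
        }
        for hit in hits
    ]

print(retrieve_top_paragraphs("¿Qué es la economía circular?"))
```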
+ ## 5. System Limitations and Non-Purposes
+
+ - The system does not make autonomous decisions.
+ - No personal data is processed except for the usage statistics described in the Disclaimer.
+ - Results are intended for orientation only, not as legal or regulatory compliance advice.
+ - Users should consult official EU documentation and legal experts for definitive compliance guidance.
+
+ ## 6. Transparency Towards Users
+
+ - The user interface clearly indicates the use of a generative AI model.
+ - An explanation of the RAG method is included.
+ - Usage statistics are collected as detailed in the Disclaimer tab of the app and are explicitly disclosed in the user interface.
+ - A feedback mechanism is available (via https://huggingface.co/spaces/GIZ/gina_dev/discussions/new).
+
+ ## 7. Monitoring, Feedback, and Incident Reporting
+
+ - Users can provide feedback via the UI by giving a thumbs-up or thumbs-down to an AI-generated answer. For more detailed feedback, or to report an issue, please use https://huggingface.co/spaces/GIZ/gina_dev/discussions/new.
+ - Technical development is carried out by the GIZ Data Service Center.
+ - There is no automated bias detection, but the risk is low due to content restrictions.
+
+ ## 8. Contact
+
+ For any questions, please contact us via https://huggingface.co/spaces/GIZ/gina_dev/discussions/new or send an email to dataservicecenter@giz.de.
app.py CHANGED
@@ -1,33 +1,502 @@
  import gradio as gr

- # Define the HTML and CSS for the banner
- # We use a markdown component and inject a simple div with inline styling
- banner_content = """
- <div style='
-     background-color: #f0f0f0;
-     padding: 15px;
-     border-radius: 10px;
-     text-align: center;
-     margin: 20px auto;
-     width: fit-content;
-     border: 1px solid #ddd;
-     box-shadow: 0 4px 8px rgba(0,0,0,0.1);
- '>
-     <h2 style='
-         margin: 0;
-         color: #555;
-         font-family: sans-serif;
-     '>
-         🚧 Work in progress 🚧
-     </h2>
- </div>
- """
-
- # Create the Gradio interface using gr.Blocks
- # Blocks gives us more control over the layout
- with gr.Blocks() as demo:
-     # Use gr.Markdown to display the HTML content
-     gr.Markdown(banner_content)
-
- # Launch the demo
- demo.launch()
  import gradio as gr
+ import time
+ import pandas as pd
+ import asyncio
+ from uuid import uuid4
+ from gradio_client import Client, handle_file
+ from utils.retriever import retrieve_paragraphs
+ from utils.generator import generate
+ from utils.logger import ChatLogger
+ from huggingface_hub import CommitScheduler
+ import json
+ import ast
+ import os
+ from pathlib import Path

+ # Set up dataset directory and HuggingFace integration
+ # JSON_DATASET_DIR = Path("json_dataset")
+
+ # JSON_DATASET_DIR.mkdir(parents=True, exist_ok=True)
+ # JSON_DATASET_PATH = JSON_DATASET_DIR / f"logs-{uuid4()}.jsonl"
+
+
+ # Set up dataset directory and HuggingFace integration
+ JSON_DATASET_DIR = Path("json_dataset")
+
+ # Check if directory exists and create if needed
+ if not JSON_DATASET_DIR.exists():
+     try:
+         JSON_DATASET_DIR.mkdir(parents=True, exist_ok=True)
+         print(f"Created dataset directory at {JSON_DATASET_DIR}")
+     except Exception as e:
+         print(f"Error creating dataset directory: {str(e)}")
+         raise
+ else:
+     print(f"Using existing dataset directory at {JSON_DATASET_DIR}")
+
+ # Get HuggingFace token from environment
+ SPACES_LOG = os.environ.get("GINA_SPACES_LOG")
+ if not SPACES_LOG:
+     print("Warning: GINA_SPACES_LOG not found in environment, using local storage only")
+
+ # Initialize scheduler with proper dataset configuration
+ scheduler = CommitScheduler(
+     repo_id="GIZ/spaces_logs",
+     repo_type="dataset",
+     folder_path=JSON_DATASET_DIR,
+     path_in_repo="gina_chatbot",
+     token=SPACES_LOG if SPACES_LOG else None,
+     every=60  # Sync every 60 seconds
+ )
+
+ # Initialize logger with configured scheduler
+ chat_logger = ChatLogger(scheduler=scheduler)
+
+ # Sample questions for examples
+ SAMPLE_QUESTIONS = {
+     "Fundamentos y tendencias internacionales de EC": [
+         "¿Cómo se diferencia el modelo de economía circular del modelo lineal tradicional de 'tomar, hacer, desechar'?",
+         "¿Cuáles son algunos de los principios clave de la economía circular y cómo se aplican en la práctica?",
+         "¿Podrías dar ejemplos de industrias o empresas que estén implementando con éxito prácticas de economía circular?"
+     ],
+     "EC en Colombia": [
+         "¿Qué políticas y normativas vigentes en Colombia impulsan la adopción de la economía circular?",
+         "¿Cómo pueden las regulaciones colombianas incentivar la innovación en el ecodiseño y la gestión de residuos?",
+         "¿Qué papel tienen los instrumentos económicos y fiscales en la promoción de la circularidad en el sector productivo de Colombia?"
+     ]
+ }
+
+ # Global variable to cache API results and prevent double calls
+ geojson_analysis_cache = {}
+
+ # Initialize Chat
+ def start_chat(query, history):
+     """Start a new chat interaction"""
+     history = history + [(query, None)]
+     return gr.update(interactive=False), gr.update(selected=1), history
+
+ def finish_chat():
+     """Finish chat and reset input"""
+     return gr.update(interactive=True, value="")
+
+ def make_html_source(source, i):
+     """
+     Takes the retrieved text and converts it into HTML for display in the "source" side tab.
+     """
+     meta = source['answer_metadata']
+     content = source['answer'].strip()
+
+     name = meta['filename']
+     card = f"""
+     <div class="card" id="doc{i}">
+         <div class="card-content">
+             <h2>Doc {i} - {meta['filename']} - Page {int(meta['page'])}</h2>
+             <p>{content}</p>
+         </div>
+         <div class="card-footer">
+             <span>{name}</span>
+             <a href="{meta['filename']}#page={int(meta['page'])}" target="_blank" class="pdf-link">
+                 <span role="img" aria-label="Open PDF">🔗</span>
+             </a>
+         </div>
+     </div>
+     """
+
+     return card
+
+ async def chat_response(query, history, category, request=None):
+     """Generate chat response based on method and inputs"""
+
+     try:
+         retrieved_paragraphs = retrieve_paragraphs(query, category)
+         context_retrieved = ast.literal_eval(retrieved_paragraphs)
+
+         # Build list of only content, no metadata
+         context_retrieved_formatted = "||".join(doc['answer'] for doc in context_retrieved)
+         context_retrieved_lst = [doc['answer'] for doc in context_retrieved]
+
+         # Prepare HTML for displaying source documents
+         docs_html = []
+         for i, d in enumerate(context_retrieved, 1):
+             docs_html.append(make_html_source(d, i))
+         docs_html = "".join(docs_html)
+
+         # Generate response
+         response = await generate(query=query, context=context_retrieved_lst)
+
+         # Log the interaction
+         try:
+             chat_logger.log(
+                 query=query,
+                 answer=response,
+                 retrieved_content=context_retrieved_lst,
+                 request=request
+             )
+         except Exception as e:
+             print(f"Logging error: {str(e)}")
+
+         # Stream response character by character
+         displayed_response = ""
+         for i, char in enumerate(response):
+             displayed_response += char
+             history[-1] = (query, displayed_response)
+             yield history, docs_html
+             # Only add delay every few characters to avoid being too slow
+             if i % 3 == 0:
+                 await asyncio.sleep(0.02)
+
+     except Exception as e:
+         error_message = f"Error processing request: {str(e)}"
+         history[-1] = (query, error_message)
+         yield history, ""
+
+
+ # # Stream response word by word into the chat
+ # words = response.split()
+
+ # for i in range(len(words)):
+ #     history[-1] = (query, " ".join(words[:i+1]))
+ #     yield history, "**Sources:** Sample source documents would appear here..."
+ #     await asyncio.sleep(0.05)
+
+ # def auto_analyze_file(file, history):
+ #     """Automatically analyze uploaded GeoJSON file and add results to chat"""
+ #     if file is not None:
+ #         try:
+ #             # Call API immediately and cache results
+ #             file_key = f"{file.name}_{file.size if hasattr(file, 'size') else 'unknown'}"
+
+ #             if file_key not in geojson_analysis_cache:
+ #                 formatted_stats = "This is to be removed"
+ #                 geojson_analysis_cache[file_key] = formatted_stats
+
+ #             # Add analysis results directly to chat (no intermediate message)
+ #             analysis_query = "📄 Análisis del GeoJSON cargado"
+ #             cached_result = geojson_analysis_cache[file_key]
+
+ #             # Add both query and response to history
+ #             history = history + [(analysis_query, cached_result)]
+ #             return history, "**Sources:** WhispAPI Analysis Results"
+
+ #         except Exception as e:
+ #             error_msg = f"❌ Error processing GeoJSON file: {str(e)}"
+ #             history = history + [("📄 Error en análisis GeoJSON", error_msg)]
+ #             return history, ""
+
+ #     return history, ""
+
+ def toggle_search_method(method):
+     """Toggle between GeoJSON upload and country selection"""
+     # if method == "Subir GeoJson":
+     #     return (
+     #         gr.update(visible=True),   # geojson_section
+     #         gr.update(visible=False),  # reports_section
+     #         gr.update(value=None),     # dropdown_country
+     #     )
+     # else:  # "Talk to Reports"
+     return (
+         # gr.update(visible=False),  # geojson_section
+         gr.update(visible=True),  # reports_section
+         gr.update(),              # dropdown_country
+     )
+
+ def change_sample_questions(key):
+     """Update visible examples based on selected category"""
+     keys = list(SAMPLE_QUESTIONS.keys())
+     index = keys.index(key)
+     visible_bools = [False] * len(keys)
+     visible_bools[index] = True
+     return [gr.update(visible=visible_bools[i]) for i in range(len(keys))]
+
+ # Set up Gradio Theme
+ theme = gr.themes.Base(
+     primary_hue="green",
+     secondary_hue="blue",
+     font=[gr.themes.GoogleFont("Poppins"), "ui-sans-serif", "system-ui", "sans-serif"],
+     text_size=gr.themes.utils.sizes.text_sm,
+ )
+
+
+ init_prompt = """
+ Hola, soy Gina, una asistente conversacional con IA diseñada para ayudarte a comprender conceptos y ayudarte con el tema de la Economía Circular. Responderé a tus preguntas usando la base de datos de documentos sobre economía circular.
+ 💡 **Cómo usarla (pestañas a la derecha)**
+
+ **Enfoque:** Selecciona la sección de informes/documentos.
+ **Ejemplos:** Selecciona entre ejemplos de preguntas de diferentes categorías.
+ **Fuentes:** Consulta las fuentes de contenido utilizadas para generar las respuestas para la verificación de datos.
+ ⚠️ Para conocer las limitaciones e información sobre la recopilación de datos, consulta la pestaña "Aviso legal"
+ """
+
+ with gr.Blocks(title="Gina Bot", theme=theme, css="style.css") as demo:
+
+     # Main Chat Interface
+     with gr.Tab("Gina Bot"):
+         with gr.Row():
+             # Left column - Chat interface (2/3 width)
+             with gr.Column(scale=2):
+                 chatbot = gr.Chatbot(
+                     value=[(None, init_prompt)],
+                     show_copy_button=True,
+                     show_label=False,
+                     layout="panel",
+                     avatar_images=(None, "chatbot_icon_2.png"),
+                     height="auto"
+                 )
+
+                 # Feedback UI
+                 with gr.Column():
+                     with gr.Row(visible=False) as feedback_row:
+                         gr.Markdown("¿Te ha sido útil esta respuesta?")
+                         with gr.Row():
+                             okay_btn = gr.Button("👍 De acuerdo", size="sm")
+                             not_okay_btn = gr.Button("👎 No según lo esperado", size="sm")
+                     feedback_thanks = gr.Markdown("Gracias por los comentarios.", visible=False)
+
+                 # Input textbox
+                 with gr.Row():
+                     textbox = gr.Textbox(
+                         placeholder="Pregúntame cualquier cosa sobre Economía Circular",
+                         show_label=False,
+                         scale=7,
+                         lines=1,
+                         interactive=True
+                     )
+
+             # Right column - Controls and tabs (1/3 width)
+             with gr.Column(scale=1, variant="panel"):
+                 with gr.Tabs() as tabs:
+
+                     # Data Sources Tab
+                     with gr.Tab("Fuentes de datos", id=2):
+                         with gr.Group(visible=True) as reports_section:
+                             dropdown_category = gr.Dropdown(
+                                 ["Fundamentos y tendencias internacionales de EC", "Financiamiento en EC", "EC en Colombia"],
+                                 # label="Selecciona país",
+                                 label="Especifica tu área de interés",
+                                 multiselect=True,
+                                 value=["Fundamentos y tendencias internacionales de EC", "Financiamiento en EC", "EC en Colombia"],
+                                 interactive=True,
+                             )
+
+                         # # GeoJSON Upload Section
+                         # with gr.Group(visible=True) as geojson_section:
+                         #     uploaded_file = gr.File(
+                         #         label="Subir GeoJson",
+                         #         file_types=[".geojson", ".json"],
+                         #         file_count="single"
+                         #     )
+                         #     upload_status = gr.Markdown("", visible=False)
+
+                         #     # Results table for WHISP API response
+                         #     results_table = gr.DataFrame(
+                         #         label="Resultados del análisis",
+                         #         visible=False,
+                         #         interactive=False,
+                         #         wrap=True,
+                         #         elem_classes="dataframe"
+                         #     )
+
+                     # Talk to Reports Section
+
+
+                     # Examples Tab
+                     with gr.Tab("Ejemplos", id=0):
+                         examples_hidden = gr.Textbox(visible=False)
+
+                         first_key = list(SAMPLE_QUESTIONS.keys())[0]
+                         dropdown_samples = gr.Dropdown(
+                             SAMPLE_QUESTIONS.keys(),
+                             value=first_key,
+                             interactive=True,
+                             show_label=True,
+                             label="Seleccione una categoría de preguntas de muestra."
+                         )
+
+                         # Create example sections
+                         sample_groups = []
+                         for i, (key, questions) in enumerate(SAMPLE_QUESTIONS.items()):
+                             examples_visible = True if i == 0 else False
+                             with gr.Row(visible=examples_visible) as group_examples:
+                                 gr.Examples(
+                                     questions,
+                                     [examples_hidden],
+                                     examples_per_page=8,
+                                     run_on_click=False,
+                                 )
+                             sample_groups.append(group_examples)
+
+                     # Sources Tab
+                     with gr.Tab("Fuentes", id=1, elem_id="sources-textbox"):
+                         sources_textbox = gr.HTML(
+                             show_label=False,
+                             value="Los documentos originales aparecerán aquí después de que hagas una pregunta..."
+                         )
+
+     # Guidelines Tab
+     with gr.Tab("Orientación"):
+         gr.Markdown("""
+ #### Welcome to Gina Q&A!
+
+ This AI-powered assistant helps you understand the Circular Economy.
+
+ ## 💬 How to Ask Effective Questions
+
+ | ❌ Less Effective | ✅ More Effective |
+ |------------------|-------------------|
+ | "What is economy?" | "What is the impact of the circular economy on businesses?" |
+ | "Tell me about compliance" | "What are the country guidelines on the circular economy?" |
+ | "Show me data" | "What is the trend in waste, and how is the circular economy helping to address it?" |
+
+ ## 🔍 Using Data Sources
+
+ **Talk to Reports:** Select the report sections "Trend and fundamentals", "Financing Mechanisms", or "Country Resource".
+
+ ## ⭐ Best Practices
+
+ - Be specific about regions, commodities, or time periods
+ - Ask one question at a time for clearer answers
+ - Use follow-up questions to explore topics deeper
+ - Provide context when possible
+ """)
+
+     # About Tab
+     with gr.Tab("Sobre Gina"):
+         gr.Markdown("""
+ ## About Gina Q&A
+
+ The **Circular Economy** places obligations on manufacturers and businesses.
+
+ This AI-powered tool helps stakeholders:
+ - Understand Circular Economy concepts and regulations
+ - Assess supply chain issues
+ - Navigate complex regulatory landscapes
+
+ **Developed by GIZ** for a project in Colombia to enhance accessibility and understanding of Circular Economy requirements
+ through advanced AI and geographic data processing capabilities.
+
+ ### Key Features:
+ - Country-specific compliance guidance
+ - Real-time question answering with source citations
+ - User-friendly interface for complex regulatory information
+ """)
+
+     # Disclaimer Tab
+     with gr.Tab("Disclaimer"):
+         gr.Markdown("""
+ ## Important Disclaimers
+
+ ⚠️ **Scope & Limitations:**
+ - This tool is designed for Circular Economy assistance and geographic data analysis
+ - Responses should not be considered official legal or compliance advice
+ - Always consult qualified professionals for official compliance decisions
+
+ ⚠️ **Data & Privacy:**
+ - We collect usage statistics to improve the tool
+ - Files are processed temporarily and not permanently stored
+
+ ⚠️ **AI Limitations:**
+ - Responses are AI-generated and may contain inaccuracies
+ - The tool is a prototype under continuous development
+ - Always verify important information with authoritative sources
+
+ **Data Collection:** We collect questions, answers, feedback, and anonymized usage statistics
+ to improve tool performance, based on a legitimate interest in service enhancement.
+
+ By using this tool, you acknowledge these limitations and agree to use responses responsibly.
+ """)
+
+     # Event Handlers
+
+     # Toggle search method
+     # search_method.change(
+     #     fn=toggle_search_method,
+     #     inputs=[search_method],
+     #     outputs=[reports_section, dropdown_category]
+     # )
+
+     # File upload - automatically analyze and display in chat (SIMPLIFIED)
+     # uploaded_file.change(
+     #     fn=auto_analyze_file,
+     #     inputs=[uploaded_file, chatbot],
+     #     outputs=[chatbot, sources_textbox],
+     #     queue=False
+     # )
+
+     # Chat functionality
+     textbox.submit(
+         start_chat,
+         [textbox, chatbot],
+         [textbox, tabs, chatbot],
+         queue=False
+     ).then(
+         chat_response,
+         [textbox, chatbot, dropdown_category],
+         [chatbot, sources_textbox]
+     ).then(
+         lambda: gr.update(visible=True),
+         outputs=[feedback_row]
+     ).then(
+         finish_chat,
+         outputs=[textbox]
+     )
+
+     # Examples functionality
+     examples_hidden.change(
+         start_chat,
+         [examples_hidden, chatbot],
+         [textbox, tabs, chatbot],
+         queue=False
+     ).then(
+         chat_response,
+         [examples_hidden, chatbot, dropdown_category],
+         [chatbot, sources_textbox]
+     ).then(
+         lambda: gr.update(visible=True),
+         outputs=[feedback_row]
+     ).then(
+         finish_chat,
+         outputs=[textbox]
+     )
+
+     # Sample questions dropdown
+     dropdown_samples.change(
+         change_sample_questions,
+         [dropdown_samples],
+         sample_groups
+     )
+
+     # Feedback buttons
+     # Feedback handlers with logging
+     def handle_feedback(feedback):
+         try:
+             # Get the last interaction from history
+             if chatbot.value:
+                 last_query = chatbot.value[-1][0]
+                 last_response = chatbot.value[-1][1]
+
+                 # Log the feedback
+                 chat_logger.log(
+                     query=last_query,
+                     answer=last_response,
+                     retrieved_content=[],  # Empty since this is feedback
+                     feedback=feedback
+                 )
+         except Exception as e:
+             print(f"Feedback logging error: {str(e)}")
+         return gr.update(visible=False), gr.update(visible=True)
+
+     okay_btn.click(
+         lambda: handle_feedback("positive"),
+         outputs=[feedback_row, feedback_thanks]
+     )
+
+     not_okay_btn.click(
+         lambda: handle_feedback("negative"),
+         outputs=[feedback_row, feedback_thanks]
+     )
+
+ # Launch the app
+ if __name__ == "__main__":
+     demo.launch()
requirements.txt ADDED
@@ -0,0 +1,163 @@
+ #git+https://github.com/huggingface/huggingface_hub.git@main
+ aiofiles==23.2.1
+ aiohappyeyeballs==2.3.5
+ aiohttp==3.10.3
+ aiosignal==1.3.1
+ annotated-types==0.7.0
+ anyio==4.4.0
+ asttokens==2.4.1
+ async-timeout==4.0.3
+ attrs==24.2.0
+ authlib==1.3.1
+ certifi==2024.7.4
+ cffi==1.17.0
+ charset-normalizer==3.3.2
+ click==8.0.4
+ comm==0.2.2
+ contourpy==1.2.1
+ cryptography==43.0.0
+ cycler==0.12.1
+ dataclasses-json==0.6.7
+ datasets==2.20.0
+ debugpy==1.8.5
+ decorator==5.1.1
+ dill==0.3.8
+ exceptiongroup==1.2.2
+ executing==2.0.1
+ fastapi==0.112.0
+ ffmpy==0.4.0
+ filelock==3.15.4
+ fonttools==4.53.1
+ frozenlist==1.4.1
+ fsspec==2024.5.0
+ gradio-client==1.3.0
+ gradio==4.44.1
+ greenlet==3.0.3
+ grpcio-tools==1.65.4
+ grpcio==1.65.4
+ h11==0.14.0
+ h2==4.1.0
+ hf-transfer==0.1.8
+ hpack==4.0.0
+ httpcore==1.0.5
+ httpx==0.27.0
+ huggingface-hub==0.34.0
+ hyperframe==6.0.1
+ idna==3.7
+ importlib-resources==6.4.0
+ ipykernel==6.29.5
+ ipython==8.26.0
+ itsdangerous==2.2.0
+ jedi==0.19.1
+ jinja2==3.1.4
+ joblib==1.4.2
+ jsonpatch==1.33
+ jsonpointer==3.0.0
+ jupyter-client==8.6.2
+ jupyter-core==5.7.2
+ kiwisolver==1.4.5
+ langchain==0.3.26
+ langchain-community==0.3.27
+ langchain-core==0.3.70
+ langchain-huggingface==0.3.0
+ langchain-text-splitters==0.3.8
+ langchain-together==0.3.0
+ langsmith==0.4.8
+ markdown-it-py==3.0.0
+ markupsafe==2.1.5
+ marshmallow==3.21.3
+ matplotlib-inline==0.1.7
+ matplotlib==3.9.2
+ mdurl==0.1.2
+ mpmath==1.3.0
+ multidict==6.0.5
+ multiprocess==0.70.16
+ mypy-extensions==1.0.0
+ nest-asyncio==1.6.0
+ networkx==3.3
+ numpy==1.26.4
+ nvidia-cublas-cu12==12.1.3.1
+ nvidia-cuda-cupti-cu12==12.1.105
+ nvidia-cuda-nvrtc-cu12==12.1.105
+ nvidia-cuda-runtime-cu12==12.1.105
+ nvidia-cudnn-cu12==9.1.0.70
+ nvidia-cufft-cu12==11.0.2.54
+ nvidia-curand-cu12==10.3.2.106
+ nvidia-cusolver-cu12==11.4.5.107
+ nvidia-cusparse-cu12==12.1.0.106
+ nvidia-nccl-cu12==2.20.5
+ nvidia-nvjitlink-cu12==12.6.20
+ nvidia-nvtx-cu12==12.1.105
+ orjson==3.10.7
+ openpyxl
+ packaging==23.2
+ pandas==2.2.2
+ parso==0.8.4
+ pexpect==4.9.0
+ pillow==10.4.0
+ pip==22.3.1
+ platformdirs==4.2.2
+ portalocker==2.10.1
+ prompt-toolkit==3.0.47
+ protobuf==5.27.3
+ psutil==5.9.8
+ ptyprocess==0.7.0
+ pure-eval==0.2.3
+ pyarrow-hotfix==0.6
+ pyarrow==17.0.0
+ pycparser==2.22
+ pydantic-core==2.20.1
+ pydantic==2.8.2
+ pydub==0.25.1
+ pygments==2.18.0
+ pymupdf==1.23.26
+ pymupdfb==1.23.22
+ pyparsing==3.1.2
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.1
+ python-multipart==0.0.9
+ pyyaml==6.0.2
+ pyzmq==26.2.0
+ pytz==2024.1
+ #returns>=0.26.0
+ qdrant-client==1.10.1
+ regex==2024.7.24
+ requests==2.32.3
+ rich==13.7.1
+ ruff==0.5.7
+ safetensors==0.4.4
+ scikit-learn==1.5.1
+ scipy==1.14.0
+ semantic-version==2.10.0
+ sentence-transformers==3.0.1
+ sentencepiece==0.2.0
+ setuptools==65.5.0
+ shellingham==1.5.4
+ six==1.16.0
+ sniffio==1.3.1
+ spaces==0.29.3
+ sqlalchemy==2.0.32
+ stack-data==0.6.3
+ starlette==0.37.2
+ sympy==1.13.2
+ tenacity==8.5.0
+ threadpoolctl==3.5.0
+ tokenizers==0.19.1
+ tomlkit==0.12.0
+ torch==2.4.0
+ tornado==6.4.1
+ tqdm==4.66.5
+ traitlets==5.14.3
+ transformers==4.44.0
+ triton==3.0.0
+ typer==0.12.3
+ typing-extensions==4.12.2
+ typing-inspect==0.9.0
+ tzdata==2024.1
+ urllib3==2.2.2
+ uvicorn==0.30.6
+ wcwidth==0.2.13
+ websockets==11.0.3
+ wheel==0.44.0
+ xxhash==3.4.1
+ yarl==1.9.4