HeTalksInMaths committed
Commit 5fd9547 · Parent(s): 985c528

Port chat integration changes onto main (rebase strategy)
- CHAT_DEMO_README.md +287 -0
- FORCE_REBUILD.md +6 -0
- GITHUB_SETUP.md +195 -0
- PUSH_INSTRUCTIONS.txt +64 -0
- PUSH_NOW.txt +27 -0
- app_combined.py +610 -0
- chat_app.py +504 -0
- push_to_both.sh +84 -0
- quick_push.sh +63 -0
- setup_github_remote.sh +38 -0
- test_chat_integration.py +132 -0
CHAT_DEMO_README.md
ADDED
@@ -0,0 +1,287 @@
# 🤖 ToGMAL Chat Demo with MCP Tools

An interactive chat interface where a free LLM (Mistral-7B) can call MCP tools to provide informed responses about prompt difficulty and safety analysis.

## ✨ Features

### 🧠 **Intelligent Assistant**
- Powered by **Mistral-7B-Instruct-v0.2** (free via HuggingFace Inference API)
- Natural conversation about prompt analysis
- Context-aware responses

### 🛠️ **MCP Tool Integration**
The LLM can dynamically call these tools:

1. **`check_prompt_difficulty`**
   - Analyzes prompt difficulty using vector similarity to 32K+ benchmark questions
   - Returns risk level, success rates, and similar benchmark questions
   - Helps users understand if their prompt is within LLM capabilities

2. **`analyze_prompt_safety`**
   - Heuristic-based safety analysis
   - Detects dangerous operations, medical advice requests, unrealistic coding tasks
   - Provides risk assessment and recommendations

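The heuristic safety analysis can be pictured as a small rule table that maps regex patterns to issue labels. The sketch below is illustrative only: the pattern list, labels, and scoring are invented for the example and are not the actual `analyze_prompt_safety` implementation.

```python
import re

# Hypothetical rule table: each entry pairs a regex with an issue label.
SAFETY_PATTERNS = [
    (re.compile(r"\b(delete|erase|wipe)\b.*\b(all|every)\b", re.I),
     "potentially dangerous file operation"),
    (re.compile(r"\b(diagnose|prescribe|dosage)\b", re.I),
     "medical advice request"),
]

def analyze_prompt_safety(prompt: str) -> dict:
    """Return issues found and a coarse risk level for a prompt."""
    issues = [label for pattern, label in SAFETY_PATTERNS if pattern.search(prompt)]
    risk = "HIGH" if issues else "LOW"
    return {"risk_level": risk, "issues": issues}

print(analyze_prompt_safety("Write a script to delete all my files"))
```

Because the rules are deterministic regexes, this kind of check runs locally with no API call, which is what makes the fallback mode described below possible.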
### 🔄 **How It Works**

```mermaid
graph LR
    A[User Message] --> B[LLM]
    B --> C{Needs Tool?}
    C -->|Yes| D[Call MCP Tool]
    C -->|No| E[Direct Response]
    D --> F[Tool Result]
    F --> B
    B --> E
    E --> G[Display to User]
```

1. User sends a message
2. LLM decides if it needs to call a tool
3. If yes, tool is executed and results returned to LLM
4. LLM formulates final response using tool data
5. Response shown to user with transparent tool call info

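The five steps above can be sketched as a single turn of a tool-calling loop. This is a minimal sketch under assumptions: the `TOOL_CALL:` marker convention and `run_turn` helper are invented for illustration; the real prompt format and parsing live in `chat_app.py`.

```python
import json
import re

def run_turn(user_message: str, llm, tools: dict) -> str:
    """One chat turn: ask the LLM, execute at most one requested tool, ask again."""
    reply = llm(user_message)
    # Assumed marker convention: TOOL_CALL: {"name": ..., "arguments": {...}}
    match = re.search(r'TOOL_CALL:\s*(\{.*\})', reply)
    if not match:
        return reply  # no tool needed: direct response (step 5)
    call = json.loads(match.group(1))
    result = tools[call["name"]](**call["arguments"])  # execute the MCP tool
    # Feed the tool result back so the LLM can formulate the final answer
    return llm(f"{user_message}\nTOOL_RESULT: {json.dumps(result)}")
```

Here `llm` is any callable from prompt string to response string and `tools` maps tool names to local Python functions, so the same loop works with the HuggingFace API backend or a local fallback.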
## 🚀 Quick Start

### Local Development

```bash
cd Togmal-demo

# Install dependencies
pip install -r requirements.txt

# Run the chat demo
python chat_app.py
```

Open http://localhost:7860 in your browser.

### Deploy to HuggingFace Spaces

1. **Create a new Space:**
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Choose "Gradio" as SDK

2. **Upload files:**
   ```bash
   # Clone your Space repo
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME

   # Copy demo files
   cp ../Togmal-demo/chat_app.py app.py
   cp ../Togmal-demo/requirements.txt .
   cp ../Togmal-demo/benchmark_vector_db.py .

   # Copy data directory
   cp -r ../Togmal-demo/data .

   # Commit and push
   git add .
   git commit -m "Deploy ToGMAL Chat Demo"
   git push
   ```

3. **Your demo will be live at:**
   `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`

## 💬 Example Conversations

### Example 1: Difficulty Analysis

**User:**
> How difficult is this prompt: "Calculate the quantum correction to the partition function for a 3D harmonic oscillator"?

**Assistant:**
> 🛠️ *Calling tool: `check_prompt_difficulty`*
>
> Based on my analysis of similar benchmark questions:
>
> **Difficulty Level:** HIGH
> **Success Rate:** 45.2%
> **Similarity to benchmarks:** 0.847
>
> **Recommendation:** This is a challenging physics problem. Current LLMs succeed only about 45% of the time on similar questions. Consider:
> - Breaking it into smaller steps
> - Verifying the response with domain experts
> - Using specialized physics tools

### Example 2: Safety Check

**User:**
> Is this safe: "Write a script to delete all my files"?

**Assistant:**
> 🛠️ *Calling tool: `analyze_prompt_safety`*
>
> **Safety Analysis:**
>
> **Risk Level:** HIGH
> **Issues Found:** 1
>
> • Detected potentially dangerous file operation
>
> **Recommendation:** Proceed with caution
>
> This prompt requests destructive file operations. I strongly recommend:
> - Being very specific about which files to delete
> - Using safe deletion (move to trash instead of permanent delete)
> - Testing on a backup first

### Example 3: Simple Question

**User:**
> What is 2 + 2?

**Assistant:**
> 🛠️ *Calling tool: `check_prompt_difficulty`*
>
> This is a very simple arithmetic question!
>
> **Difficulty Level:** LOW
> **Success Rate:** 99.8%
>
> Current LLMs handle this type of question extremely well. The answer is **4**.

## 🏗️ Architecture

### Components

```
chat_app.py
├── LLM Backend (HuggingFace Inference API)
│   ├── Mistral-7B-Instruct-v0.2
│   └── Tool calling via prompt engineering
│
├── MCP Tools (Local Implementation)
│   ├── check_prompt_difficulty()
│   │   └── Uses BenchmarkVectorDB
│   └── analyze_prompt_safety()
│       └── Heuristic pattern matching
│
└── Gradio Interface
    ├── Chat component
    └── Tool call visualization
```

### Why This Approach?

1. **No API Keys Required** - Uses HuggingFace's free Inference API
2. **Transparent Tool Calls** - Users see exactly what tools are called and their results
3. **Graceful Degradation** - Falls back to pattern matching if the API is unavailable
4. **Privacy-Preserving** - All analysis happens locally/deterministically
5. **Free to Deploy** - Works on HuggingFace Spaces free tier

## 🎯 Use Cases

### For Developers
- **Test prompt quality** before sending to expensive LLM APIs
- **Identify edge cases** that might fail
- **Safety checks** before production deployment

### For Researchers
- **Analyze dataset difficulty** by checking sample questions
- **Compare benchmark similarity** across different datasets
- **Study LLM limitations** systematically

### For End Users
- **Understand if a task is suitable** for an LLM
- **Get recommendations** for improving prompts
- **Avoid unsafe operations** flagged by analysis

## 🔧 Customization

### Add New Tools

Edit `chat_app.py` and add your tool:

```python
def tool_my_custom_check(prompt: str) -> Dict:
    """Your custom analysis."""
    return {
        "result": "analysis result",
        "confidence": 0.95
    }

# Add to AVAILABLE_TOOLS
AVAILABLE_TOOLS.append({
    "name": "my_custom_check",
    "description": "What this tool does",
    "parameters": {"prompt": "The prompt to analyze"}
})

# Add to execute_tool()
def execute_tool(tool_name: str, arguments: Dict) -> Dict:
    # ... existing tools ...
    elif tool_name == "my_custom_check":
        return tool_my_custom_check(arguments.get("prompt", ""))
```
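Once registered, the dispatch can be exercised directly. Below is a self-contained version of the same pattern for experimenting outside `chat_app.py`; the stubbed tool body and the unknown-tool fallback branch are assumptions for the example, not the app's actual behavior.

```python
from typing import Dict

def tool_my_custom_check(prompt: str) -> Dict:
    """Your custom analysis (stubbed for the example)."""
    return {"result": f"analyzed {len(prompt)} chars", "confidence": 0.95}

def execute_tool(tool_name: str, arguments: Dict) -> Dict:
    """Dispatch a tool call by name to its implementation."""
    if tool_name == "my_custom_check":
        return tool_my_custom_check(arguments.get("prompt", ""))
    return {"error": f"unknown tool: {tool_name}"}  # assumed fallback

print(execute_tool("my_custom_check", {"prompt": "hello"}))
```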

### Use Different LLM

Replace the `call_llm_with_tools()` function to use:
- **OpenAI GPT** (requires API key)
- **Anthropic Claude** (requires API key)
- **Local Ollama** (free, runs locally)
- **Any other HuggingFace model**

Example for Ollama:

```python
def call_llm_with_tools(messages, available_tools):
    import requests
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "mistral",
            "prompt": format_prompt(messages),
            "stream": False
        }
    )
    # With stream=False, Ollama returns one JSON object whose
    # "response" field holds the generated text
    return response.json()["response"]
```

## 📊 Performance

- **Response Time:** 2-5 seconds (depending on HuggingFace API load)
- **Tool Execution:** < 1 second (local vector DB lookup)
- **Memory Usage:** ~2GB (for vector database + model embeddings)
- **Throughput:** Handles 10-20 requests/minute on the free tier

## 🐛 Troubleshooting

### "Database not initialized" error

The vector database has to be downloaded and built on first run. Wait 1-2 minutes and try again.

### "HuggingFace API unavailable" error

The demo falls back to pattern matching. Responses will be simpler but still functional.

### Tool not being called

The LLM might not recognize the need for a tool. Try being more explicit:
- ❌ "Is this hard?"
- ✅ "Analyze the difficulty of this prompt: [prompt]"

## 🚀 Next Steps

1. **Add more tools** - Context analyzer, ML pattern detection
2. **Better LLM** - Use larger models or fine-tune for tool calling
3. **Persistent chat** - Save conversation history
4. **Multi-turn tool calls** - Allow the LLM to call multiple tools in sequence
5. **Custom tool definitions** - Let users define their own analysis tools

## 📝 License

Same as the main ToGMAL project.

## 🙏 Credits

- **Mistral AI** for Mistral-7B-Instruct
- **HuggingFace** for the free Inference API
- **Gradio** for the chat interface
- **ChromaDB** for the vector database
FORCE_REBUILD.md
ADDED
@@ -0,0 +1,6 @@
# Force Rebuild Trigger

This file forces HuggingFace Spaces to rebuild.

Build timestamp: 2025-10-22 18:30:00
Version: 2.0 - Combined Tabbed Interface
GITHUB_SETUP.md
ADDED
@@ -0,0 +1,195 @@
# 🐙 Push to GitHub - Quick Setup

## Option 1: Quick Push (If GitHub Remote Already Configured)

```bash
cd /Users/hetalksinmaths/togmal/Togmal-demo
chmod +x push_to_both.sh
./push_to_both.sh
```

This will:
1. ✅ Push to HuggingFace Spaces (live demo)
2. ✅ Push to GitHub (code backup)

---

## Option 2: First-Time GitHub Setup

### Step 1: Create GitHub Repository

1. Go to: https://github.com/new
2. Repository name: `togmal-demo` (or any name)
3. Description: "ToGMAL - AI Difficulty & Safety Analysis Platform"
4. **Public** or **Private** (your choice)
5. **Do NOT initialize** with README (we already have files)
6. Click "Create repository"

### Step 2: Add GitHub Remote

```bash
cd /Users/hetalksinmaths/togmal/Togmal-demo

# Add GitHub as a remote (replace YOUR_USERNAME)
git remote add github https://github.com/YOUR_USERNAME/togmal-demo.git

# Verify remotes
git remote -v
```

You should see:
```
github  https://github.com/YOUR_USERNAME/togmal-demo.git (fetch)
github  https://github.com/YOUR_USERNAME/togmal-demo.git (push)
origin  https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo (fetch)
origin  https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo (push)
```

### Step 3: Push to GitHub

```bash
# First push
git push -u github main
```

You'll be prompted for:
- **Username:** Your GitHub username
- **Password:** Your GitHub Personal Access Token (PAT)

**Get your PAT:**
1. Go to: https://github.com/settings/tokens
2. Click "Generate new token" → "Classic"
3. Name: "ToGMAL Demo"
4. Scopes: Check `repo` (all repo permissions)
5. Click "Generate token"
6. Copy the token (starts with `ghp_`)
7. Use it as your password

### Step 4: Future Pushes

```bash
./push_to_both.sh
```

This pushes to both HuggingFace and GitHub automatically!

---

## Option 3: Manual Commands

### Push to HuggingFace Only
```bash
git add .
git commit -m "Your message"
git push origin main
```

### Push to GitHub Only
```bash
git add .
git commit -m "Your message"
git push github main
```

### Push to Both
```bash
git add .
git commit -m "Your message"
git push origin main
git push github main
```
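The manual "push to both" sequence above is easy to script. As a hedged sketch of what a helper like `push_to_both.sh` might do (the actual shell script ships with the repo and is not reproduced here), the same steps in Python look like this; the `dry_run` flag is an addition for safe previewing:

```python
import subprocess

def push_to_both(message: str, remotes=("origin", "github"), branch="main", dry_run=False):
    """Stage, commit, and push to each remote in turn."""
    commands = [["git", "add", "."], ["git", "commit", "-m", message]]
    commands += [["git", "push", remote, branch] for remote in remotes]
    if dry_run:
        return commands  # inspect the plan without touching the repo
    for cmd in commands:
        subprocess.run(cmd, check=True)  # stop at the first failing step
    return commands

# Preview the exact commands without running them
for cmd in push_to_both("Your message", dry_run=True):
    print(" ".join(cmd))
```

`check=True` makes a failed commit or rejected push raise immediately, so the second remote is never pushed from a bad state.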

---

## 🔐 Authentication Tips

### HuggingFace
- Username: `JustTheStatsHuman`
- Password: Your HF token (starts with `hf_`)
- Get token: https://huggingface.co/settings/tokens

### GitHub
- Username: Your GitHub username
- Password: Personal Access Token (starts with `ghp_`)
- Get PAT: https://github.com/settings/tokens

### Cache Credentials (Optional)
```bash
# Cache for 1 hour
git config --global credential.helper 'cache --timeout=3600'

# Or use macOS Keychain
git config --global credential.helper osxkeychain
```

---

## 📊 Repository Structure

```
HuggingFace Spaces (origin)
├── Purpose: Live demo hosting
├── URL: https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo
└── Auto-deploys on push

GitHub (github)
├── Purpose: Code backup & collaboration
├── URL: https://github.com/YOUR_USERNAME/togmal-demo
└── Version control
```

---

## ✅ Verification

After pushing to both:

**HuggingFace:**
- View demo: https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo
- Check logs: https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo/logs

**GitHub:**
- View code: https://github.com/YOUR_USERNAME/togmal-demo
- Check commits: See your commit history

---

## 🎯 Best Practice

1. **Make changes locally**
2. **Test locally** (optional)
3. **Commit once:**
   ```bash
   git add .
   git commit -m "Description of changes"
   ```
4. **Push to both:**
   ```bash
   ./push_to_both.sh
   ```

---

## 🐛 Troubleshooting

**"fatal: remote github already exists"**
```bash
git remote remove github
git remote add github https://github.com/YOUR_USERNAME/togmal-demo.git
```

**"Authentication failed"**
- Make sure you're using a PAT, not your GitHub password
- The PAT needs the `repo` scope
- Check that the token hasn't expired

**"Push rejected"**
```bash
# Pull first, then push
git pull github main --rebase
git push github main
```

---

Ready to push to both platforms! 🚀
PUSH_INSTRUCTIONS.txt
ADDED
@@ -0,0 +1,64 @@
═══════════════════════════════════════════════════════════
 PUSH TO HUGGINGFACE - SIMPLE INSTRUCTIONS
═══════════════════════════════════════════════════════════

Run this ONE command in your terminal:

  cd /Users/hetalksinmaths/togmal/Togmal-demo && chmod +x deploy.sh && ./deploy.sh

Or run manually:

  cd /Users/hetalksinmaths/togmal/Togmal-demo
  git add app_combined.py README.md PUSH_READY.md DEPLOY_NOW.md
  git commit -m "Add combined tabbed interface"
  git push origin main

═══════════════════════════════════════════════════════════
 AUTHENTICATION
═══════════════════════════════════════════════════════════

When prompted:

  Username: JustTheStatsHuman
  Password: [Your HuggingFace token - starts with hf_]

Get your token at:
  https://huggingface.co/settings/tokens

⚠️ Token must have WRITE permission
⚠️ Password won't be visible when typing (this is normal!)

═══════════════════════════════════════════════════════════
 AFTER PUSH
═══════════════════════════════════════════════════════════

✅ View your demo:
   https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo

📊 Monitor build logs:
   https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo/logs

⏱️ First build: ~3-5 minutes
🚀 After build: Instant launches

═══════════════════════════════════════════════════════════
 WHAT'S BEING DEPLOYED
═══════════════════════════════════════════════════════════

✅ Combined tabbed interface
   • Tab 1: Difficulty Analyzer
   • Tab 2: Chat Assistant with MCP tools

✅ Builds 5K question database on first launch
✅ Free LLM integration (Mistral-7B)
✅ Transparent tool calling
✅ Ready for VC demo!

═══════════════════════════════════════════════════════════

Ready to deploy! Run the command above. 🚀
PUSH_NOW.txt
ADDED
@@ -0,0 +1,27 @@
═══════════════════════════════════════════════
 READY TO PUSH - Both Remotes Configured ✅
═══════════════════════════════════════════════

Just run:

  cd /Users/hetalksinmaths/togmal/Togmal-demo
  git add app_combined.py
  git commit -m "Fix chat: Direct tool result formatting for reliability"
  git push origin main && git push github main

Or use the script:

  chmod +x quick_push.sh
  ./quick_push.sh "Fix chat tool integration"

═══════════════════════════════════════════════

Remotes already configured:
  ✅ origin → HuggingFace Spaces (JustTheStatsHuman/Togmal-demo)
  ✅ github → GitHub (HeTalksInMaths/togmal-mcp)

This will update:
  - Live demo at HuggingFace
  - Code backup at GitHub

═══════════════════════════════════════════════
app_combined.py
ADDED
@@ -0,0 +1,610 @@
| 1 |
+
#!/usr/bin/env python3
|
| 2 |
+
"""
|
| 3 |
+
ToGMAL Combined Demo - Difficulty Analyzer + Chat Interface
|
| 4 |
+
===========================================================
|
| 5 |
+
|
| 6 |
+
Tabbed interface combining:
|
| 7 |
+
1. Difficulty Analyzer - Direct vector DB analysis
|
| 8 |
+
2. Chat Interface - LLM with MCP tool calling
|
| 9 |
+
|
| 10 |
+
Perfect for demos and VC pitches!
|
| 11 |
+
"""
|
| 12 |
+
|
| 13 |
+
import gradio as gr
|
| 14 |
+
import json
|
| 15 |
+
import os
|
| 16 |
+
import re
|
| 17 |
+
from pathlib import Path
|
| 18 |
+
from typing import List, Dict, Tuple, Optional
|
| 19 |
+
from benchmark_vector_db import BenchmarkVectorDB
|
| 20 |
+
import logging
|
| 21 |
+
|
| 22 |
+
logging.basicConfig(level=logging.INFO)
|
| 23 |
+
logger = logging.getLogger(__name__)
|
| 24 |
+
|
| 25 |
+
# Initialize the vector database (shared by both tabs)
|
| 26 |
+
db_path = Path("./data/benchmark_vector_db")
|
| 27 |
+
db = None
|
| 28 |
+
|
| 29 |
+
def get_db():
|
| 30 |
+
"""Lazy load the vector database."""
|
| 31 |
+
global db
|
| 32 |
+
if db is None:
|
| 33 |
+
try:
|
| 34 |
+
logger.info("Initializing BenchmarkVectorDB...")
|
| 35 |
+
db = BenchmarkVectorDB(
|
| 36 |
+
db_path=db_path,
|
| 37 |
+
embedding_model="all-MiniLM-L6-v2"
|
| 38 |
+
)
|
| 39 |
+
logger.info("✓ BenchmarkVectorDB initialized successfully")
|
| 40 |
+
except Exception as e:
|
| 41 |
+
logger.error(f"Failed to initialize BenchmarkVectorDB: {e}")
|
| 42 |
+
raise
|
| 43 |
+
return db
|
| 44 |
+
|
| 45 |
+
# Build database if needed (first launch)
|
| 46 |
+
try:
|
| 47 |
+
db = get_db()
|
| 48 |
+
current_count = db.collection.count()
|
| 49 |
+
|
| 50 |
+
if False and current_count == 0:
|
| 51 |
+
logger.info("Database is empty - building initial 5K sample...")
|
| 52 |
+
from datasets import load_dataset
|
| 53 |
+
from benchmark_vector_db import BenchmarkQuestion
|
| 54 |
+
import random
|
| 55 |
+
|
| 56 |
+
test_dataset = load_dataset("TIGER-Lab/MMLU-Pro", split="test")
|
| 57 |
+
total_questions = 0 # disabled in demo
|
| 58 |
+
|
| 59 |
+
if total_questions > 5000:
|
| 60 |
+
indices = random.sample(range(total_questions), 5000)
|
| 61 |
+
pass # selection disabled in demo
|
| 62 |
+
|
| 63 |
+
all_questions = []
|
| 64 |
+
for idx, item in enumerate(test_dataset):
|
| 65 |
+
question = BenchmarkQuestion(
|
| 66 |
+
question_id=f"mmlu_pro_test_{idx}",
|
| 67 |
+
source_benchmark="MMLU_Pro",
|
| 68 |
+
domain=item.get('category', 'unknown').lower(),
|
| 69 |
+
question_text=item['question'],
|
| 70 |
+
correct_answer=item['answer'],
|
| 71 |
+
choices=item.get('options', []),
|
| 72 |
+
success_rate=0.45,
|
| 73 |
+
difficulty_score=0.55,
|
| 74 |
+
difficulty_label="Hard",
|
| 75 |
+
num_models_tested=0
|
| 76 |
+
)
|
| 77 |
+
all_questions.append(question)
|
| 78 |
+
|
| 79 |
+
batch_size = 1000
|
| 80 |
+
for i in range(0, len(all_questions), batch_size):
|
| 81 |
+
batch = all_questions[i:i + batch_size]
|
| 82 |
+
db.index_questions(batch)
|
| 83 |
+
|
| 84 |
+
logger.info(f"✓ Database build complete! Indexed {len(all_questions)} questions")
|
| 85 |
+
else:
|
| 86 |
+
logger.info(f"✓ Loaded existing database with {current_count:,} questions")
|
| 87 |
+
except Exception as e:
|
| 88 |
+
logger.warning(f"Database initialization deferred: {e}")
|
| 89 |
+
db = None
|
| 90 |
+
|
+# ============================================================================
+# TAB 1: DIFFICULTY ANALYZER
+# ============================================================================
+
+def analyze_prompt_difficulty(prompt: str, k: int = 5) -> str:
+    """Analyze a prompt and return difficulty assessment."""
+    if not prompt.strip():
+        return "Please enter a prompt to analyze."
+
+    try:
+        db = get_db()
+        result = db.query_similar_questions(prompt, k=k)
+
+        output = []
+        output.append(f"## 🎯 Difficulty Assessment\n")
+        output.append(f"**Risk Level**: {result['risk_level']}")
+        output.append(f"**Success Rate**: {result['weighted_success_rate']:.1%}")
+        output.append(f"**Avg Similarity**: {result['avg_similarity']:.3f}")
+        output.append("")
+        output.append(f"**Recommendation**: {result['recommendation']}")
+        output.append("")
+        output.append(f"## 🔍 Similar Benchmark Questions\n")
+
+        for i, q in enumerate(result['similar_questions'], 1):
+            output.append(f"{i}. **{q['question_text'][:100]}...**")
+            output.append(f"   - Source: {q['source']} ({q['domain']})")
+            output.append(f"   - Success Rate: {q['success_rate']:.1%}")
+            output.append(f"   - Similarity: {q['similarity']:.3f}")
+            output.append("")
+
+        total_questions = db.collection.count()
+        output.append(f"*Analyzed using {k} most similar questions from {total_questions:,} benchmark questions*")
+
+        return "\n".join(output)
+    except Exception as e:
+        return f"Error analyzing prompt: {str(e)}"
+
+# ==========================================================================
+# Database status and expansion helpers
+# ==========================================================================
+
+def get_database_info() -> str:
+    global db
+    if db is None:
+        return """### ⚠️ Database Not Initialized
+
+**Status:** Waiting for initialization
+
+The vector database is not yet ready. It will initialize on first use.
+"""
+    try:
+        db = get_db()
+        current_count = db.collection.count()
+        total_available = 32719
+        remaining = max(0, total_available - current_count)
+        progress_pct = (current_count / total_available * 100) if total_available > 0 else 0
+        info = "### 📊 Database Status\n\n"
+        info += f"**Current Size:** {current_count:,} questions\n"
+        info += f"**Total Available:** {total_available:,} questions\n"
+        info += f"**Progress:** {progress_pct:.1f}% complete\n"
+        info += f"**Remaining:** {remaining:,} questions\n\n"
+        if remaining > 0:
+            clicks_needed = (remaining + 4999) // 5000
+            info += "💡 Click 'Expand Database' to add 5,000 more questions\n"
+            info += f"📈 ~{clicks_needed} more clicks to reach full 32K+ dataset"
+        else:
+            info += "🎉 Database is complete with all available questions!"
+        return info
+    except Exception as e:
+        return f"Error getting database info: {str(e)}"
+
+
+def expand_database(batch_size: int = 5000) -> str:
+    global db
+    try:
+        db = get_db()
+        from datasets import load_dataset
+        from benchmark_vector_db import BenchmarkQuestion
+        import random
+
+        current_count = db.collection.count()
+        total_available = 32719
+        if current_count >= total_available:
+            return f"✅ Database complete at {current_count:,}/{total_available:,}."
+
+        # Sample a batch from MMLU-Pro test for incremental expansion
+        mmlu_pro_test = load_dataset("TIGER-Lab/MMLU-Pro", split="test")
+        total_questions = 0  # disabled in demo
+        indices = list(range(total_questions))
+        random.shuffle(indices)
+        indices = indices[:batch_size]
+        batch = []  # selection disabled in demo
+
+        new_questions = []
+        for idx, item in enumerate(batch):
+            q = BenchmarkQuestion(
+                question_id=f"mmlu_pro_expand_{current_count}_{idx}",
+                source_benchmark="MMLU_Pro",
+                domain=item.get('category', 'unknown').lower(),
+                question_text=item['question'],
+                correct_answer=item['answer'],
+                choices=item.get('options', []),
+                success_rate=0.45,
+                difficulty_score=0.55,
+                difficulty_label="Hard",
+                num_models_tested=0
+            )
+            new_questions.append(q)
+
+        db.index_questions(new_questions)
+        new_count = db.collection.count()
+        remaining = max(0, total_available - new_count)
+        result = f"✅ Added {len(new_questions)} questions.\n\n"
+        result += f"**Total:** {new_count:,}/{total_available:,}\n"
+        result += f"**Remaining:** {remaining:,}\n"
+        if remaining > 0:
+            result += f"💡 Click again to add up to {min(batch_size, remaining):,} more."
+        else:
+            result += "🎉 Database is now complete!"
+        return result
+    except Exception as e:
+        logger.error(f"Expansion failed: {e}")
+        return f"❌ Error expanding database: {str(e)}"
+
+# ============================================================================
+# TAB 2: CHAT INTERFACE WITH MCP TOOLS
+# ============================================================================
+
+def tool_check_prompt_difficulty(prompt: str, k: int = 5) -> Dict:
+    """MCP Tool: Analyze prompt difficulty."""
+    try:
+        db = get_db()
+        result = db.query_similar_questions(prompt, k=k)
+
+        return {
+            "risk_level": result['risk_level'],
+            "success_rate": f"{result['weighted_success_rate']:.1%}",
+            "avg_similarity": f"{result['avg_similarity']:.3f}",
+            "recommendation": result['recommendation'],
+            "similar_questions": [
+                {
+                    "question": q['question_text'][:150],
+                    "source": q['source'],
+                    "domain": q['domain'],
+                    "success_rate": f"{q['success_rate']:.1%}",
+                    "similarity": f"{q['similarity']:.3f}"
+                }
+                for q in result['similar_questions'][:3]
+            ]
+        }
+    except Exception as e:
+        return {"error": f"Analysis failed: {str(e)}"}
+
+def tool_analyze_prompt_safety(prompt: str) -> Dict:
+    """MCP Tool: Analyze prompt for safety issues."""
+    issues = []
+    risk_level = "low"
+
+    dangerous_patterns = [
+        r'\brm\s+-rf\b',
+        r'\bdelete\s+all\b',
+        r'\bformat\s+.*drive\b',
+        r'\bdrop\s+database\b'
+    ]
+
+    for pattern in dangerous_patterns:
+        if re.search(pattern, prompt, re.IGNORECASE):
+            issues.append("Detected potentially dangerous file operation")
+            risk_level = "high"
+            break
+
+    medical_keywords = ['diagnose', 'treatment', 'medication', 'symptoms', 'cure', 'disease']
+    if any(keyword in prompt.lower() for keyword in medical_keywords):
+        issues.append("Medical advice request detected - requires professional consultation")
+        risk_level = "moderate" if risk_level == "low" else risk_level
+
+    if re.search(r'\b(build|create|write)\s+.*\b(\d{3,})\s+(lines|functions|classes)', prompt, re.IGNORECASE):
+        issues.append("Large-scale coding request - may exceed LLM capabilities")
+        risk_level = "moderate" if risk_level == "low" else risk_level
+
+    return {
+        "risk_level": risk_level,
+        "issues_found": len(issues),
+        "issues": issues if issues else ["No significant safety concerns detected"],
+        "recommendation": "Proceed with caution" if issues else "Prompt appears safe"
+    }
+
+def call_llm_with_tools(
+    messages: List[Dict[str, str]],
+    available_tools: List[Dict],
+    model: str = "mistralai/Mistral-7B-Instruct-v0.2"
+) -> Tuple[str, Optional[Dict]]:
+    """Call LLM with tool calling capability."""
+    try:
+        from huggingface_hub import InferenceClient
+        client = InferenceClient()
+
+        system_msg = """You are ToGMAL Assistant, an AI that helps analyze prompts for difficulty and safety.
+
+You have access to these tools:
+1. check_prompt_difficulty - Analyzes how difficult a prompt is for current LLMs
+2. analyze_prompt_safety - Checks for safety issues in prompts
+
+When a user asks about prompt difficulty, safety, or capabilities, use the appropriate tool.
+To call a tool, respond with: TOOL_CALL: tool_name(arg1="value1", arg2="value2")
+
+After a tool is called, you will receive: TOOL_RESULT: name=<tool_name> data=<json>
+Use TOOL_RESULT to provide a helpful, comprehensive response to the user."""
+
+        conversation = system_msg + "\n\n"
+        for msg in messages:
+            role = msg['role']
+            content = msg['content']
+            if role == 'user':
+                conversation += f"User: {content}\n"
+            elif role == 'assistant':
+                conversation += f"Assistant: {content}\n"
+            elif role == 'system':
+                conversation += f"System: {content}\n"
+
+        conversation += "Assistant: "
+
+        response = client.text_generation(
+            conversation,
+            model=model,
+            max_new_tokens=512,
+            temperature=0.7,
+            top_p=0.95,
+            do_sample=True
+        )
+
+        response_text = response.strip()
+        tool_call = None
+
+        if "TOOL_CALL:" in response_text:
+            match = re.search(r'TOOL_CALL:\s*(\w+)\((.*?)\)', response_text)
+            if match:
+                tool_name = match.group(1)
+                args_str = match.group(2)
+                args = {}
+                for arg in args_str.split(','):
+                    if '=' in arg:
+                        key, val = arg.split('=', 1)
+                        key = key.strip()
+                        val = val.strip().strip('"\'')
+                        args[key] = val
+                tool_call = {"name": tool_name, "arguments": args}
+                response_text = re.sub(r'TOOL_CALL:.*?\)', '', response_text).strip()
+
+        return response_text, tool_call
+    except Exception as e:
+        logger.error(f"LLM call failed: {e}")
+        return fallback_llm(messages, available_tools)
+
+def fallback_llm(messages: List[Dict[str, str]], available_tools: List[Dict]) -> Tuple[str, Optional[Dict]]:
+    """Fallback when HF API unavailable."""
+    last_message = messages[-1]['content'].lower() if messages else ""
+
+    # Safety intent first
+    if any(word in last_message for word in ['safe', 'safety', 'dangerous', 'risk']):
+        return "", {"name": "analyze_prompt_safety", "arguments": {"prompt": messages[-1]['content']}}
+
+    # Difficulty intent (expanded triggers)
+    if any(word in last_message for word in ['difficult', 'difficulty', 'hard', 'easy', 'challenging', 'analyze', 'analysis', 'assess', 'check']):
+        return "", {"name": "check_prompt_difficulty", "arguments": {"prompt": messages[-1]['content'], "k": 5}}
+
+    # Default: run difficulty analysis on any non-empty message
+    if last_message.strip():
+        return "", {"name": "check_prompt_difficulty", "arguments": {"prompt": messages[-1]['content'], "k": 5}}
+
+    return """I'm ToGMAL Assistant. I can help analyze prompts for:
+- **Difficulty**: How challenging is this for current LLMs?
+- **Safety**: Are there any safety concerns?
+
+Try asking me to analyze a prompt!""", None
+
+AVAILABLE_TOOLS = [
+    {
+        "name": "check_prompt_difficulty",
+        "description": "Analyzes how difficult a prompt is for current LLMs",
+        "parameters": {"prompt": "The prompt to analyze", "k": "Number of similar questions"}
+    },
+    {
+        "name": "analyze_prompt_safety",
+        "description": "Checks for safety issues in prompts",
+        "parameters": {"prompt": "The prompt to analyze"}
+    }
+]
+
+def execute_tool(tool_name: str, arguments: Dict) -> Dict:
+    """Execute a tool and return results."""
+    if tool_name == "check_prompt_difficulty":
+        prompt = arguments.get("prompt", "")
+        try:
+            k = int(arguments.get("k", 5))
+        except Exception:
+            k = 5
+        k = max(1, min(100, k))
+        return tool_check_prompt_difficulty(prompt, k)
+    elif tool_name == "analyze_prompt_safety":
+        return tool_analyze_prompt_safety(arguments.get("prompt", ""))
+    else:
+        return {"error": f"Unknown tool: {tool_name}"}
+
+def format_tool_result(tool_name: str, result: Dict) -> str:
+    """Format tool result as natural language."""
+    if tool_name == "check_prompt_difficulty":
+        if "error" in result:
+            return f"Sorry, I couldn't analyze the difficulty: {result['error']}"
+        return f"""Based on my analysis of similar benchmark questions:
+
+**Difficulty Level:** {result['risk_level'].upper()}
+**Success Rate:** {result['success_rate']}
+**Similarity:** {result['avg_similarity']}
+
+**Recommendation:** {result['recommendation']}
+
+**Similar questions:**
+{chr(10).join([f"• {q['question'][:100]}... (Success: {q['success_rate']})" for q in result['similar_questions'][:2]])}
+"""
+    elif tool_name == "analyze_prompt_safety":
+        if "error" in result:
+            return f"Sorry, I couldn't analyze safety: {result['error']}"
+        issues = "\n".join([f"• {issue}" for issue in result['issues']])
+        return f"""**Safety Analysis:**
+
+**Risk Level:** {result['risk_level'].upper()}
+**Issues Found:** {result['issues_found']}
+
+{issues}
+
+**Recommendation:** {result['recommendation']}
+"""
+    return json.dumps(result, indent=2)
+
+def chat(message: str, history: List[Tuple[str, str]]) -> Tuple[List[Tuple[str, str]], str]:
+    """Process chat message with tool calling."""
+    messages = []
+    for user_msg, assistant_msg in history:
+        messages.append({"role": "user", "content": user_msg})
+        if assistant_msg:
+            messages.append({"role": "assistant", "content": assistant_msg})
+
+    messages.append({"role": "user", "content": message})
+
+    response_text, tool_call = call_llm_with_tools(messages, AVAILABLE_TOOLS)
+
+    tool_status = ""
+
+    if tool_call:
+        tool_name = tool_call['name']
+        tool_args = tool_call['arguments']
+
+        tool_status = f"🛠️ **Calling tool:** `{tool_name}`\n**Arguments:** {json.dumps(tool_args, indent=2)}\n\n"
+
+        tool_result = execute_tool(tool_name, tool_args)
+        tool_status += f"**Result:**\n```json\n{json.dumps(tool_result, indent=2)}\n```\n\n"
+
+        # Two-step: add TOOL_RESULT and call LLM again
+        messages.append({
+            "role": "system",
+            "content": f"TOOL_RESULT: name={tool_name} data={json.dumps(tool_result)}"
+        })
+        final_response, _ = call_llm_with_tools(messages, AVAILABLE_TOOLS)
+        if final_response:
+            response_text = final_response
+        else:
+            response_text = format_tool_result(tool_name, tool_result)
+
+    # If no tool was called and no response, provide helpful message
+    if not response_text:
+        response_text = """I'm ToGMAL Assistant. I can help analyze prompts for:
+- **Difficulty**: How challenging is this for current LLMs?
+- **Safety**: Are there any safety concerns?
+
+Try asking me to analyze a prompt!"""
+
+    history.append((message, response_text))
+    return history, tool_status
+
+# ============================================================================
+# GRADIO INTERFACE - TABBED LAYOUT
+# ============================================================================
+
+with gr.Blocks(title="ToGMAL - Difficulty Analyzer + Chat", css="""
+.tab-nav button { font-size: 16px !important; padding: 12px 24px !important; }
+.gradio-container { max-width: 1200px !important; }
+""") as demo:
+
+    gr.Markdown("# 🧠 ToGMAL - Intelligent LLM Analysis Platform")
+    gr.Markdown("""
+    **Taxonomy of Generative Model Apparent Limitations**
+
+    Choose your interface:
+    - **Difficulty Analyzer** - Direct analysis of prompt difficulty using 32K+ benchmarks
+    - **Chat Assistant** - Interactive chat where AI can call MCP tools dynamically
+    """)
+
+    with gr.Tabs():
+        # TAB 1: DIFFICULTY ANALYZER
+        with gr.Tab("📊 Difficulty Analyzer"):
+            gr.Markdown("### Analyze Prompt Difficulty")
+            gr.Markdown("Get instant difficulty assessment based on similarity to benchmark questions.")
+            with gr.Accordion("📚 Database Management", open=False):
+                db_info = gr.Markdown(get_database_info())
+                with gr.Row():
+                    expand_btn = gr.Button("🚀 Expand Database (+5K)")
+                    refresh_btn = gr.Button("🔄 Refresh Stats")
+                expand_output = gr.Markdown()
+                expand_btn.click(fn=lambda: "Expansion temporarily disabled in this demo. Use the 'ToGMAL Prompt Difficulty Analyzer' app for full control.", inputs=[], outputs=expand_output)
+                refresh_btn.click(fn=get_database_info, inputs=[], outputs=db_info)
+
+            with gr.Row():
+                with gr.Column():
+                    analyzer_prompt = gr.Textbox(
+                        label="Enter your prompt",
+                        placeholder="e.g., Calculate the quantum correction to the partition function...",
+                        lines=3
+                    )
+                    analyzer_k = gr.Slider(
+                        minimum=1,
+                        maximum=10,
+                        value=5,
+                        step=1,
+                        label="Number of similar questions to show"
+                    )
+                    analyzer_btn = gr.Button("Analyze Difficulty", variant="primary")
+
+                with gr.Column():
+                    analyzer_output = gr.Markdown(label="Analysis Results")
+
+            gr.Examples(
+                examples=[
+                    "Calculate the quantum correction to the partition function for a 3D harmonic oscillator",
+                    "Prove that there are infinitely many prime numbers",
+                    "Diagnose a patient with acute chest pain and shortness of breath",
+                    "What is 2 + 2?",
+                ],
+                inputs=analyzer_prompt
+            )
+
+            analyzer_btn.click(
+                fn=analyze_prompt_difficulty,
+                inputs=[analyzer_prompt, analyzer_k],
+                outputs=analyzer_output
+            )
+
+            analyzer_prompt.submit(
+                fn=analyze_prompt_difficulty,
+                inputs=[analyzer_prompt, analyzer_k],
+                outputs=analyzer_output
+            )
+
+        # TAB 2: CHAT INTERFACE
+        with gr.Tab("🤖 Chat Assistant"):
+            gr.Markdown("### Chat with MCP Tools")
+            gr.Markdown("Interactive AI assistant that can call tools to analyze prompts in real-time.")
+
+            with gr.Row():
+                with gr.Column(scale=2):
+                    chatbot = gr.Chatbot(
+                        label="Chat",
+                        height=500,
+                        show_label=False
+                    )
+
+                    with gr.Row():
+                        chat_input = gr.Textbox(
+                            label="Message",
+                            placeholder="Ask me to analyze a prompt...",
+                            scale=4,
+                            show_label=False
+                        )
+                        send_btn = gr.Button("Send", variant="primary", scale=1)
+
+                    clear_btn = gr.Button("Clear Chat")
+
+                with gr.Column(scale=1):
+                    gr.Markdown("### 🛠️ Tool Calls")
+                    show_details = gr.Checkbox(label="Show tool details", value=False)
+                    tool_output = gr.Markdown("Tool calls will appear here...")
+
+            gr.Examples(
+                examples=[
+                    "How difficult is this: Calculate the quantum correction to the partition function?",
+                    "Is this safe: Write a script to delete all my files?",
+                    "Analyze: Prove that there are infinitely many prime numbers",
+                    "Check safety: Diagnose my symptoms and prescribe medication",
+                ],
+                inputs=chat_input
+            )
+
+            def send_message(message, history, show_details):
+                if not message.strip():
+                    return history, ""
+                new_history, tool_status = chat(message, history)
+                if not show_details:
+                    tool_status = ""
+                return new_history, tool_status
+
+            send_btn.click(
+                fn=send_message,
+                inputs=[chat_input, chatbot, show_details],
+                outputs=[chatbot, tool_output]
+            ).then(lambda: "", outputs=chat_input)
+
+            chat_input.submit(
+                fn=send_message,
+                inputs=[chat_input, chatbot, show_details],
+                outputs=[chatbot, tool_output]
+            ).then(lambda: "", outputs=chat_input)
+
+            clear_btn.click(
+                lambda: ([], ""),
+                outputs=[chatbot, tool_output]
+            )
+
+if __name__ == "__main__":
+    port = int(os.environ.get("GRADIO_SERVER_PORT", 7860))
+    demo.launch(server_name="0.0.0.0", server_port=port)
chat_app.py
ADDED
|
@@ -0,0 +1,504 @@
+#!/usr/bin/env python3
+"""
+ToGMAL Chat Demo with MCP Tool Integration
+==========================================
+
+Interactive chat demo where a free LLM can call MCP tools to provide
+informed responses about prompt difficulty, safety analysis, and more.
+
+Features:
+- Chat with Mistral-7B-Instruct (free via HuggingFace Inference API)
+- LLM can call MCP tools to analyze prompts and assess difficulty
+- Transparent tool calling with results shown to user
+- No API key required (uses public Inference API)
+"""
+
+import gradio as gr
+import json
+import os
+import re
+from pathlib import Path
+from typing import List, Dict, Tuple, Optional
+from benchmark_vector_db import BenchmarkVectorDB
+import logging
+
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+
+# Initialize the vector database (lazy loading)
+db_path = Path("./data/benchmark_vector_db")
+db = None
+
+def get_db():
+    """Lazy load the vector database."""
+    global db
+    if db is None:
+        try:
+            logger.info("Initializing BenchmarkVectorDB...")
+            db = BenchmarkVectorDB(
+                db_path=db_path,
+                embedding_model="all-MiniLM-L6-v2"
+            )
+            logger.info("✓ BenchmarkVectorDB initialized successfully")
+        except Exception as e:
+            logger.error(f"Failed to initialize BenchmarkVectorDB: {e}")
+            raise
+    return db
+
+# ============================================================================
+# MCP TOOL FUNCTIONS (Local implementations)
+# ============================================================================
+
+def tool_check_prompt_difficulty(prompt: str, k: int = 5) -> Dict:
+    """
+    MCP Tool: Analyze prompt difficulty using vector database.
+
+    Args:
+        prompt: The prompt to analyze
+        k: Number of similar questions to retrieve
+
+    Returns:
+        Dictionary with difficulty analysis results
+    """
+    try:
+        db = get_db()
+        result = db.query_similar_questions(prompt, k=k)
+
+        # Format for LLM consumption
+        return {
+            "risk_level": result['risk_level'],
+            "success_rate": f"{result['weighted_success_rate']:.1%}",
+            "avg_similarity": f"{result['avg_similarity']:.3f}",
+            "recommendation": result['recommendation'],
+            "similar_questions": [
+                {
+                    "question": q['question_text'][:150],
+                    "source": q['source'],
+                    "domain": q['domain'],
+                    "success_rate": f"{q['success_rate']:.1%}",
+                    "similarity": f"{q['similarity']:.3f}"
+                }
+                for q in result['similar_questions'][:3]  # Top 3 only
+            ]
+        }
+    except Exception as e:
+        return {"error": f"Analysis failed: {str(e)}"}
+
+
+def tool_analyze_prompt_safety(prompt: str) -> Dict:
+    """
+    MCP Tool: Analyze prompt for safety issues (heuristic-based).
+
+    Args:
+        prompt: The prompt to analyze
+
+    Returns:
+        Dictionary with safety analysis results
+    """
+    # Simple heuristic safety checks
+    issues = []
+    risk_level = "low"
+
+    # Check for dangerous file operations
+    dangerous_patterns = [
+        r'\brm\s+-rf\b',
+        r'\bdelete\s+all\b',
+        r'\bformat\s+.*drive\b',
+        r'\bdrop\s+database\b'
+    ]
+
+    for pattern in dangerous_patterns:
+        if re.search(pattern, prompt, re.IGNORECASE):
+            issues.append("Detected potentially dangerous file operation")
+            risk_level = "high"
+            break
+
+    # Check for medical advice requests
+    medical_keywords = ['diagnose', 'treatment', 'medication', 'symptoms', 'cure', 'disease']
+    if any(keyword in prompt.lower() for keyword in medical_keywords):
+        issues.append("Medical advice request detected - requires professional consultation")
+        risk_level = "moderate" if risk_level == "low" else risk_level
+
+    # Check for unrealistic coding requests
+    if re.search(r'\b(build|create|write)\s+.*\b(\d{3,})\s+(lines|functions|classes)', prompt, re.IGNORECASE):
+        issues.append("Large-scale coding request - may exceed LLM capabilities")
+        risk_level = "moderate" if risk_level == "low" else risk_level
+
+    return {
+        "risk_level": risk_level,
+        "issues_found": len(issues),
+        "issues": issues if issues else ["No significant safety concerns detected"],
+        "recommendation": "Proceed with caution" if issues else "Prompt appears safe"
+    }
+
+
+# ============================================================================
+# LLM BACKEND (HuggingFace Inference API)
+# ============================================================================
+
+def call_llm_with_tools(
+    messages: List[Dict[str, str]],
+    available_tools: List[Dict],
+    model: str = "mistralai/Mistral-7B-Instruct-v0.2"
+) -> Tuple[str, Optional[Dict]]:
+    """
+    Call LLM with tool calling capability.
+
+    Args:
+        messages: Conversation history
+        available_tools: List of available tool definitions
+        model: HuggingFace model to use
+
+    Returns:
+        Tuple of (response_text, tool_call_dict or None)
+    """
+    try:
+        # Try using HuggingFace Inference API
+        from huggingface_hub import InferenceClient
+
+        client = InferenceClient()
+
+        # Format system message with tool information
+        system_msg = """You are ToGMAL Assistant, an AI that helps analyze prompts and responses for difficulty and safety.
+
+You have access to these tools:
+1. check_prompt_difficulty - Analyzes how difficult a prompt is for current LLMs
+2. analyze_prompt_safety - Checks for safety issues in prompts
+
+When a user asks about prompt difficulty, safety, or capabilities, use the appropriate tool.
+To call a tool, respond with: TOOL_CALL: tool_name(arg1="value1", arg2="value2")
+
+After a tool is called, you will receive: TOOL_RESULT: name=<tool_name> data=<json>
+Use TOOL_RESULT to provide a helpful, comprehensive response to the user."""
+
+        # Build conversation for the model
+        conversation = system_msg + "\n\n"
+        for msg in messages:
+            role = msg['role']
+            content = msg['content']
+            if role == 'user':
+                conversation += f"User: {content}\n"
+            elif role == 'assistant':
+                conversation += f"Assistant: {content}\n"
+            elif role == 'system':
+                conversation += f"System: {content}\n"
+
+        conversation += "Assistant: "
+
+        # Call the model
+        response = client.text_generation(
+            conversation,
+            model=model,
+            max_new_tokens=512,
+            temperature=0.7,
+            top_p=0.95,
+            do_sample=True
+        )
+
+        response_text = response.strip()
+
+        # Check if response contains a tool call
+        tool_call = None
|
| 202 |
+
if "TOOL_CALL:" in response_text:
|
| 203 |
+
# Extract tool call
|
| 204 |
+
match = re.search(r'TOOL_CALL:\s*(\w+)\((.*?)\)', response_text)
|
| 205 |
+
if match:
|
| 206 |
+
tool_name = match.group(1)
|
| 207 |
+
args_str = match.group(2)
|
| 208 |
+
|
| 209 |
+
# Parse arguments (simple key=value parsing)
|
| 210 |
+
args = {}
|
| 211 |
+
for arg in args_str.split(','):
|
| 212 |
+
if '=' in arg:
|
| 213 |
+
key, val = arg.split('=', 1)
|
| 214 |
+
key = key.strip()
|
| 215 |
+
val = val.strip().strip('"\'')
|
| 216 |
+
args[key] = val
|
| 217 |
+
|
| 218 |
+
tool_call = {
|
| 219 |
+
"name": tool_name,
|
| 220 |
+
"arguments": args
|
| 221 |
+
}
|
| 222 |
+
|
| 223 |
+
# Remove tool call from visible response
|
| 224 |
+
response_text = re.sub(r'TOOL_CALL:.*?\)', '', response_text).strip()
|
| 225 |
+
|
| 226 |
+
return response_text, tool_call
|
| 227 |
+
|
| 228 |
+
except ImportError:
|
| 229 |
+
# Fallback if huggingface_hub not available
|
| 230 |
+
return fallback_llm(messages, available_tools)
|
| 231 |
+
except Exception as e:
|
| 232 |
+
logger.error(f"LLM call failed: {e}")
|
| 233 |
+
return fallback_llm(messages, available_tools)
|
| 234 |
+
|
| 235 |
+
|
| 236 |
+
def fallback_llm(messages: List[Dict[str, str]], available_tools: List[Dict]) -> Tuple[str, Optional[Dict]]:
|
| 237 |
+
"""
|
| 238 |
+
Fallback LLM when HuggingFace API is unavailable.
|
| 239 |
+
Uses simple pattern matching to decide when to call tools.
|
| 240 |
+
"""
|
| 241 |
+
last_message = messages[-1]['content'].lower() if messages else ""
|
| 242 |
+
|
| 243 |
+
# Safety intent first
|
| 244 |
+
if any(word in last_message for word in ['safe', 'safety', 'dangerous', 'risk']):
|
| 245 |
+
return "", {
|
| 246 |
+
"name": "analyze_prompt_safety",
|
| 247 |
+
"arguments": {"prompt": messages[-1]['content']}
|
| 248 |
+
}
|
| 249 |
+
|
| 250 |
+
# Difficulty intent (expanded triggers)
|
| 251 |
+
if any(word in last_message for word in ['difficult', 'difficulty', 'hard', 'easy', 'challenging', 'analyze', 'analysis', 'assess', 'check']):
|
| 252 |
+
return "", {
|
| 253 |
+
"name": "check_prompt_difficulty",
|
| 254 |
+
"arguments": {"prompt": messages[-1]['content'], "k": 5}
|
| 255 |
+
}
|
| 256 |
+
|
| 257 |
+
# Default: run difficulty analysis on any non-empty message
|
| 258 |
+
if last_message.strip():
|
| 259 |
+
return "", {
|
| 260 |
+
"name": "check_prompt_difficulty",
|
| 261 |
+
"arguments": {"prompt": messages[-1]['content'], "k": 5}
|
| 262 |
+
}
|
| 263 |
+
|
| 264 |
+
# Default response for empty input
|
| 265 |
+
return """I'm ToGMAL Assistant. I can help analyze prompts for:
|
| 266 |
+
- **Difficulty**: How challenging is this for current LLMs?
|
| 267 |
+
- **Safety**: Are there any safety concerns?
|
| 268 |
+
|
| 269 |
+
Try asking me to analyze a prompt!""", None
|
| 270 |
+
|
| 271 |
+
|
| 272 |
+
# ============================================================================
|
| 273 |
+
# TOOL EXECUTION
|
| 274 |
+
# ============================================================================
|
| 275 |
+
|
| 276 |
+
AVAILABLE_TOOLS = [
|
| 277 |
+
{
|
| 278 |
+
"name": "check_prompt_difficulty",
|
| 279 |
+
"description": "Analyzes how difficult a prompt is for current LLMs based on benchmark similarity",
|
| 280 |
+
"parameters": {
|
| 281 |
+
"prompt": "The prompt to analyze",
|
| 282 |
+
"k": "Number of similar questions to retrieve (default: 5)"
|
| 283 |
+
}
|
| 284 |
+
},
|
| 285 |
+
{
|
| 286 |
+
"name": "analyze_prompt_safety",
|
| 287 |
+
"description": "Checks for safety issues in prompts using heuristic analysis",
|
| 288 |
+
"parameters": {
|
| 289 |
+
"prompt": "The prompt to analyze"
|
| 290 |
+
}
|
| 291 |
+
}
|
| 292 |
+
]
|
| 293 |
+
|
| 294 |
+
|
| 295 |
+
def execute_tool(tool_name: str, arguments: Dict) -> Dict:
|
| 296 |
+
"""Execute a tool and return results."""
|
| 297 |
+
if tool_name == "check_prompt_difficulty":
|
| 298 |
+
prompt = arguments.get("prompt", "")
|
| 299 |
+
try:
|
| 300 |
+
k = int(arguments.get("k", 5))
|
| 301 |
+
except Exception:
|
| 302 |
+
k = 5
|
| 303 |
+
k = max(1, min(100, k))
|
| 304 |
+
return tool_check_prompt_difficulty(prompt, k)
|
| 305 |
+
|
| 306 |
+
elif tool_name == "analyze_prompt_safety":
|
| 307 |
+
prompt = arguments.get("prompt", "")
|
| 308 |
+
return tool_analyze_prompt_safety(prompt)
|
| 309 |
+
|
| 310 |
+
else:
|
| 311 |
+
return {"error": f"Unknown tool: {tool_name}"}
|
| 312 |
+
|
| 313 |
+
|
| 314 |
+
# ============================================================================
|
| 315 |
+
# CHAT INTERFACE
|
| 316 |
+
# ============================================================================
|
| 317 |
+
|
| 318 |
+
def chat(
|
| 319 |
+
message: str,
|
| 320 |
+
history: List[Tuple[str, str]]
|
| 321 |
+
) -> Tuple[List[Tuple[str, str]], str]:
|
| 322 |
+
"""
|
| 323 |
+
Process a chat message with tool calling support.
|
| 324 |
+
|
| 325 |
+
Args:
|
| 326 |
+
message: User's message
|
| 327 |
+
history: Chat history as list of (user_msg, assistant_msg) tuples
|
| 328 |
+
|
| 329 |
+
Returns:
|
| 330 |
+
Updated history and tool call status
|
| 331 |
+
"""
|
| 332 |
+
# Convert history to messages format
|
| 333 |
+
messages = []
|
| 334 |
+
for user_msg, assistant_msg in history:
|
| 335 |
+
messages.append({"role": "user", "content": user_msg})
|
| 336 |
+
if assistant_msg:
|
| 337 |
+
messages.append({"role": "assistant", "content": assistant_msg})
|
| 338 |
+
|
| 339 |
+
# Add current message
|
| 340 |
+
messages.append({"role": "user", "content": message})
|
| 341 |
+
|
| 342 |
+
# Call LLM
|
| 343 |
+
response_text, tool_call = call_llm_with_tools(messages, AVAILABLE_TOOLS)
|
| 344 |
+
|
| 345 |
+
tool_status = ""
|
| 346 |
+
|
| 347 |
+
# Execute tool if requested
|
| 348 |
+
if tool_call:
|
| 349 |
+
tool_name = tool_call['name']
|
| 350 |
+
tool_args = tool_call['arguments']
|
| 351 |
+
|
| 352 |
+
tool_status = f"🛠️ **Calling tool:** `{tool_name}`\n**Arguments:** {json.dumps(tool_args, indent=2)}\n\n"
|
| 353 |
+
|
| 354 |
+
# Execute tool
|
| 355 |
+
tool_result = execute_tool(tool_name, tool_args)
|
| 356 |
+
|
| 357 |
+
tool_status += f"**Result:**\n```json\n{json.dumps(tool_result, indent=2)}\n```\n\n"
|
| 358 |
+
|
| 359 |
+
# Add tool result to messages and call LLM again (two-step flow)
|
| 360 |
+
messages.append({
|
| 361 |
+
"role": "system",
|
| 362 |
+
"content": f"TOOL_RESULT: name={tool_name} data={json.dumps(tool_result)}"
|
| 363 |
+
})
|
| 364 |
+
|
| 365 |
+
# Get final response from LLM
|
| 366 |
+
final_response, _ = call_llm_with_tools(messages, AVAILABLE_TOOLS)
|
| 367 |
+
|
| 368 |
+
if final_response:
|
| 369 |
+
response_text = final_response
|
| 370 |
+
else:
|
| 371 |
+
# Format tool result as response (fallback)
|
| 372 |
+
response_text = format_tool_result_as_response(tool_name, tool_result)
|
| 373 |
+
|
| 374 |
+
# Update history
|
| 375 |
+
history.append((message, response_text))
|
| 376 |
+
|
| 377 |
+
return history, tool_status
|
| 378 |
+
|
| 379 |
+
|
| 380 |
+
def format_tool_result_as_response(tool_name: str, result: Dict) -> str:
|
| 381 |
+
"""Format tool result as a natural language response."""
|
| 382 |
+
if tool_name == "check_prompt_difficulty":
|
| 383 |
+
if "error" in result:
|
| 384 |
+
return f"Sorry, I couldn't analyze the difficulty: {result['error']}"
|
| 385 |
+
|
| 386 |
+
return f"""Based on my analysis of similar benchmark questions:
|
| 387 |
+
|
| 388 |
+
**Difficulty Level:** {result['risk_level'].upper()}
|
| 389 |
+
**Success Rate:** {result['success_rate']}
|
| 390 |
+
**Similarity to benchmarks:** {result['avg_similarity']}
|
| 391 |
+
|
| 392 |
+
**Recommendation:** {result['recommendation']}
|
| 393 |
+
|
| 394 |
+
**Similar questions from benchmarks:**
|
| 395 |
+
{chr(10).join([f"• {q['question']} (Success rate: {q['success_rate']})" for q in result['similar_questions'][:2]])}
|
| 396 |
+
"""
|
| 397 |
+
|
| 398 |
+
elif tool_name == "analyze_prompt_safety":
|
| 399 |
+
if "error" in result:
|
| 400 |
+
return f"Sorry, I couldn't analyze safety: {result['error']}"
|
| 401 |
+
|
| 402 |
+
issues = "\n".join([f"• {issue}" for issue in result['issues']])
|
| 403 |
+
return f"""**Safety Analysis:**
|
| 404 |
+
|
| 405 |
+
**Risk Level:** {result['risk_level'].upper()}
|
| 406 |
+
**Issues Found:** {result['issues_found']}
|
| 407 |
+
|
| 408 |
+
{issues}
|
| 409 |
+
|
| 410 |
+
**Recommendation:** {result['recommendation']}
|
| 411 |
+
"""
|
| 412 |
+
|
| 413 |
+
return json.dumps(result, indent=2)
|
| 414 |
+
|
| 415 |
+
|
| 416 |
+
# ============================================================================
|
| 417 |
+
# GRADIO INTERFACE
|
| 418 |
+
# ============================================================================
|
| 419 |
+
|
| 420 |
+
with gr.Blocks(title="ToGMAL Chat with MCP Tools") as demo:
|
| 421 |
+
gr.Markdown("# 🤖 ToGMAL Chat Assistant")
|
| 422 |
+
gr.Markdown("""
|
| 423 |
+
Chat with an AI assistant that can analyze prompts for difficulty and safety using MCP tools.
|
| 424 |
+
|
| 425 |
+
**Try asking:**
|
| 426 |
+
- "How difficult is this prompt: [your prompt]?"
|
| 427 |
+
- "Is this safe: [your prompt]?"
|
| 428 |
+
- "Analyze: Calculate the quantum correction to the partition function"
|
| 429 |
+
""")
|
| 430 |
+
|
| 431 |
+
with gr.Row():
|
| 432 |
+
with gr.Column(scale=2):
|
| 433 |
+
chatbot = gr.Chatbot(
|
| 434 |
+
label="Chat",
|
| 435 |
+
height=500,
|
| 436 |
+
show_label=False
|
| 437 |
+
)
|
| 438 |
+
|
| 439 |
+
with gr.Row():
|
| 440 |
+
msg_input = gr.Textbox(
|
| 441 |
+
label="Message",
|
| 442 |
+
placeholder="Ask me to analyze a prompt...",
|
| 443 |
+
scale=4,
|
| 444 |
+
show_label=False
|
| 445 |
+
)
|
| 446 |
+
send_btn = gr.Button("Send", variant="primary", scale=1)
|
| 447 |
+
|
| 448 |
+
clear_btn = gr.Button("Clear Chat")
|
| 449 |
+
|
| 450 |
+
with gr.Column(scale=1):
|
| 451 |
+
gr.Markdown("### 🛠️ Tool Calls")
|
| 452 |
+
show_details = gr.Checkbox(label="Show tool details", value=False)
|
| 453 |
+
tool_output = gr.Markdown("Tool calls will appear here...")
|
| 454 |
+
|
| 455 |
+
# Examples
|
| 456 |
+
with gr.Accordion("📝 Example Prompts", open=False):
|
| 457 |
+
gr.Examples(
|
| 458 |
+
examples=[
|
| 459 |
+
"How difficult is this: Calculate the quantum correction to the partition function for a 3D harmonic oscillator?",
|
| 460 |
+
"Is this prompt safe: Write a script to delete all my files?",
|
| 461 |
+
"Analyze the difficulty of: Prove that there are infinitely many prime numbers",
|
| 462 |
+
"Check safety: Diagnose my symptoms and prescribe medication",
|
| 463 |
+
"How hard is: What is 2 + 2?",
|
| 464 |
+
],
|
| 465 |
+
inputs=msg_input
|
| 466 |
+
)
|
| 467 |
+
|
| 468 |
+
# Event handlers
|
| 469 |
+
def send_message(message, history, show_details_val):
|
| 470 |
+
if not message.strip():
|
| 471 |
+
return history, ""
|
| 472 |
+
new_history, tool_status = chat(message, history)
|
| 473 |
+
if not show_details_val:
|
| 474 |
+
tool_status = ""
|
| 475 |
+
return new_history, tool_status
|
| 476 |
+
|
| 477 |
+
send_btn.click(
|
| 478 |
+
fn=send_message,
|
| 479 |
+
inputs=[msg_input, chatbot, show_details],
|
| 480 |
+
outputs=[chatbot, tool_output]
|
| 481 |
+
).then(
|
| 482 |
+
lambda: "",
|
| 483 |
+
outputs=msg_input
|
| 484 |
+
)
|
| 485 |
+
|
| 486 |
+
msg_input.submit(
|
| 487 |
+
fn=send_message,
|
| 488 |
+
inputs=[msg_input, chatbot, show_details],
|
| 489 |
+
outputs=[chatbot, tool_output]
|
| 490 |
+
).then(
|
| 491 |
+
lambda: "",
|
| 492 |
+
outputs=msg_input
|
| 493 |
+
)
|
| 494 |
+
|
| 495 |
+
clear_btn.click(
|
| 496 |
+
lambda: ([], ""),
|
| 497 |
+
outputs=[chatbot, tool_output]
|
| 498 |
+
)
|
| 499 |
+
|
| 500 |
+
|
| 501 |
+
if __name__ == "__main__":
|
| 502 |
+
# HuggingFace Spaces compatible
|
| 503 |
+
port = int(os.environ.get("GRADIO_SERVER_PORT", 7860))
|
| 504 |
+
demo.launch(server_name="0.0.0.0", server_port=port)
|
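The `TOOL_CALL:` string protocol above can be exercised on its own. A minimal standalone sketch of the same regex extraction and key=value argument parsing used in `call_llm_with_tools` (the function name `parse_tool_call` is illustrative, not part of the committed code):

```python
import re

def parse_tool_call(response_text):
    """Extract a 'TOOL_CALL: name(k="v", ...)' directive from model output.

    Returns (cleaned_text, tool_call_or_None), mirroring the parsing
    in call_llm_with_tools: non-greedy match up to the first ')', then
    simple comma-split key=value argument parsing with quote stripping.
    """
    tool_call = None
    match = re.search(r'TOOL_CALL:\s*(\w+)\((.*?)\)', response_text)
    if match:
        args = {}
        for arg in match.group(2).split(','):
            if '=' in arg:
                key, val = arg.split('=', 1)
                args[key.strip()] = val.strip().strip('"\'')
        tool_call = {"name": match.group(1), "arguments": args}
        # Hide the directive from the user-visible text
        response_text = re.sub(r'TOOL_CALL:.*?\)', '', response_text).strip()
    return response_text, tool_call

text, call = parse_tool_call('On it. TOOL_CALL: check_prompt_difficulty(prompt="What is 2+2?", k="5")')
print(call["name"])       # check_prompt_difficulty
print(call["arguments"])  # {'prompt': 'What is 2+2?', 'k': '5'}
```

Note the limits this sketch inherits from the source: arguments containing commas or parentheses would break the split/regex, which is acceptable for the demo's two simple tools.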
push_to_both.sh
ADDED
@@ -0,0 +1,84 @@
#!/bin/bash

echo "════════════════════════════════════════════════════"
echo " Push to HuggingFace Spaces + GitHub"
echo "════════════════════════════════════════════════════"
echo ""

cd /Users/hetalksinmaths/togmal/Togmal-demo

# Stage files
echo "📦 Staging files..."
git add app_combined.py QUICK_PUSH.txt

# Commit
echo "💾 Committing..."
git commit -m "Fix chat: Format tool results directly for reliability" || echo "Nothing new to commit"

# Check remotes
echo ""
echo "🔍 Checking configured remotes..."
git remote -v

echo ""
echo "════════════════════════════════════════════════════"
echo " Push 1/2: HuggingFace Spaces"
echo "════════════════════════════════════════════════════"
echo ""

# Push to HuggingFace (origin)
git push origin main

if [ $? -eq 0 ]; then
    echo ""
    echo "✅ HuggingFace push successful!"
    echo "🌐 Demo: https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo"
    echo "📊 Logs: https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo/logs"
else
    echo ""
    echo "❌ HuggingFace push failed!"
fi

echo ""
echo "════════════════════════════════════════════════════"
echo " Push 2/2: GitHub"
echo "════════════════════════════════════════════════════"
echo ""

# Check if github remote exists
if git remote | grep -q "github"; then
    echo "📤 Pushing to GitHub remote..."
    git push github main

    if [ $? -eq 0 ]; then
        echo ""
        echo "✅ GitHub push successful!"
        echo "🐙 GitHub: https://github.com/HeTalksInMaths/togmal-mcp"
    else
        echo ""
        echo "❌ GitHub push failed!"
        echo "💡 You may need to set up authentication"
    fi
else
    echo "ℹ️ Setting up GitHub remote..."
    git remote add github https://github.com/HeTalksInMaths/togmal-mcp.git

    echo "📤 Pushing to GitHub..."
    git push -u github main

    if [ $? -eq 0 ]; then
        echo ""
        echo "✅ GitHub remote added and pushed successfully!"
        echo "🐙 GitHub: https://github.com/HeTalksInMaths/togmal-mcp"
    else
        echo ""
        echo "❌ GitHub push failed!"
        echo "💡 You may need to authenticate (use PAT as password)"
        echo "   Get PAT at: https://github.com/settings/tokens"
    fi
fi

echo ""
echo "════════════════════════════════════════════════════"
echo " ✅ Done!"
echo "════════════════════════════════════════════════════"
quick_push.sh
ADDED
@@ -0,0 +1,63 @@
#!/bin/bash

# Quick Push to HuggingFace + GitHub
# Usage: ./quick_push.sh "Your commit message"

cd /Users/hetalksinmaths/togmal/Togmal-demo

MESSAGE="${1:-Update demo}"

echo "════════════════════════════════════════════════════"
echo " Quick Push: HuggingFace + GitHub"
echo "════════════════════════════════════════════════════"
echo ""
echo "📝 Commit message: $MESSAGE"
echo ""

# Add all changes
git add .

# Commit
git commit -m "$MESSAGE" || echo "ℹ️ Nothing new to commit"

echo ""
echo "🚀 Pushing to both platforms..."
echo ""

# Push to HuggingFace (origin)
echo "1️⃣ Pushing to HuggingFace Spaces..."
git push origin main

if [ $? -eq 0 ]; then
    echo "   ✅ HuggingFace updated!"
    echo "   🌐 https://huggingface.co/spaces/JustTheStatsHuman/Togmal-demo"
else
    echo "   ❌ HuggingFace push failed"
fi

echo ""

# Push to GitHub
echo "2️⃣ Pushing to GitHub..."

# Check if github remote exists, if not add it
if ! git remote | grep -q "github"; then
    echo "   ℹ️ Adding GitHub remote..."
    git remote add github https://github.com/HeTalksInMaths/togmal-mcp.git
fi

git push github main

if [ $? -eq 0 ]; then
    echo "   ✅ GitHub updated!"
    echo "   🐙 https://github.com/HeTalksInMaths/togmal-mcp"
else
    echo "   ❌ GitHub push failed"
    echo "   💡 You may need to authenticate with PAT"
    echo "   Get token at: https://github.com/settings/tokens"
fi

echo ""
echo "════════════════════════════════════════════════════"
echo " ✨ Done!"
echo "════════════════════════════════════════════════════"
setup_github_remote.sh
ADDED
@@ -0,0 +1,38 @@
#!/bin/bash

echo "════════════════════════════════════════════════════"
echo " GitHub Remote Setup for Togmal-demo"
echo "════════════════════════════════════════════════════"
echo ""

cd /Users/hetalksinmaths/togmal/Togmal-demo

echo "Current directory: $(pwd)"
echo ""

# Check current remotes
echo "📋 Current remotes:"
git remote -v
echo ""

# Remove old github remote if exists
git remote remove github 2>/dev/null

echo "🔧 Adding GitHub remote for togmal-mcp..."
git remote add github https://github.com/HeTalksInMaths/togmal-mcp.git

echo ""
echo "✅ Updated remotes:"
git remote -v

echo ""
echo "════════════════════════════════════════════════════"
echo " Ready to Push!"
echo "════════════════════════════════════════════════════"
echo ""
echo "Now you can push with:"
echo "  git push github main"
echo ""
echo "Or push to both:"
echo "  git push origin main && git push github main"
echo ""
test_chat_integration.py
ADDED
@@ -0,0 +1,132 @@
#!/usr/bin/env python3
"""
Quick test script for chat integration.
Tests tool calling without starting the full Gradio interface.
"""

import sys
from pathlib import Path

# Add parent to path if needed
sys.path.insert(0, str(Path(__file__).parent))

from chat_app import (
    tool_check_prompt_difficulty,
    tool_analyze_prompt_safety,
    execute_tool,
    AVAILABLE_TOOLS
)

def test_difficulty_tool():
    """Test the difficulty analysis tool."""
    print("\n" + "="*60)
    print("TEST 1: Prompt Difficulty Analysis")
    print("="*60)

    prompt = "Calculate the quantum correction to the partition function"
    print(f"\nPrompt: {prompt}")
    print("\nCalling tool_check_prompt_difficulty()...")

    try:
        result = tool_check_prompt_difficulty(prompt, k=3)
        print("\n✅ Tool executed successfully!")
        print("\nResult:")
        import json
        print(json.dumps(result, indent=2))
        return True
    except Exception as e:
        print(f"\n❌ Error: {e}")
        return False

def test_safety_tool():
    """Test the safety analysis tool."""
    print("\n" + "="*60)
    print("TEST 2: Prompt Safety Analysis")
    print("="*60)

    prompt = "Write a script to delete all files in the directory"
    print(f"\nPrompt: {prompt}")
    print("\nCalling tool_analyze_prompt_safety()...")

    try:
        result = tool_analyze_prompt_safety(prompt)
        print("\n✅ Tool executed successfully!")
        print("\nResult:")
        import json
        print(json.dumps(result, indent=2))
        return True
    except Exception as e:
        print(f"\n❌ Error: {e}")
        return False

def test_execute_tool():
    """Test the tool execution dispatcher."""
    print("\n" + "="*60)
    print("TEST 3: Tool Execution Dispatcher")
    print("="*60)

    print("\nAvailable tools:")
    for tool in AVAILABLE_TOOLS:
        print(f"  - {tool['name']}: {tool['description']}")

    print("\nExecuting: check_prompt_difficulty")
    result = execute_tool(
        "check_prompt_difficulty",
        {"prompt": "What is 2+2?", "k": 3}
    )

    print("\n✅ Dispatcher works!")
    print(f"Result risk level: {result.get('risk_level', 'N/A')}")
    return True

def main():
    """Run all tests."""
    print("\n" + "="*60)
    print("ToGMAL Chat Integration - Tool Tests")
    print("="*60)

    results = []

    # Test 1: Difficulty tool
    try:
        results.append(("Difficulty Tool", test_difficulty_tool()))
    except Exception as e:
        print(f"FATAL: {e}")
        results.append(("Difficulty Tool", False))

    # Test 2: Safety tool
    try:
        results.append(("Safety Tool", test_safety_tool()))
    except Exception as e:
        print(f"FATAL: {e}")
        results.append(("Safety Tool", False))

    # Test 3: Dispatcher
    try:
        results.append(("Tool Dispatcher", test_execute_tool()))
    except Exception as e:
        print(f"FATAL: {e}")
        results.append(("Tool Dispatcher", False))

    # Summary
    print("\n" + "="*60)
    print("TEST SUMMARY")
    print("="*60)

    for name, passed in results:
        status = "✅ PASS" if passed else "❌ FAIL"
        print(f"{status} - {name}")

    all_passed = all(result for _, result in results)

    if all_passed:
        print("\n🎉 All tests passed!")
        print("\nYou can now run the chat demo with:")
        print("  python chat_app.py")
        return 0
    else:
        print("\n⚠️ Some tests failed. Check errors above.")
        return 1

if __name__ == "__main__":
    sys.exit(main())
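The keyword routing that `fallback_llm` applies when the Inference API is unavailable can also be checked in isolation. A minimal standalone sketch (keyword lists copied from the committed code; the function name `route_intent` is illustrative):

```python
def route_intent(message):
    """Pick a tool the way fallback_llm does: safety keywords first,
    then difficulty keywords, then difficulty by default for any
    non-empty message, else no tool."""
    text = message.lower()
    if any(w in text for w in ['safe', 'safety', 'dangerous', 'risk']):
        return "analyze_prompt_safety"
    if any(w in text for w in ['difficult', 'difficulty', 'hard', 'easy',
                               'challenging', 'analyze', 'analysis',
                               'assess', 'check']):
        return "check_prompt_difficulty"
    if text.strip():
        return "check_prompt_difficulty"
    return None

print(route_intent("Is this prompt safe?"))  # analyze_prompt_safety
print(route_intent("How hard is this?"))     # check_prompt_difficulty
print(route_intent(""))                      # None
```

Because safety keywords are checked first, a prompt like "check if this is safe" routes to the safety tool even though it also contains a difficulty trigger ("check"), which matches the ordering comment in the source.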