---
title: "Data Analysis App"
emoji: "📊"
colorFrom: "indigo"
colorTo: "blue"
sdk: "streamlit"
sdk_version: "1.39.0"
app_file: src/streamlit_app.py
pinned: false
license: "mit"
---

# 📊 Streamlit Data Analysis App (Gemini + Open-Source)

This Streamlit app lets you **upload CSV or Excel datasets**, automatically clean and preprocess them, create **quick visualizations**, and even get **AI-generated insights** powered by Gemini or open-source models.

---

## 🚀 Features
✅ Upload `.csv` or `.xlsx` datasets  
✅ Automatic data cleaning & standardization  
✅ Preprocessing pipeline (imputation, encoding, scaling)  
✅ Quick visualizations (histogram, boxplot, correlation heatmap, etc.)  
✅ Smart dataset summary and preview  
✅ Optional **Gemini AI insights** for dataset interpretation  
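
As a rough illustration of what "cleaning & standardization" can mean in practice, here is a minimal pandas sketch. It is an assumption, not the app's actual code: the helper name `standardize_columns` and the sample columns are hypothetical, and the real cleaning step may do more or less than this.

```python
import pandas as pd


def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: snake_case the column names and drop duplicate rows."""
    cleaned = df.copy()
    cleaned.columns = (
        cleaned.columns.astype(str)
        .str.strip()                                   # trim surrounding whitespace
        .str.lower()                                   # lowercase everything
        .str.replace(r"[^0-9a-z]+", "_", regex=True)   # non-alphanumeric runs -> "_"
        .str.strip("_")                                # drop leading/trailing "_"
    )
    return cleaned.drop_duplicates().reset_index(drop=True)


# Made-up sample data, for illustration only.
raw = pd.DataFrame(
    {" Hourly Rate (USD) ": [20, 20, 35], "Country": ["US", "US", "IN"]}
)
cleaned = standardize_columns(raw)
print(list(cleaned.columns))  # ['hourly_rate_usd', 'country']
print(len(cleaned))           # 2 (one duplicate row removed)
```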

---

## 🧠 LLM Integration (Optional)
You can enable AI-generated insights with **Gemini 2.0 Flash** or your own Hugging Face model.

### 🔑 To configure:
1. Go to your Space’s **Settings → Secrets** tab.  
2. Add the following secrets:
   - `GEMINI_API_KEY` = your Gemini API key
   - `HF_TOKEN` = your Hugging Face token (optional)
3. Save, then **Restart your Space**.

If you don’t add an API key, the app will still work for data cleaning and visualization.
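
Space secrets are exposed to the running app as environment variables, so the optional behaviour described above could be sketched as follows. The function name `load_llm_secrets` and the returned dictionary keys are hypothetical; only the secret names `GEMINI_API_KEY` and `HF_TOKEN` come from this README.

```python
import os


def load_llm_secrets(env=os.environ):
    """Hypothetical helper: read the optional secrets and report whether
    any LLM backend can be enabled. Empty strings count as missing."""
    gemini_key = env.get("GEMINI_API_KEY") or None
    hf_token = env.get("HF_TOKEN") or None
    return {
        "gemini_api_key": gemini_key,
        "hf_token": hf_token,
        "llm_enabled": bool(gemini_key or hf_token),
    }


secrets = load_llm_secrets()
if not secrets["llm_enabled"]:
    # Cleaning and visualization do not need a key, so this is not an error.
    print("No API key configured; AI insights are disabled.")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise locally without touching real credentials.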

---

## 🛠️ Deployment Notes
- **Runtime:** Python SDK  
- **SDK:** Streamlit  
- **File formats supported:** `.csv`, `.xlsx`  
- **Maximum file size:** 100 MB  
- **Recommended visibility:** Public (for full file upload support)  
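
The format and size limits listed above could be checked before a file is parsed. The sketch below is illustrative only: `validate_upload` is a hypothetical helper, and it assumes "100 MB" means 100 × 1024 × 1024 bytes, which this README does not specify.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".csv", ".xlsx"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # assumption: 100 MB interpreted as 100 MiB


def validate_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Hypothetical check: return (ok, message) for an uploaded file."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported file type: {ext or '(none)'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return False, "File exceeds the 100 MB limit"
    return True, "ok"


print(validate_upload("global_freelancers_raw.csv", 2_000_000))  # (True, 'ok')
print(validate_upload("notes.txt", 10))
```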

---

## ⚙️ Troubleshooting

### ❌ AxiosError: Request failed with status code 403
If you encounter this:
- Ensure your Space is **Public** (not Private).  
- Ensure `sdk: streamlit` and `app_file:` are correctly declared in the YAML metadata above.  
- Check that your **runtime** is “Python SDK”.  
- Recheck your **Gemini API Key** or token secrets.

### ✅ Fix Checklist
| Issue | Fix |
|-------|------|
| App fails to start | Verify `app_file` matches your actual Python filename |
| 403 Error | Make the Space public |
| API not found | Add key to **Settings → Secrets** |
| File upload broken | Ensure `sdk: streamlit` and `runtime: python` |

---

## 💡 Example Workflow
1. Upload your dataset (e.g., `global_freelancers_raw.csv`).  
2. View the raw preview and cleaned data table.  
3. Generate preprocessing pipelines (e.g., median imputation + one-hot encoding).  
4. Visualize trends with histograms, boxplots, or heatmaps.  
5. (Optional) Ask Gemini for AI insights about correlations, patterns, or recommendations.
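
Step 3 above (median imputation plus one-hot encoding) could look roughly like this with scikit-learn. This is a sketch under assumptions, not the app's actual pipeline: `build_preprocessor` and the sample columns are hypothetical, and scaling is omitted to keep the example short.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder


def build_preprocessor(df: pd.DataFrame) -> ColumnTransformer:
    """Hypothetical helper: median-impute numeric columns, one-hot encode the rest."""
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()
    return ColumnTransformer(
        [
            ("num", SimpleImputer(strategy="median"), numeric_cols),
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ],
        sparse_threshold=0,  # always return a dense NumPy array
    )


# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {
        "hourly_rate": [20.0, None, 40.0, 60.0],
        "country": ["US", "IN", "IN", "US"],
    }
)
features = build_preprocessor(df).fit_transform(df)
print(features.shape)  # (4, 3): one numeric column plus two one-hot columns
print(features[1, 0])  # 40.0, the median of 20, 40 and 60
```

Selecting columns by dtype keeps the sketch independent of any particular dataset; a real app would likely let the user override which columns are treated as categorical.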

---

## 🧩 Tech Stack
- **Frontend:** Streamlit  
- **Backend:** Python (Pandas, NumPy, Scikit-learn)  
- **AI Models:** Gemini 2.0 Flash / open-source LLMs (Qwen, Mistral, etc.)  
- **Visualization:** Matplotlib, Seaborn  

---

## 🧾 License
MIT License © 2025  
You are free to use, modify, and share this app with attribution.

---