metadata
title: Data Analysis App
emoji: 📊
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.39.0
app_file: src/streamlit_app.py
pinned: false
license: mit

📊 Streamlit Data Analysis App (Gemini + Open-Source)

This Streamlit app lets you upload CSV or Excel datasets, automatically clean and preprocess them, create quick visualizations, and optionally get AI-generated insights from Gemini or open-source models.


🚀 Features

✅ Upload .csv or .xlsx datasets
✅ Automatic data cleaning & standardization
✅ Preprocessing pipeline (imputation, encoding, scaling)
✅ Quick visualizations (histogram, boxplot, correlation heatmap, etc.)
✅ Smart dataset summary and preview
✅ Optional Gemini AI insights for dataset interpretation
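
As a rough illustration of the upload and cleaning features, the sketch below reads uploaded bytes as CSV or Excel and applies a simple standardization pass. The function names `load_dataset` and `standardize`, and the specific cleaning rules (snake_case column names, dropping empty and duplicate rows), are assumptions for illustration and not necessarily what `src/streamlit_app.py` does.

```python
import io

import pandas as pd


def load_dataset(filename: str, data: bytes) -> pd.DataFrame:
    """Read uploaded bytes as CSV or Excel, based on the file extension."""
    lower = filename.lower()
    if lower.endswith(".csv"):
        return pd.read_csv(io.BytesIO(data))
    if lower.endswith(".xlsx"):
        # Reading .xlsx files with pandas requires the openpyxl package.
        return pd.read_excel(io.BytesIO(data))
    raise ValueError(f"Unsupported file type: {filename}")


def standardize(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: tidy column names, drop empty and duplicate rows."""
    out = df.copy()
    out.columns = [str(c).strip().lower().replace(" ", "_") for c in out.columns]
    return out.dropna(how="all").drop_duplicates().reset_index(drop=True)


# Small in-memory example with messy headers and one duplicated row.
raw = b" Name ,Hourly Rate\nAda,50\nAda,50\nBob,\n"
clean = standardize(load_dataset("example.csv", raw))
print(clean)
```

In a Streamlit app, the object returned by `st.file_uploader` exposes `.name` and `.getvalue()`, which could be passed to `load_dataset` directly.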


🧠 LLM Integration (Optional)

You can enable AI-generated insights with Gemini 2.0 Flash or your own Hugging Face model.

🔑 To configure:

  1. Go to your Space's Settings → Secrets tab.
  2. Add the following secrets:
       GEMINI_API_KEY = your_gemini_api_key
       HF_TOKEN = your_huggingface_token   (optional)
  3. Save, then Restart your Space.

If you don't add an API key, the app will still work for data cleaning and visualization.
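
Secrets configured on a Space are exposed to the app as environment variables, so one way to implement the optional behavior is to check for them at startup. This is a minimal sketch; the helper name `llm_config` and the returned dictionary keys are hypothetical.

```python
import os


def llm_config(env=None) -> dict:
    """Report which optional LLM back ends have credentials available."""
    env = os.environ if env is None else env
    gemini_key = (env.get("GEMINI_API_KEY") or "").strip()
    hf_token = (env.get("HF_TOKEN") or "").strip()
    return {
        "gemini_enabled": bool(gemini_key),
        "hf_enabled": bool(hf_token),
    }


config = llm_config()
if not (config["gemini_enabled"] or config["hf_enabled"]):
    # Without credentials, only the AI insights step is skipped.
    print("No API key found: AI insights disabled; cleaning and charts still work.")
```

Passing a plain dictionary as `env` makes the check easy to test without touching the real environment.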


🛠️ Deployment Notes

  • Runtime: Python SDK
  • SDK: Streamlit
  • File formats supported: .csv, .xlsx
  • Maximum file size: 100 MB
  • Recommended visibility: Public (for full file upload support)
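
The format and size limits above could be enforced before parsing with a check along these lines. The helper `validate_upload` is hypothetical, and treating 100 MB as 100 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".csv", ".xlsx"}
MAX_UPLOAD_BYTES = 100 * 1024 * 1024  # assumed interpretation of "100 MB"


def validate_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file is acceptable."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append(f"Unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("File is larger than 100 MB")
    return problems


print(validate_upload("global_freelancers_raw.csv", 2_000_000))
```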

⚙️ Troubleshooting

❌ AxiosError: Request failed with status code 403

If you encounter this:

  • Ensure your Space is Public (not Private).
  • Ensure sdk: streamlit and app_file: are correctly declared in the YAML metadata above.
  • Check that your runtime is "Python SDK".
  • Recheck your Gemini API Key or token secrets.

✅ Fix Checklist

| Issue | Fix |
|---|---|
| App fails to start | Verify app_file matches your actual Python filename |
| 403 Error | Make the Space public |
| API not found | Add key to Settings → Secrets |
| File upload broken | Ensure sdk: streamlit and runtime: python |

💡 Example Workflow

  1. Upload your dataset (e.g., global_freelancers_raw.csv).
  2. View the raw preview and cleaned data table.
  3. Generate preprocessing pipelines (e.g., median imputation + one-hot encoding).
  4. Visualize trends with histograms, boxplots, or heatmaps.
  5. (Optional) Ask Gemini for AI insights about correlations, patterns, or recommendations.
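
Step 3 of the workflow (median imputation plus one-hot encoding) could be expressed with scikit-learn roughly as follows. This is a sketch under assumptions: the function name `build_preprocessor`, the choice of most-frequent imputation for categorical columns, and the example column names are illustrative and not taken from the app.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(df: pd.DataFrame) -> ColumnTransformer:
    """Build an imputation + encoding + scaling pipeline from column dtypes."""
    numeric_cols = df.select_dtypes(include="number").columns.tolist()
    categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

    # Numeric columns: fill gaps with the median, then standardize.
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical columns: fill gaps with the mode, then one-hot encode.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])


# Tiny illustrative dataset with one missing value per column.
df = pd.DataFrame({
    "hourly_rate": [50.0, None, 70.0],
    "country": ["US", "DE", float("nan")],
})
features = build_preprocessor(df).fit_transform(df)
print(features.shape)
```

Selecting columns by dtype keeps the pipeline independent of any particular dataset, at the cost of misclassifying numeric-looking identifiers as numeric features.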

🧩 Tech Stack

  • Frontend: Streamlit
  • Backend: Python (Pandas, NumPy, Scikit-learn)
  • AI Models: Gemini 2.0 Flash / open-source LLMs (Qwen, Mistral, etc.)
  • Visualization: Matplotlib, Seaborn

🧾 License

MIT License © 2025
You are free to use, modify, and share this app with attribution.