🧭 Cloudzy AI - Cloud Photo Management Service

A FastAPI-based cloud photo management service with AI tagging, captioning, and semantic search using FAISS.

🎯 Features

Photo Upload - Upload images with automatic metadata generation
AI Analysis - Automatic tag and caption generation
Semantic Search - FAISS-powered similarity search on embeddings
Image-to-Image Search - Find similar photos to a reference image
RESTful API - Full REST API with automatic documentation
Docker Support - Production-ready Docker and Docker Compose setup

🛠️ Tech Stack

Backend: FastAPI
Database: SQLite (PostgreSQL ready)
Search Engine: FAISS (Facebook AI Similarity Search, a library for nearest-neighbor search over dense vectors)
Image Processing: Pillow
ORM: SQLModel
API Documentation: Swagger/OpenAPI

📋 Prerequisites

Python 3.10+
Docker & Docker Compose (optional)
2GB+ RAM for FAISS index

⚙️ Installation

Local Development

Clone and setup

cd image_embedder
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies

pip install -r requirements.txt

Create uploads directory

mkdir -p uploads

Run the server

uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

The server will be available at http://localhost:8000

Docker

# Build and run
docker compose up --build

# Run in background
docker compose up -d

# View logs
docker compose logs -f cloudzy_api

# Stop
docker compose down

🚀 API Endpoints

Upload Photo

POST /upload
Content-Type: multipart/form-data

# Returns:
{
  "id": 1,
  "filename": "photo_20231023_120000.jpg",
  "tags": ["nature", "landscape", "mountain"],
  "caption": "A beautiful nature photograph",
  "message": "Photo uploaded successfully with ID 1"
}
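
The example filename above suggests a timestamp-based naming scheme. The sketch below shows one way such names could be built; `build_stored_filename` is a hypothetical helper, and the `photo_` prefix and `.jpg` fallback are assumptions drawn only from the sample response, not from the service code.

```python
from datetime import datetime
from pathlib import Path


def build_stored_filename(original_name: str, now: datetime) -> str:
    """Build a name like photo_20231023_120000.jpg (hypothetical helper).

    The original extension is kept (lowercased); ".jpg" is an assumed
    fallback when the uploaded name has no extension.
    """
    suffix = Path(original_name).suffix.lower() or ".jpg"
    return f"photo_{now:%Y%m%d_%H%M%S}{suffix}"


print(build_stored_filename("Holiday.JPG", datetime(2023, 10, 23, 12, 0, 0)))
```

Names built this way collide if two uploads arrive within the same second, so a real implementation would probably add a counter or random suffix.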

Get Photo Metadata

GET /photo/{id}

# Returns:
{
  "id": 1,
  "filename": "photo_20231023_120000.jpg",
  "tags": ["nature", "landscape"],
  "caption": "A beautiful landscape",
  "embedding": [0.123, -0.456, ...],  # 512-dim vector
  "created_at": "2023-10-23T12:00:00"
}

List All Photos

GET /photos?skip=0&limit=10

# Returns: List of photo objects with pagination
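
The `skip` and `limit` parameters behave like an offset and a page size. A minimal sketch of that behaviour on an in-memory list follows; `paginate` is a hypothetical name, and the actual endpoint most likely applies the same idea in the database query rather than in Python.

```python
def paginate(items, skip=0, limit=10):
    """Return at most `limit` items, starting after the first `skip` items."""
    if skip < 0:
        raise ValueError("skip must be >= 0")
    if limit < 1:
        raise ValueError("limit must be >= 1")
    return items[skip:skip + limit]


photo_ids = list(range(1, 26))           # pretend there are 25 photos
print(paginate(photo_ids, skip=0))       # first page
print(paginate(photo_ids, skip=20))      # last, partial page
```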

Semantic Search

GET /search?q=mountain&top_k=5

# Returns:
{
  "query": "mountain",
  "results": [
    {
      "photo_id": 1,
      "filename": "photo_1.jpg",
      "tags": ["nature", "mountain"],
      "caption": "Mountain landscape",
      "distance": 0.123
    },
    ...
  ],
  "total_results": 5
}
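
To make the `distance` field concrete, here is a pure-Python, brute-force stand-in for what a flat L2 index computes. It is not FAISS and not the service's code: `nearest_photos` and the `embeddings` dict are hypothetical, the query is assumed to be already embedded as a vector, and squared L2 distance is used because that is what a FAISS flat L2 index reports.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_photos(query_vec, embeddings, top_k=5):
    """Return the top_k closest photos, smallest distance first.

    `embeddings` maps photo_id -> embedding vector (hypothetical layout).
    """
    scored = ((squared_l2(query_vec, vec), pid) for pid, vec in embeddings.items())
    return [
        {"photo_id": pid, "distance": dist}
        for dist, pid in heapq.nsmallest(top_k, scored)
    ]


embeddings = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [3.0, 4.0]}
print(nearest_photos([0.0, 0.0], embeddings, top_k=2))
```

Image-to-image search can be thought of the same way: use the reference photo's stored embedding as `query_vec` and drop the reference ID from the results.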

Image-to-Image Search

POST /search/image-to-image?reference_photo_id=1&top_k=5

# Returns similar photos to reference photo 1

Health Check

GET /health

# Returns service status and FAISS index stats

📚 API Documentation

Interactive Docs (Swagger UI):

http://localhost:8000/docs

Alternative Docs (ReDoc):

http://localhost:8000/redoc

🗂️ Project Structure

image_embedder/
├── app/
│   ├── __init__.py
│   ├── main.py                  # FastAPI app entry point
│   ├── database.py              # SQLModel engine + session
│   ├── models.py                # Photo database model
│   ├── schemas.py               # Pydantic response models
│   ├── ai_utils.py              # AI generation (tags, captions, embeddings)
│   ├── search_engine.py         # FAISS index manager
│   │
│   ├── routes/
│   │   ├── __init__.py
│   │   ├── upload.py            # POST /upload endpoint
│   │   ├── photo.py             # GET /photo/{id} and GET /photos endpoints
│   │   └── search.py            # GET /search and POST /search/image-to-image endpoints
│   │
│   └── utils/
│       ├── __init__.py
│       └── file_utils.py        # File saving and management
│
├── uploads/                     # Stored images (created at runtime)
├── faiss_index.bin              # FAISS index file (created at runtime)
├── photos.db                    # SQLite database (created at runtime)
│
├── requirements.txt             # Python dependencies
├── Dockerfile
├── docker-compose.yml
└── README.md

🔄 Development Workflow

Test Upload

# Use curl
curl -X POST -F "file=@/path/to/image.jpg" http://localhost:8000/upload

# Or use Python
import requests
with open("image.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/upload",
        files={"file": f}
    )
    print(response.json())

Test Search

# Query-based search
curl "http://localhost:8000/search?q=tree&top_k=5"

# Image-to-image search
curl -X POST "http://localhost:8000/search/image-to-image?reference_photo_id=1&top_k=5"
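
The query-based search can also be called from Python. The sketch below uses only the standard library; `search_url` and `search` are hypothetical helper names, and the base URL assumes the server is running locally on port 8000 as described above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumes a local dev server


def search_url(query: str, top_k: int = 5, base_url: str = BASE_URL) -> str:
    """Build the GET /search URL, percent-encoding the query string."""
    return f"{base_url}/search?{urlencode({'q': query, 'top_k': top_k})}"


def search(query: str, top_k: int = 5) -> dict:
    """Call the search endpoint and decode the JSON response."""
    with urlopen(search_url(query, top_k)) as resp:
        return json.load(resp)


# Building the URL needs no running server:
print(search_url("snowy mountain", top_k=3))
```

Encoding the query with `urlencode` matters once it contains spaces or other reserved characters, which the plain curl example sidesteps by using a single word.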

View Database

# Install sqlite3 CLI and view database
sqlite3 photos.db
> .tables
> SELECT * FROM photo;
> .quit
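
If the `sqlite3` CLI is not available, the same inspection can be done with Python's built-in `sqlite3` module. This is a minimal sketch: `summarize_database` is a hypothetical helper, and the `photo` table name is taken from the query above.

```python
import sqlite3
from pathlib import Path


def summarize_database(db_path: str = "photos.db", table: str = "photo") -> dict:
    """List the tables in the database and count rows in `table`."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        if table not in tables:
            return {"tables": tables, "row_count": None}
        # Interpolation is safe here: `table` was just matched against sqlite_master.
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        return {"tables": tables, "row_count": row_count}
    finally:
        conn.close()


if __name__ == "__main__":
    # Only open the file if it exists; sqlite3.connect would otherwise create it.
    if Path("photos.db").exists():
        print(summarize_database())
    else:
        print("photos.db not found")
```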

🧠 AI Features (Placeholder Phase)

Currently, AI functions use placeholder implementations:

Tags: Generated from filename patterns + random selection from common tags
Captions: Template-based generation from tags
Embeddings: Deterministic random vectors (reproducible from filename)
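
The three placeholders above could look roughly like the following. This is an illustrative sketch, not the contents of `ai_utils.py`: the function names, the `COMMON_TAGS` list, the caption template, and the choice of SHA-256 for seeding are all assumptions.

```python
import hashlib
import math
import random

# Hypothetical tag vocabulary; the real list is not shown in this README.
COMMON_TAGS = ["nature", "landscape", "portrait", "city", "animal", "food"]


def _rng_for(filename: str) -> random.Random:
    """Seed a private RNG from the filename so results are reproducible."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def placeholder_tags(filename: str, n_random: int = 2) -> list:
    """Tags found in the filename, plus a few seeded random common tags."""
    stem = filename.rsplit(".", 1)[0].lower()
    matched = [tag for tag in COMMON_TAGS if tag in stem]
    remaining = [tag for tag in COMMON_TAGS if tag not in matched]
    extra = _rng_for(filename).sample(remaining, min(n_random, len(remaining)))
    return matched + extra


def placeholder_caption(tags: list) -> str:
    """Fill a fixed template with the tags (assumed template)."""
    return f"A photo featuring {', '.join(tags)}"


def placeholder_embedding(filename: str, dim: int = 512) -> list:
    """Unit-length pseudo-random vector that depends only on the filename."""
    rng = _rng_for(filename)
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


tags = placeholder_tags("nature_trip.jpg")
print(tags, "|", placeholder_caption(tags))
print(len(placeholder_embedding("nature_trip.jpg")))
```

Because such embeddings depend only on the filename, search results in this phase reflect filenames rather than image content, which is why the upgrade path below matters.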

Upgrade Path (Production)

CLIP Integration (Recommended)
- Zero-shot image understanding
- Excellent for tagging and search
- ~1-2 sec per image on GPU

BLIP Integration (Alternative)
- Visual question answering
- Better captions
- ~2-3 sec per image on GPU

Fine-tuned Models
- Train on domain-specific data
- Improved accuracy
- Higher latency/complexity

📊 Performance Considerations

FAISS Index: Supports millions of embeddings
Database: SQLite suitable for 100k+ photos; PostgreSQL for larger scale
Embeddings: 512-dim vectors (adjustable)
Search: <100ms for 100k+ embeddings on CPU
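
A quick back-of-the-envelope calculation shows how embedding count translates into memory. It assumes an uncompressed flat index storing one 4-byte float per dimension (FAISS works with float32 vectors) and ignores any per-index overhead; `flat_index_bytes` is a hypothetical helper.

```python
def flat_index_bytes(n_vectors: int, dim: int = 512, bytes_per_float: int = 4) -> int:
    """Raw storage for n_vectors float32 embeddings in a flat index."""
    return n_vectors * dim * bytes_per_float


for n in (100_000, 1_000_000):
    mib = flat_index_bytes(n) / 2**20
    print(f"{n:>9,} vectors: {mib:,.0f} MiB")
```

Under these assumptions, 100k embeddings take about 195 MiB and one million take about 1.9 GiB, which is in line with the 2GB+ RAM prerequisite.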

🚨 Troubleshooting

FAISS Installation Issues

# If faiss-cpu fails, try:
pip install faiss-cpu==1.7.4 --no-cache-dir

SQLite Lock Error

# Restart the application first; this usually releases the lock.
# Last resort: delete the database. This erases all stored photo metadata.
rm photos.db

Docker Build Issues

# Rebuild without cache
docker compose build --no-cache

🔐 Security Notes

⚠️ Currently no authentication - add for production
⚠️ CORS allows all origins - restrict for production
⚠️ File upload validation needed - add size limits
⚠️ Use PostgreSQL + proper secrets management for production
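
As a starting point for the upload validation mentioned above, the sketch below checks filename, extension, and size before a file is saved. Everything here is an assumption for illustration: `validate_upload`, the 10 MiB limit, and the allowed extension set are not taken from the service code.

```python
from pathlib import Path

MAX_UPLOAD_BYTES = 10 * 1024 * 1024                       # assumed limit: 10 MiB
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}   # assumed allow-list


def validate_upload(filename: str, size_bytes: int) -> list:
    """Return a list of problems; an empty list means the upload is acceptable."""
    problems = []
    # Reject names such as "../x.jpg" that carry directory components.
    if not filename or Path(filename).name != filename:
        problems.append("filename must be a plain file name")
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file extension")
    if size_bytes <= 0:
        problems.append("file is empty")
    elif size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the size limit")
    return problems


print(validate_upload("cat.jpg", 2048))
print(validate_upload("../cat.exe", 50 * 1024 * 1024))
```

An extension check alone does not prove the content is an image; since Pillow is already in the stack, opening the file with it would be a natural additional check.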

📝 Next Steps

✅ Core backend working
⬜ Add authentication (JWT)
⬜ Implement real AI models (CLIP/BLIP)
⬜ Add background job processing (Celery)
⬜ Frontend dashboard
⬜ Production deployment (Railway/AWS)

📄 License

MIT License

🤝 Contributing

Contributions welcome! Please test thoroughly before submitting.

Questions? Check the interactive docs at /docs or review the code comments.