Cloudzy AI - Cloud Photo Management Service
A FastAPI-based cloud photo management service with AI tagging, captioning, and semantic search using FAISS.
Features
- Photo Upload - Upload images with automatic metadata generation
- AI Analysis - Automatic tag and caption generation
- Semantic Search - FAISS-powered similarity search on embeddings
- Image-to-Image Search - Find similar photos to a reference image
- RESTful API - Full REST API with automatic documentation
- Docker Support - Production-ready Docker and Docker Compose setup
Tech Stack
- Backend: FastAPI
- Database: SQLModel + SQLite (PostgreSQL ready)
- Search Engine: FAISS (Facebook AI Similarity Search)
- Image Processing: Pillow
- ORM: SQLModel
- API Documentation: Swagger/OpenAPI
Prerequisites
- Python 3.10+
- Docker & Docker Compose (optional)
- 2GB+ RAM for FAISS index
Installation
Local Development
- Clone the repository and set up a virtual environment
cd image_embedder
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies
pip install -r requirements.txt
- Create uploads directory
mkdir -p uploads
- Run the server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
The server starts at http://localhost:8000
Docker
# Build and run
docker compose up --build
# Run in background
docker compose up -d
# View logs
docker compose logs -f cloudzy_api
# Stop
docker compose down
API Endpoints
Upload Photo
POST /upload
Content-Type: multipart/form-data
# Returns:
{
"id": 1,
"filename": "photo_20231023_120000.jpg",
"tags": ["nature", "landscape", "mountain"],
"caption": "A beautiful nature photograph",
"message": "Photo uploaded successfully with ID 1"
}
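The `filename` in the response above follows a timestamp pattern. A minimal sketch of how such a name could be generated, assuming the format is `photo_<YYYYMMDD>_<HHMMSS>` plus the original extension. The helper name `make_stored_filename` is hypothetical and not taken from the project code:

```python
from datetime import datetime
from pathlib import Path


def make_stored_filename(original_name: str, now: datetime) -> str:
    """Build a timestamped name like photo_20231023_120000.jpg (assumed pattern)."""
    # Keep only the extension of the uploaded file, normalized to lowercase.
    extension = Path(original_name).suffix.lower()
    return f"photo_{now:%Y%m%d_%H%M%S}{extension}"


print(make_stored_filename("Holiday.JPG", datetime(2023, 10, 23, 12, 0, 0)))
```

Passing the clock value in as an argument keeps the function deterministic, which makes it easy to test.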
Get Photo Metadata
GET /photo/{id}
# Returns:
{
"id": 1,
"filename": "photo_20231023_120000.jpg",
"tags": ["nature", "landscape"],
"caption": "A beautiful landscape",
"embedding": [0.123, -0.456, ...], # 512-dim vector
"created_at": "2023-10-23T12:00:00"
}
List All Photos
GET /photos?skip=0&limit=10
# Returns: List of photo objects with pagination
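The `skip` and `limit` parameters presumably act as an offset and a page size. A small sketch of that behavior on an in-memory list, assuming this common convention; in the real service the equivalent would more likely be done in the database query, and the function name `paginate` is hypothetical:

```python
def paginate(items: list, skip: int = 0, limit: int = 10) -> list:
    """Return the page of items selected by an offset (skip) and a page size (limit)."""
    if skip < 0 or limit < 0:
        raise ValueError("skip and limit must be non-negative")
    return items[skip:skip + limit]


photo_ids = list(range(1, 26))  # 25 pretend photo IDs
print(paginate(photo_ids, skip=0, limit=10))   # first page
print(paginate(photo_ids, skip=20, limit=10))  # last, partial page
```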
Semantic Search
GET /search?q=mountain&top_k=5
# Returns:
{
"query": "mountain",
"results": [
{
"photo_id": 1,
"filename": "photo_1.jpg",
"tags": ["nature", "mountain"],
"caption": "Mountain landscape",
"distance": 0.123
},
...
],
"total_results": 5
}
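The `distance` field in the results suggests that smaller values mean closer matches. The sketch below shows the idea with an exact brute-force search in plain Python, assuming squared L2 distance (which is what FAISS's `IndexFlatL2` reports). Which index type the project uses, and how the text query is turned into a vector, are not stated here, so the function names and the toy data are illustrative only:

```python
import heapq


def squared_l2(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(index: dict[int, list[float]], query: list[float], top_k: int = 5) -> list[dict]:
    """Return the top_k closest photo IDs with their distances, nearest first."""
    scored = ((squared_l2(vec, query), photo_id) for photo_id, vec in index.items())
    nearest = heapq.nsmallest(top_k, scored)
    return [{"photo_id": pid, "distance": dist} for dist, pid in nearest]


# Toy 3-dimensional embeddings; the service uses 512 dimensions.
index = {1: [0.0, 0.0, 0.0], 2: [1.0, 0.0, 0.0], 3: [0.0, 2.0, 0.0]}
print(search(index, query=[0.9, 0.0, 0.0], top_k=2))
```

A brute-force scan like this is linear in the number of photos; FAISS performs the same kind of comparison with optimized native code and optional approximate indexes.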
Image-to-Image Search
POST /search/image-to-image?reference_photo_id=1&top_k=5
# Returns photos similar to reference photo 1
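Image-to-image search can reuse the embedding already stored for the reference photo instead of computing a new one. A minimal sketch under that assumption, which also assumes the reference photo itself is excluded from the results (the source does not say whether it is). The name `find_similar` is hypothetical:

```python
def find_similar(embeddings: dict[int, list[float]], reference_photo_id: int, top_k: int = 5) -> list[int]:
    """Return IDs of the top_k photos closest to the reference, excluding the reference."""
    if reference_photo_id not in embeddings:
        raise KeyError(f"photo {reference_photo_id} not found")
    reference = embeddings[reference_photo_id]

    def distance(photo_id: int) -> float:
        # Squared L2 distance to the reference embedding.
        vec = embeddings[photo_id]
        return sum((x - y) ** 2 for x, y in zip(vec, reference))

    candidates = [pid for pid in embeddings if pid != reference_photo_id]
    return sorted(candidates, key=distance)[:top_k]


embeddings = {1: [0.0, 0.0], 2: [0.1, 0.0], 3: [5.0, 5.0], 4: [0.0, 0.3]}
print(find_similar(embeddings, reference_photo_id=1, top_k=2))
```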
Health Check
GET /health
# Returns service status and FAISS index stats
API Documentation
Interactive Docs (Swagger UI):
http://localhost:8000/docs
Alternative Docs (ReDoc):
http://localhost:8000/redoc
Project Structure
image_embedder/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app entry point
│   ├── database.py          # SQLModel engine + session
│   ├── models.py            # Photo database model
│   ├── schemas.py           # Pydantic response models
│   ├── ai_utils.py          # AI generation (tags, captions, embeddings)
│   ├── search_engine.py     # FAISS index manager
│   │
│   ├── routes/
│   │   ├── __init__.py
│   │   ├── upload.py        # POST /upload endpoint
│   │   ├── photo.py         # GET /photo/:id and /photos endpoints
│   │   └── search.py        # GET /search and image-to-image endpoints
│   │
│   └── utils/
│       ├── __init__.py
│       └── file_utils.py    # File saving and management
│
├── uploads/                 # Stored images (created at runtime)
├── faiss_index.bin          # FAISS index file (created at runtime)
├── photos.db                # SQLite database (created at runtime)
│
├── requirements.txt         # Python dependencies
├── Dockerfile
├── docker-compose.yml
└── README.md
Development Workflow
Test Upload
# Use curl
curl -X POST -F "file=@/path/to/image.jpg" http://localhost:8000/upload
# Or use Python
import requests

with open("image.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/upload",
        files={"file": f},
    )
print(response.json())
Test Search
# Query-based search
curl "http://localhost:8000/search?q=tree&top_k=5"
# Image-to-image search
curl -X POST "http://localhost:8000/search/image-to-image?reference_photo_id=1&top_k=5"
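Queries containing spaces or special characters need URL encoding before they are sent. A small sketch of building the search URL from Python with the standard library; `BASE_URL` and the helper name `search_url` are assumptions for illustration:

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # assumed local development address


def search_url(query: str, top_k: int = 5) -> str:
    """Build a /search URL with the query string safely encoded."""
    return f"{BASE_URL}/search?{urlencode({'q': query, 'top_k': top_k})}"


print(search_url("snowy mountain"))
```

The resulting URL can be passed to `curl` or to any HTTP client.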
View Database
# Install sqlite3 CLI and view database
sqlite3 photos.db
> .tables
> SELECT * FROM photo;
> .quit
AI Features (Placeholder Phase)
Currently, AI functions use placeholder implementations:
- Tags: Generated from filename patterns + random selection from common tags
- Captions: Template-based generation from tags
- Embeddings: Deterministic random vectors (reproducible from filename)
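The three placeholders described above can be sketched as follows. This is an illustration under stated assumptions, not the project's `ai_utils.py`: the tag list, the caption template, and all function names are hypothetical, and the seed is derived from a SHA-256 hash of the filename so that vectors are reproducible across runs.

```python
import hashlib
import random

COMMON_TAGS = ["nature", "landscape", "mountain", "city", "portrait", "food"]  # hypothetical list


def placeholder_embedding(filename: str, dim: int = 512) -> list[float]:
    """Deterministic pseudo-random vector derived from the filename."""
    # Hash the filename so the seed is stable across runs and processes
    # (Python's built-in hash() is salted per process, so it is avoided here).
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def placeholder_tags(filename: str, count: int = 3) -> list[str]:
    """Tags matched in the filename first, then seeded random picks to fill up."""
    lowered = filename.lower()
    matched = [tag for tag in COMMON_TAGS if tag in lowered]
    remaining = [tag for tag in COMMON_TAGS if tag not in matched]
    random.Random(filename).shuffle(remaining)  # string seeds are deterministic
    return (matched + remaining)[:count]


def placeholder_caption(tags: list[str]) -> str:
    """Template-based caption built from the first tag."""
    return f"A beautiful {tags[0]} photograph" if tags else "A photograph"


tags = placeholder_tags("mountain_trip.jpg")
print(tags, placeholder_caption(tags), len(placeholder_embedding("mountain_trip.jpg")))
```

Because these vectors depend only on the filename, search results in this phase carry no visual meaning; they only exercise the indexing and API plumbing.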
Upgrade Path (Production)
CLIP Integration (Recommended)
- Zero-shot image understanding
- Excellent for tagging and search
- ~1-2 sec per image on GPU
BLIP Integration (Alternative)
- Visual question answering
- Better captions
- ~2-3 sec per image on GPU
Fine-tuned Models
- Train on domain-specific data
- Improved accuracy
- Higher latency/complexity
Performance Considerations
- FAISS Index: Supports millions of embeddings
- Database: SQLite suitable for 100k+ photos; PostgreSQL for larger scale
- Embeddings: 512-dim vectors (adjustable)
- Search: <100ms for 100k+ embeddings on CPU
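A back-of-the-envelope estimate helps relate these figures to the RAM prerequisite. The sketch assumes an uncompressed (flat) index storing each dimension as a 4-byte float32 and ignores any per-index overhead; the function name is hypothetical:

```python
def flat_index_bytes(num_vectors: int, dim: int = 512, bytes_per_value: int = 4) -> int:
    """Raw storage for an uncompressed index of float32 vectors."""
    return num_vectors * dim * bytes_per_value


for count in (100_000, 1_000_000):
    size_mib = flat_index_bytes(count) / (1024 ** 2)
    print(f"{count:>9,} vectors: {size_mib:,.0f} MiB")
```

Under these assumptions each vector takes 2,048 bytes, so 100k photos need about 195 MiB and one million photos need just under 2 GiB for the vectors alone.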
Troubleshooting
FAISS Installation Issues
# If faiss-cpu fails, try:
pip install faiss-cpu==1.7.4 --no-cache-dir
SQLite Lock Error
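# Restart the application first; this usually releases the lock.
# Last resort: remove the database file. WARNING: this deletes all stored photo metadata.
rm photos.db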
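A less destructive way to reduce lock errors is to let connections wait longer for a lock and to enable write-ahead logging. The sketch below uses the standard-library `sqlite3` module directly; the project itself goes through SQLModel, so this only illustrates the two settings, and `open_database` is a hypothetical helper:

```python
import sqlite3


def open_database(path: str, busy_timeout_seconds: float = 30.0) -> sqlite3.Connection:
    """Open SQLite with a longer lock wait and write-ahead logging enabled."""
    # timeout: how long a connection waits for a lock before raising an error.
    conn = sqlite3.connect(path, timeout=busy_timeout_seconds)
    # WAL mode lets readers proceed while a single writer is active.
    conn.execute("PRAGMA journal_mode=WAL")
    return conn
```

For example, `conn = open_database("photos.db")` opens the existing database without discarding any data.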
Docker Build Issues
# Rebuild without cache
docker compose build --no-cache
Security Notes
- ⚠️ Currently no authentication - add for production
- ⚠️ CORS allows all origins - restrict for production
- ⚠️ File upload validation needed - add size limits
- ⚠️ Use PostgreSQL + proper secrets management for production
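As one illustration of the upload validation mentioned above, the following sketch checks the filename, extension, and size before a file is accepted. The allow-list, the 10 MiB limit, and the function name are assumptions chosen for the example, not values from the project:

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}  # assumed allow-list
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MiB limit


def validate_upload(filename: str, content: bytes) -> None:
    """Raise ValueError if the upload has a bad name, type, or size."""
    # Reject empty names and names carrying path components (e.g. "../x.jpg").
    if not filename or filename in (".", "..") or "/" in filename or "\\" in filename:
        raise ValueError("invalid filename")
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("unsupported file type")
    if len(content) == 0:
        raise ValueError("empty file")
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")


validate_upload("photo.jpg", b"\xff\xd8\xff")  # passes silently
```

An extension check alone does not prove the bytes are a real image; opening the file with Pillow before storing it would be a stronger test.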
Next Steps
- ✅ Core backend working
- ⬜ Add authentication (JWT)
- ⬜ Implement real AI models (CLIP/BLIP)
- ⬜ Add background job processing (Celery)
- ⬜ Frontend dashboard
- ⬜ Production deployment (Railway/AWS)
License
MIT License
Contributing
Contributions welcome! Please test thoroughly before submitting.
Questions? Check the interactive docs at /docs or review the code comments.