Spaces: userx2000/cloudzy_ai_challenge

matinsn2000 committed on Oct 23

Commit

c6706bd

1 Parent(s): a7942ef

boilerplate for api gateways

Files changed (17)

.gitignore +53 -0
README.md +303 -8
app.py +92 -4
cloudzy/__init__.py +1 -0
cloudzy/ai_utils.py +72 -0
cloudzy/database.py +26 -0
cloudzy/models.py +40 -0
cloudzy/routes/__init__.py +1 -0
cloudzy/routes/photo.py +69 -0
cloudzy/routes/search.py +125 -0
cloudzy/routes/upload.py +90 -0
cloudzy/schemas.py +49 -0
cloudzy/search_engine.py +85 -0
cloudzy/utils/__init__.py +1 -0
cloudzy/utils/file_utils.py +59 -0
requirements copy.txt +11 -0
requirements.txt +11 -2

.gitignore ADDED

	@@ -0,0 +1,53 @@

+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+env/
+venv/
+ENV/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+# IDEs
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+.DS_Store
+# Project specific
+uploads/
+photos.db
+faiss_index.bin
+*.db
+*.db-journal
+.env
+.env.local
+# Logs
+*.log
+logs/
+# Testing
+.pytest_cache/
+.coverage
+htmlcov/
+# Docker
+.dockerignore

README.md CHANGED

@@ -1,10 +1,305 @@
----
-title: Cloudzy Ai Challenge
-emoji: 🦀
-colorFrom: red
-colorTo: gray
-sdk: docker
-pinned: false
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# 🧭 Cloudzy AI - Cloud Photo Management Service
+A FastAPI-based cloud photo management service with AI tagging, captioning, and semantic search using FAISS.
+## 🎯 Features
+- **Photo Upload** - Upload images with automatic metadata generation
+- **AI Analysis** - Automatic tag and caption generation
+- **Semantic Search** - FAISS-powered similarity search on embeddings
+- **Image-to-Image Search** - Find similar photos to a reference image
+- **RESTful API** - Full REST API with automatic documentation
+- **Docker Support** - Production-ready Docker and Docker Compose setup
+## 🛠️ Tech Stack
+- **Backend**: FastAPI
+- **Database**: SQLModel + SQLite (PostgreSQL ready)
+- **Search Engine**: FAISS (Facebook AI Similarity Search)
+- **Image Processing**: Pillow
+- **ORM**: SQLModel
+- **API Documentation**: Swagger/OpenAPI
+## 📋 Prerequisites
+- Python 3.10+
+- Docker & Docker Compose (optional)
+- 2GB+ RAM for FAISS index
+## ⚙️ Installation
+### Local Development
+1. **Clone and setup**
+```bash
+cd image_embedder
+python -m venv venv
+source venv/bin/activate  # On Windows: venv\Scripts\activate
+```
+2. **Install dependencies**
+```bash
+pip install -r requirements.txt
+```
+3. **Create uploads directory**
+```bash
+mkdir -p uploads
+```
+4. **Run the server**
+```bash
+uvicorn app:app --reload --host 0.0.0.0 --port 8000
+```
+Server will start at `http://localhost:8000`
+### Docker
+```bash
+# Build and run
+docker compose up --build
+# Run in background
+docker compose up -d
+# View logs
+docker compose logs -f cloudzy_api
+# Stop
+docker compose down
+```
+## 🚀 API Endpoints
+### Upload Photo
+```bash
+POST /upload
+Content-Type: multipart/form-data
+# Returns:
+{
+  "id": 1,
+  "filename": "photo_20231023_120000.jpg",
+  "tags": ["nature", "landscape", "mountain"],
+  "caption": "A beautiful nature photograph",
+  "message": "Photo uploaded successfully with ID 1"
+}
+```
+### Get Photo Metadata
+```bash
+GET /photo/{id}
+# Returns:
+{
+  "id": 1,
+  "filename": "photo_20231023_120000.jpg",
+  "tags": ["nature", "landscape"],
+  "caption": "A beautiful landscape",
+  "embedding": [0.123, -0.456, ...],  # 512-dim vector
+  "created_at": "2023-10-23T12:00:00"
+}
+```
+### List All Photos
+```bash
+GET /photos?skip=0&limit=10
+# Returns: List of photo objects with pagination
+```
+### Semantic Search
+```bash
+GET /search?q=mountain&top_k=5
+# Returns:
+{
+  "query": "mountain",
+  "results": [
+    {
+      "photo_id": 1,
+      "filename": "photo_1.jpg",
+      "tags": ["nature", "mountain"],
+      "caption": "Mountain landscape",
+      "distance": 0.123
+    },
+    ...
+  ],
+  "total_results": 5
+}
+```
+### Image-to-Image Search
+```bash
+POST /search/image-to-image?reference_photo_id=1&top_k=5
+# Returns similar photos to reference photo 1
+```
+### Health Check
+```bash
+GET /health
+# Returns service status and FAISS index stats
+```
+## 📚 API Documentation
+**Interactive Docs (Swagger UI)**:
+```
+http://localhost:8000/docs
+```
+**Alternative Docs (ReDoc)**:
+```
+http://localhost:8000/redoc
+```
+## 🗂️ Project Structure
+```
+image_embedder/
+├── app.py                       # FastAPI app entry point
+├── cloudzy/
+│   ├── __init__.py
+│   ├── database.py              # SQLModel engine + session
+│   ├── models.py                # Photo database model
+│   ├── schemas.py               # Pydantic response models
+│   ├── ai_utils.py              # AI generation (tags, captions, embeddings)
+│   ├── search_engine.py         # FAISS index manager
+│   │
+│   ├── routes/
+│   │   ├── __init__.py
+│   │   ├── upload.py            # POST /upload endpoint
+│   │   ├── photo.py             # GET /photo/:id and /photos endpoints
+│   │   └── search.py            # GET /search and image-to-image endpoints
+│   │
+│   └── utils/
+│       ├── __init__.py
+│       └── file_utils.py        # File saving and management
+│
+├── uploads/                     # Stored images (created at runtime)
+├── faiss_index.bin              # FAISS index file (created at runtime)
+├── photos.db                    # SQLite database (created at runtime)
+│
+├── requirements.txt             # Python dependencies
+├── Dockerfile
+├── docker-compose.yml
+└── README.md
+```
+## 🔄 Development Workflow
+### Test Upload
+```bash
+# Use curl
+curl -X POST -F "file=@/path/to/image.jpg" http://localhost:8000/upload
+# Or use Python
+python - <<'EOF'
+import requests
+with open("image.jpg", "rb") as f:
+    response = requests.post(
+        "http://localhost:8000/upload",
+        files={"file": f}
+    )
+    print(response.json())
+EOF
+```
+### Test Search
+```bash
+# Query-based search
+curl "http://localhost:8000/search?q=tree&top_k=5"
+# Image-to-image search
+curl -X POST "http://localhost:8000/search/image-to-image?reference_photo_id=1&top_k=5"
+```
+### View Database
+```bash
+# Install sqlite3 CLI and view database
+sqlite3 photos.db
+> .tables
+> SELECT * FROM photo;
+> .quit
+```
+## 🧠 AI Features (Placeholder Phase)
+Currently, AI functions use placeholder implementations:
+- **Tags**: Generated from filename patterns + random selection from common tags
+- **Captions**: Template-based generation from tags
+- **Embeddings**: Deterministic random vectors (reproducible from filename)
+### Upgrade Path (Production)
+1. **CLIP Integration** (Recommended)
+   - Zero-shot image understanding
+   - Excellent for tagging and search
+   - ~1-2 sec per image on GPU
+2. **BLIP Integration** (Alternative)
+   - Visual question answering
+   - Better captions
+   - ~2-3 sec per image on GPU
+3. **Fine-tuned Models**
+   - Train on domain-specific data
+   - Improved accuracy
+   - Higher latency/complexity
+## 📊 Performance Considerations
+- **FAISS Index**: Supports millions of embeddings
+- **Database**: SQLite suitable for 100k+ photos; PostgreSQL for larger scale
+- **Embeddings**: 512-dim vectors (adjustable)
+- **Search**: <100ms for 100k+ embeddings on CPU
+## 🚨 Troubleshooting
+### FAISS Installation Issues
+```bash
+# If faiss-cpu fails, try:
+pip install faiss-cpu==1.7.4 --no-cache-dir
+```
+### SQLite Lock Error
+```bash
+# Restart the application first; as a last resort, remove the database
+# (this deletes all stored photo metadata)
+rm photos.db
+```
+### Docker Build Issues
+```bash
+# Rebuild without cache
+docker compose build --no-cache
+```
+## 🔐 Security Notes
+- ⚠️ Currently no authentication - add for production
+- ⚠️ CORS allows all origins - restrict for production
+- ⚠️ File upload validation needed - add size limits
+- ⚠️ Use PostgreSQL + proper secrets management for production
+## 📝 Next Steps
+1. ✅ Core backend working
+2. ⬜ Add authentication (JWT)
+3. ⬜ Implement real AI models (CLIP/BLIP)
+4. ⬜ Add background job processing (Celery)
+5. ⬜ Frontend dashboard
+6. ⬜ Production deployment (Railway/AWS)
+## 📄 License
+MIT License
+## 🤝 Contributing
+Contributions welcome! Please test thoroughly before submitting.
 ---
+**Questions?** Check the interactive docs at `/docs` or review the code comments.

app.py CHANGED

@@ -1,7 +1,95 @@
 from fastapi import FastAPI
-app = FastAPI()
-@app.get("/")
-def greet_json():
-    return {"Hello": "World!"}

+"""FastAPI application entry point"""
 from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+from contextlib import asynccontextmanager
+from cloudzy.database import create_db_and_tables
+from cloudzy.routes import upload, photo, search
+from cloudzy.search_engine import SearchEngine
+# Initialize search engine at startup
+search_engine = None
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    """Manage app lifecycle - startup and shutdown"""
+    # Startup
+    print("🚀 Starting Cloudzy AI service...")
+    create_db_and_tables()
+    # Initialize search engine
+    global search_engine
+    search_engine = SearchEngine()
+    stats = search_engine.get_stats()
+    print(f"📊 FAISS Index loaded: {stats}")
+    print("✅ Application ready!")
+    yield
+    # Shutdown
+    print("🛑 Shutting down Cloudzy AI service...")
+# Create FastAPI app
+app = FastAPI(
+    title="Cloudzy AI",
+    description="Cloud photo management with AI tagging, captioning, and semantic search",
+    version="1.0.0",
+    lifespan=lifespan,
+)
+# Add CORS middleware
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+# Include routers
+app.include_router(upload.router)
+app.include_router(photo.router)
+app.include_router(search.router)
+@app.get("/", tags=["info"])
+async def root():
+    """Root endpoint - API info"""
+    return {
+        "service": "Cloudzy AI",
+        "version": "1.0.0",
+        "description": "Cloud photo management with AI tagging, captioning, and semantic search",
+        "endpoints": {
+            "upload": "POST /upload - Upload a photo",
+            "get_photo": "GET /photo/{id} - Get photo metadata",
+            "list_photos": "GET /photos - List all photos",
+            "search": "GET /search?q=... - Semantic search",
+            "image_to_image": "POST /search/image-to-image - Similar images",
+            "docs": "/docs - Interactive API documentation",
+        }
+    }
+@app.get("/health", tags=["info"])
+async def health_check():
+    """Health check endpoint"""
+    global search_engine
+    stats = search_engine.get_stats() if search_engine else {}
+    return {
+        "status": "healthy",
+        "service": "Cloudzy AI",
+        "search_engine": stats,
+    }
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(
+        "app:app",
+        host="0.0.0.0",
+        port=8000,
+        reload=True,
+    )
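
The lifespan hook builds one `SearchEngine` at startup, while the route modules later in this commit construct a fresh `SearchEngine()` on every request. One way to share a single instance is a cached factory that routes could receive through `Depends`. A minimal sketch, assuming a stand-in `FakeSearchEngine` class (hypothetical, used so the example runs without FAISS) and a hypothetical `get_search_engine` helper:

```python
from functools import lru_cache


class FakeSearchEngine:
    """Hypothetical stand-in for cloudzy.search_engine.SearchEngine."""

    instances_created = 0

    def __init__(self) -> None:
        # Count constructions so the sharing behaviour is observable.
        FakeSearchEngine.instances_created += 1


@lru_cache(maxsize=1)
def get_search_engine() -> FakeSearchEngine:
    """Build the engine on first use and return the same object afterwards."""
    return FakeSearchEngine()


first = get_search_engine()
second = get_search_engine()
```

Both calls return the same object, so an index loaded once would be reused rather than re-read from disk per request.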

cloudzy/__init__.py ADDED

	@@ -0,0 +1 @@


+"""Cloudzy AI - Cloud photo management service"""

cloudzy/ai_utils.py ADDED

	@@ -0,0 +1,72 @@

+"""AI utilities for generating tags, captions, and embeddings"""
+import numpy as np
+from typing import List, Tuple
+import random
+def generate_tags(filename: str) -> List[str]:
+    """
+    Generate tags for an image based on filename.
+    In production, this would use CLIP or similar models.
+    Currently using placeholder logic.
+    """
+    # Extract meaningful words from filename
+    name_parts = filename.lower().replace("_", " ").replace("-", " ").split()
+    name_parts = [p.replace(".jpg", "").replace(".png", "").replace(".jpeg", "")
+                  for p in name_parts if p]
+    # Common image tags for demo
+    common_tags = [
+        "photo", "image", "landscape", "portrait", "nature", "architecture",
+        "people", "animal", "food", "object", "abstract", "text", "sunset",
+        "mountain", "beach", "forest", "urban", "indoor", "outdoor"
+    ]
+    # Select random subset of common tags + filename parts
+    tags = list(set(name_parts[:2] + random.sample(common_tags, min(3, len(common_tags)))))
+    return tags[:5]  # Return up to 5 tags
+def generate_caption(filename: str, tags: List[str]) -> str:
+    """
+    Generate a caption for an image.
+    In production, this would use BLIP or similar models.
+    Currently using placeholder logic.
+    """
+    caption_templates = [
+        "A beautiful {tag} photograph",
+        "Captured moment: {tag}",
+        "Scenic view of {tag}",
+        "Amazing {tag} scene",
+        "Photography: {tag} collection",
+    ]
+    tag = tags[0] if tags else "image"
+    template = random.choice(caption_templates)
+    return template.format(tag=tag)
+def generate_embedding(filename: str, tags: List[str], caption: str) -> np.ndarray:
+    """
+    Generate a 512-dimensional embedding for semantic search.
+    In production, this would use CLIP or similar models.
+    Currently using placeholder random embeddings (reproducible from filename).
+    """
+    # Create a reproducible random embedding based on filename
+    # In production: use CLIP or similar to generate real embeddings
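+    # hash() is salted per process and random.seed() does not seed NumPy,
+    # so derive the seed from the filename bytes and use a NumPy generator
+    seed = int.from_bytes(filename.encode("utf-8"), "big")
+    embedding = np.random.default_rng(seed).standard_normal(512).astype(np.float32)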
+    # Normalize to unit vector
+    embedding = embedding / np.linalg.norm(embedding)
+    return embedding
+def generate_filename_embedding(filename: str) -> np.ndarray:
+    """
+    Generate a deterministic embedding from filename for testing.
+    Ensures same filename always gets same embedding.
+    """
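+    # Seed a NumPy generator from the filename bytes (stable across processes)
+    seed = int.from_bytes(filename.encode("utf-8"), "big")
+    embedding = np.random.default_rng(seed).standard_normal(512).astype(np.float32)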
+    embedding = embedding / np.linalg.norm(embedding)
+    return embedding
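
A placeholder embedding is only reproducible if its seed is stable across processes. Python's built-in `hash()` for strings is salted per process by default, so a digest such as SHA-256 is a safer seed source. A minimal standard-library sketch of the idea, assuming a hypothetical `stable_embedding` helper and using `random.Random` in place of NumPy so it runs without dependencies:

```python
import hashlib
import math
import random
from typing import List


def stable_embedding(filename: str, dim: int = 512) -> List[float]:
    """Return a unit-length pseudo-random vector derived only from `filename`."""
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)  # private generator; global random state untouched
    vector = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


a = stable_embedding("sunset_beach.jpg")
b = stable_embedding("sunset_beach.jpg")
c = stable_embedding("mountain.jpg")
```

Here `a` and `b` are identical, while `c` differs because it is derived from a different filename.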

cloudzy/database.py ADDED

	@@ -0,0 +1,26 @@

+"""Database configuration and session management"""
+from sqlmodel import SQLModel, create_engine, Session
+from typing import Generator
+import os
+DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///./photos.db")
+# SQLite-specific connect_args
+connect_args = {"check_same_thread": False} if "sqlite" in DATABASE_URL else {}
+engine = create_engine(
+    DATABASE_URL,
+    echo=False,
+    connect_args=connect_args,
+)
+def create_db_and_tables():
+    """Create all database tables"""
+    SQLModel.metadata.create_all(engine)
+def get_session() -> Generator[Session, None, None]:
+    """Dependency for getting database session"""
+    with Session(engine) as session:
+        yield session

cloudzy/models.py ADDED

	@@ -0,0 +1,40 @@

+"""SQLModel database models"""
+from sqlmodel import SQLModel, Field
+from typing import Optional
+from datetime import datetime
+import json
+class Photo(SQLModel, table=True):
+    """Photo metadata model"""
+    id: Optional[int] = Field(default=None, primary_key=True)
+    filename: str = Field(index=True)
+    filepath: str  # Full path to stored image
+    tags: str = Field(default="[]")  # JSON string of tags
+    caption: str = Field(default="")
+    embedding: Optional[str] = Field(default=None)  # JSON string of embedding vector
+    created_at: datetime = Field(default_factory=datetime.utcnow)
+    def get_tags(self) -> list[str]:
+        """Parse tags from JSON string"""
+        try:
+            return json.loads(self.tags)
+        except (json.JSONDecodeError, TypeError):
+            return []
+    def set_tags(self, tags: list[str]):
+        """Store tags as JSON string"""
+        self.tags = json.dumps(tags)
+    def get_embedding(self) -> Optional[list[float]]:
+        """Parse embedding from JSON string"""
+        try:
+            if self.embedding:
+                return json.loads(self.embedding)
+        except (json.JSONDecodeError, TypeError):
+            pass
+        return None
+    def set_embedding(self, embedding: list[float]):
+        """Store embedding as JSON string"""
+        self.embedding = json.dumps(embedding)
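
The `Photo` model stores tags and embeddings as JSON text columns. The round trip can be sketched without a database. `parse_json_list` is a hypothetical helper, and catching only `json.JSONDecodeError` and `TypeError` is a choice made here so that unrelated errors are not silently swallowed:

```python
import json
from typing import Any, List


def parse_json_list(raw: Any) -> List[Any]:
    """Decode a JSON array stored as text; fall back to [] on bad input."""
    try:
        value = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return []
    return value if isinstance(value, list) else []


stored = json.dumps(["nature", "mountain"])
tags = parse_json_list(stored)
broken = parse_json_list("not json")
missing = parse_json_list(None)
```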

cloudzy/routes/__init__.py ADDED

	@@ -0,0 +1 @@


+"""API routes"""

cloudzy/routes/photo.py ADDED

	@@ -0,0 +1,69 @@

+"""Photo retrieval endpoints"""
+from fastapi import APIRouter, Depends, HTTPException
+from sqlmodel import Session, select
+from cloudzy.database import get_session
+from cloudzy.models import Photo
+from cloudzy.schemas import PhotoDetailResponse
+router = APIRouter(tags=["photos"])
+@router.get("/photo/{photo_id}", response_model=PhotoDetailResponse)
+async def get_photo(
+    photo_id: int,
+    session: Session = Depends(get_session),
+):
+    """
+    Get photo metadata by ID.
+    Returns: Photo metadata including tags, caption, embedding info
+    """
+    statement = select(Photo).where(Photo.id == photo_id)
+    photo = session.exec(statement).first()
+    if not photo:
+        raise HTTPException(status_code=404, detail=f"Photo {photo_id} not found")
+    return PhotoDetailResponse(
+        id=photo.id,
+        filename=photo.filename,
+        tags=photo.get_tags(),
+        caption=photo.caption,
+        embedding=photo.get_embedding(),
+        created_at=photo.created_at,
+    )
+@router.get("/photos", response_model=list[PhotoDetailResponse])
+async def list_photos(
+    skip: int = 0,
+    limit: int = 10,
+    session: Session = Depends(get_session),
+):
+    """
+    List all photos with pagination.
+    Args:
+        skip: Number of photos to skip (pagination)
+        limit: Max photos to return (default 10)
+    Returns: List of photo metadata
+    """
+    if limit > 100:
+        limit = 100  # Cap limit at 100
+    statement = select(Photo).offset(skip).limit(limit)
+    photos = session.exec(statement).all()
+    return [
+        PhotoDetailResponse(
+            id=photo.id,
+            filename=photo.filename,
+            tags=photo.get_tags(),
+            caption=photo.caption,
+            embedding=photo.get_embedding(),
+            created_at=photo.created_at,
+        )
+        for photo in photos
+    ]

cloudzy/routes/search.py ADDED

	@@ -0,0 +1,125 @@

+"""Semantic search endpoint using FAISS"""
+from fastapi import APIRouter, Query, Depends, HTTPException
+from sqlmodel import Session, select
+import numpy as np
+from cloudzy.database import get_session
+from cloudzy.models import Photo
+from cloudzy.schemas import SearchResponse, SearchResult
+from cloudzy.search_engine import SearchEngine
+from cloudzy.ai_utils import generate_filename_embedding
+router = APIRouter(tags=["search"])
+@router.get("/search", response_model=SearchResponse)
+async def search_photos(
+    q: str = Query(..., min_length=1, max_length=200, description="Search query"),
+    top_k: int = Query(5, ge=1, le=50, description="Number of results"),
+    session: Session = Depends(get_session),
+):
+    """
+    Semantic search for similar photos using FAISS.
+    Converts query to embedding and finds most similar images.
+    Args:
+        q: Search query (used to generate embedding)
+        top_k: Number of results to return (max 50)
+    Returns: List of similar photos with distance scores
+    """
+    # Generate embedding for query
+    query_embedding = generate_filename_embedding(q)
+    # Search in FAISS
+    search_engine = SearchEngine()
+    search_results = search_engine.search(query_embedding, top_k=top_k)
+    if not search_results:
+        return SearchResponse(
+            query=q,
+            results=[],
+            total_results=0,
+        )
+    # Fetch photo details from database
+    result_objects = []
+    for photo_id, distance in search_results:
+        statement = select(Photo).where(Photo.id == photo_id)
+        photo = session.exec(statement).first()
+        if photo:  # Only include if photo exists in DB
+            result_objects.append(
+                SearchResult(
+                    photo_id=photo.id,
+                    filename=photo.filename,
+                    tags=photo.get_tags(),
+                    caption=photo.caption,
+                    distance=distance,
+                )
+            )
+    return SearchResponse(
+        query=q,
+        results=result_objects,
+        total_results=len(result_objects),
+    )
+@router.post("/search/image-to-image")
+async def image_to_image_search(
+    reference_photo_id: int = Query(..., description="Reference photo ID"),
+    top_k: int = Query(5, ge=1, le=50),
+    session: Session = Depends(get_session),
+):
+    """
+    Find similar images to a reference photo (image-to-image search).
+    Args:
+        reference_photo_id: ID of the reference photo
+        top_k: Number of similar results
+    Returns: Similar photos
+    """
+    # Get reference photo
+    statement = select(Photo).where(Photo.id == reference_photo_id)
+    reference_photo = session.exec(statement).first()
+    if not reference_photo:
+        raise HTTPException(status_code=404, detail=f"Photo {reference_photo_id} not found")
+    # Get reference embedding
+    reference_embedding = reference_photo.get_embedding()
+    if not reference_embedding:
+        raise HTTPException(status_code=400, detail="Photo has no embedding")
+    # Search in FAISS
+    search_engine = SearchEngine()
+    search_results = search_engine.search(
+        np.array(reference_embedding, dtype=np.float32),
+        top_k=top_k + 1  # +1 to skip the reference photo itself
+    )
+    # Build results, skipping the reference photo by ID rather than by rank
+    result_objects = []
+    for photo_id, distance in search_results:
+        if photo_id == reference_photo_id or len(result_objects) >= top_k:
+            continue
+        statement = select(Photo).where(Photo.id == photo_id)
+        photo = session.exec(statement).first()
+        if photo:
+            result_objects.append(
+                SearchResult(
+                    photo_id=photo.id,
+                    filename=photo.filename,
+                    tags=photo.get_tags(),
+                    caption=photo.caption,
+                    distance=distance,
+                )
+            )
+    return SearchResponse(
+        query=f"Similar to photo {reference_photo_id}",
+        results=result_objects[:top_k],
+        total_results=len(result_objects),
+    )
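
Image-to-image search asks the index for `top_k + 1` neighbours so that the reference photo can be dropped from the results. With duplicate or tied embeddings the reference is not guaranteed to rank first, so filtering by ID is a more robust way to drop it. A sketch, assuming a hypothetical `drop_reference` helper that operates on the `(photo_id, distance)` tuples returned by `SearchEngine.search`:

```python
from typing import List, Tuple


def drop_reference(
    results: List[Tuple[int, float]], reference_id: int, top_k: int
) -> List[Tuple[int, float]]:
    """Remove the reference photo wherever it ranks, then keep top_k items."""
    filtered = [(pid, dist) for pid, dist in results if pid != reference_id]
    return filtered[:top_k]


# Photo 7 is an exact duplicate of reference photo 3 and happens to rank first.
hits = [(7, 0.0), (3, 0.0), (9, 0.4), (2, 0.9)]
similar = drop_reference(hits, reference_id=3, top_k=2)
```

A positional `[1:]` slice would have discarded photo 7 and kept the reference itself in this case.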

cloudzy/routes/upload.py ADDED

	@@ -0,0 +1,90 @@

+"""Upload endpoint for photos"""
+from fastapi import APIRouter, UploadFile, File, Depends, HTTPException, BackgroundTasks
+from sqlmodel import Session
+from pathlib import Path
+import numpy as np
+from cloudzy.database import get_session
+from cloudzy.models import Photo
+from cloudzy.schemas import UploadResponse
+from cloudzy.utils.file_utils import save_uploaded_file
+from cloudzy.ai_utils import generate_tags, generate_caption, generate_embedding
+from cloudzy.search_engine import SearchEngine
+router = APIRouter(tags=["photos"])
+# Allowed image extensions
+ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}
+def validate_image_file(filename: str) -> bool:
+    """Check if file has valid image extension"""
+    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS
+@router.post("/upload", response_model=UploadResponse)
+async def upload_photo(
+    file: UploadFile = File(...),
+    session: Session = Depends(get_session),
+    background_tasks: BackgroundTasks = None,
+):
+    """
+    Upload a photo and analyze it with AI.
+    - Validates file type
+    - Saves file to disk
+    - Generates tags, caption, and embedding
+    - Stores metadata in database
+    - Indexes embedding in FAISS
+    Returns: Photo metadata with ID
+    """
+    # Validate file
+    if not file.filename:
+        raise HTTPException(status_code=400, detail="No filename provided")
+    if not validate_image_file(file.filename):
+        raise HTTPException(
+            status_code=400,
+            detail=f"Invalid file type. Allowed: {', '.join(sorted(ALLOWED_EXTENSIONS))}"
+        )
+    # Read file content
+    content = await file.read()
+    if not content:
+        raise HTTPException(status_code=400, detail="Empty file")
+    # Save file to disk
+    saved_filename = save_uploaded_file(content, file.filename)
+    filepath = f"uploads/{saved_filename}"
+    # Generate AI analysis
+    tags = generate_tags(file.filename)
+    caption = generate_caption(file.filename, tags)
+    embedding = generate_embedding(file.filename, tags, caption)
+    # Create photo record
+    photo = Photo(
+        filename=saved_filename,
+        filepath=filepath,
+        caption=caption,
+    )
+    photo.set_tags(tags)
+    photo.set_embedding(embedding.tolist())
+    # Save to database
+    session.add(photo)
+    session.commit()
+    session.refresh(photo)
+    # Index in FAISS (in background if needed)
+    search_engine = SearchEngine()
+    search_engine.add_embedding(photo.id, embedding)
+    return UploadResponse(
+        id=photo.id,
+        filename=saved_filename,
+        tags=tags,
+        caption=caption,
+        message=f"Photo uploaded successfully with ID {photo.id}"
+    )
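
The upload route validates only the file extension and rejects empty files. A size cap and a check of the leading bytes are two common additions. A sketch under stated assumptions: `MAX_UPLOAD_BYTES`, `SIGNATURES`, and `check_upload` are hypothetical names, the 10 MB limit is arbitrary, and only JPEG, PNG, and GIF signatures are covered:

```python
from pathlib import Path
from typing import Optional

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit, not taken from the project

# Leading "magic" bytes for a few formats (WebP omitted for brevity).
SIGNATURES = {
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".gif": (b"GIF87a", b"GIF89a"),
}


def check_upload(filename: str, content: bytes) -> Optional[str]:
    """Return an error message, or None when the upload looks acceptable."""
    ext = Path(filename).suffix.lower()
    if ext not in SIGNATURES:
        return "unsupported extension"
    if not content:
        return "empty file"
    if len(content) > MAX_UPLOAD_BYTES:
        return "file too large"
    if not content.startswith(SIGNATURES[ext]):
        return "content does not match extension"
    return None


ok = check_upload("cat.png", b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
spoofed = check_upload("cat.png", b"<html>not an image</html>")
too_big = check_upload("cat.jpg", b"\xff\xd8\xff" + b"\x00" * MAX_UPLOAD_BYTES)
```

In a route, a non-`None` result would typically be turned into an `HTTPException` with status 400 or 413.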

cloudzy/schemas.py ADDED

	@@ -0,0 +1,49 @@

+"""Pydantic response schemas"""
+from pydantic import BaseModel
+from typing import Optional, List
+from datetime import datetime
+class PhotoResponse(BaseModel):
+    """Response model for photo metadata"""
+    id: int
+    filename: str
+    tags: List[str]
+    caption: str
+    created_at: datetime
+    class Config:
+        from_attributes = True
+class PhotoDetailResponse(PhotoResponse):
+    """Detailed photo response with embedding info"""
+    embedding: Optional[List[float]] = None
+class SearchResult(BaseModel):
+    """Search result with similarity score"""
+    photo_id: int
+    filename: str
+    tags: List[str]
+    caption: str
+    distance: float  # Squared L2 distance as returned by FAISS IndexFlatL2 (lower is more similar)
+    class Config:
+        from_attributes = True
+class SearchResponse(BaseModel):
+    """Response for search endpoint"""
+    query: str
+    results: List[SearchResult]
+    total_results: int
+class UploadResponse(BaseModel):
+    """Response after uploading a photo"""
+    id: int
+    filename: str
+    tags: List[str]
+    caption: str
+    message: str
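
FAISS's `IndexFlatL2` reports squared Euclidean distances. Because the embeddings in this project are normalised to unit length, the `distance` field maps directly to cosine similarity through `||a - b||^2 = 2 - 2*cos(theta)`. A small sketch, with `squared_l2` and `cosine_from_squared_l2` as hypothetical helper names:

```python
import math
from typing import Sequence


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_from_squared_l2(distance: float) -> float:
    """Convert a squared L2 distance to cosine similarity (unit vectors only)."""
    return 1.0 - distance / 2.0


x_axis = [1.0, 0.0]
y_axis = [0.0, 1.0]
diagonal = [math.sqrt(0.5), math.sqrt(0.5)]

d_orthogonal = squared_l2(x_axis, y_axis)
d_diagonal = squared_l2(x_axis, diagonal)
```

Orthogonal unit vectors sit at squared distance 2 (cosine 0), and identical vectors at 0 (cosine 1), which gives API clients a bounded score if one is preferred over a raw distance.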

cloudzy/search_engine.py ADDED

	@@ -0,0 +1,85 @@

+"""FAISS-based semantic search engine"""
+import faiss
+import numpy as np
+from typing import List, Tuple, Optional
+import os
+class SearchEngine:
+    """FAISS-based search engine for image embeddings"""
+    def __init__(self, dim: int = 512, index_path: str = "faiss_index.bin"):
+        self.dim = dim
+        self.index_path = index_path
+        self.ids_path = index_path + ".ids.npy"  # Sidecar file for the photo ID map
+        self.id_map: List[int] = []  # Map FAISS indices to photo IDs
+        # Load existing index and ID map, or create a new index
+        if os.path.exists(index_path):
+            self.load()
+        else:
+            self.index = faiss.IndexFlatL2(dim)
+    def add_embedding(self, photo_id: int, embedding: np.ndarray) -> None:
+        """
+        Add an embedding to the index.
+        Args:
+            photo_id: Unique photo identifier
+            embedding: 1D numpy array of shape (dim,)
+        """
+        # Ensure embedding is float32 and correct shape
+        embedding = embedding.astype(np.float32).reshape(1, -1)
+        # Add to FAISS index
+        self.index.add(embedding)
+        # Track photo ID
+        self.id_map.append(photo_id)
+        # Save index to disk
+        self.save()
+    def search(self, query_embedding: np.ndarray, top_k: int = 5) -> List[Tuple[int, float]]:
+        """
+        Search for similar embeddings.
+        Args:
+            query_embedding: 1D numpy array of shape (dim,)
+            top_k: Number of results to return
+        Returns:
+            List of (photo_id, distance) tuples
+        """
+        if self.index.ntotal == 0:
+            return []
+        # Ensure query is float32 and correct shape
+        query_embedding = query_embedding.astype(np.float32).reshape(1, -1)
+        # Search in FAISS index
+        distances, indices = self.index.search(query_embedding, min(top_k, self.index.ntotal))
+        # Map back to photo IDs
+        results = [
+            (self.id_map[int(idx)], float(distance))
+            for distance, idx in zip(distances[0], indices[0])
+        ]
+        return results
+    def save(self) -> None:
+        """Save index and ID map to disk"""
+        faiss.write_index(self.index, self.index_path)
+        # Persist the ID map too; a reloaded index is unusable without it
+        np.save(self.ids_path, np.array(self.id_map, dtype=np.int64))
+    def load(self) -> None:
+        """Load index and ID map from disk"""
+        if os.path.exists(self.index_path):
+            self.index = faiss.read_index(self.index_path)
+        if os.path.exists(self.ids_path):
+            self.id_map = np.load(self.ids_path).tolist()
+    def get_stats(self) -> dict:
+        """Get index statistics"""
+        return {
+            "total_embeddings": self.index.ntotal,
+            "dimension": self.dim,
+            "id_map_size": len(self.id_map)
+        }
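
`IndexFlatL2` performs an exhaustive search: every stored vector is compared with the query, and the closest `top_k` are returned together with their row positions, which `id_map` then translates to photo IDs. The same flow can be sketched in plain Python for intuition. `TinyFlatIndex` is a hypothetical toy class, not part of FAISS, and it scales far worse than the real index:

```python
import heapq
from typing import List, Sequence, Tuple


class TinyFlatIndex:
    """Toy exhaustive index that mirrors the add/search flow of SearchEngine."""

    def __init__(self) -> None:
        self.vectors: List[Sequence[float]] = []
        self.id_map: List[int] = []  # row position -> photo ID

    def add(self, photo_id: int, vector: Sequence[float]) -> None:
        self.vectors.append(vector)
        self.id_map.append(photo_id)

    def search(self, query: Sequence[float], top_k: int) -> List[Tuple[int, float]]:
        # Squared L2 distance from the query to every stored vector.
        scored = (
            (sum((q - v) ** 2 for q, v in zip(query, vec)), row)
            for row, vec in enumerate(self.vectors)
        )
        nearest = heapq.nsmallest(top_k, scored)
        return [(self.id_map[row], dist) for dist, row in nearest]


index = TinyFlatIndex()
index.add(101, [0.0, 0.0])
index.add(102, [1.0, 0.0])
index.add(103, [5.0, 5.0])
hits = index.search([0.9, 0.0], top_k=2)
```

The row-to-ID list is as essential as the vectors themselves: if it is lost, the row positions returned by the index can no longer be mapped back to photos.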

cloudzy/utils/__init__.py ADDED

	@@ -0,0 +1 @@


+"""Utility modules"""

cloudzy/utils/file_utils.py ADDED

	@@ -0,0 +1,59 @@

+"""File handling utilities"""
+import os
+import shutil
+from pathlib import Path
+from datetime import datetime
+UPLOAD_DIR = "uploads"
+def ensure_upload_dir():
+    """Ensure uploads directory exists"""
+    Path(UPLOAD_DIR).mkdir(exist_ok=True)
+def save_uploaded_file(file_content: bytes, original_filename: str) -> str:
+    """
+    Save uploaded file with timestamp to ensure uniqueness.
+    Args:
+        file_content: File bytes
+        original_filename: Original filename
+    Returns:
+        Saved filename
+    """
+    ensure_upload_dir()
+    # Generate unique filename with timestamp
+    timestamp = datetime.utcnow().strftime("%Y%m%d_%H%M%S_%f")[:-3]
+    # Strip directory components so a crafted filename cannot escape UPLOAD_DIR
+    name, ext = os.path.splitext(os.path.basename(original_filename))
+    saved_filename = f"{name}_{timestamp}{ext}"
+    filepath = os.path.join(UPLOAD_DIR, saved_filename)
+    # Write file
+    with open(filepath, "wb") as f:
+        f.write(file_content)
+    return saved_filename
+def get_file_path(filename: str) -> str:
+    """Get full path for a saved file"""
+    return os.path.join(UPLOAD_DIR, filename)
+def file_exists(filename: str) -> bool:
+    """Check if a saved file exists"""
+    return os.path.exists(get_file_path(filename))
+def delete_file(filename: str) -> bool:
+    """Delete a saved file"""
+    filepath = get_file_path(filename)
+    if os.path.exists(filepath):
+        os.remove(filepath)
+        return True
+    return False
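
`save_uploaded_file` builds the stored name from the client-supplied filename. Client filenames can contain directory components, so stripping them before joining with the upload directory avoids writing outside it. A sketch, with `safe_stored_name` as a hypothetical helper and a UUID suffix assumed in place of the timestamp:

```python
import uuid
from pathlib import PurePosixPath, PureWindowsPath


def safe_stored_name(original_filename: str) -> str:
    """Drop any directory parts and append a random suffix for uniqueness."""
    # PureWindowsPath treats both "/" and "\\" as separators, so it strips
    # directory components from POSIX-style and Windows-style names alike.
    base = PureWindowsPath(original_filename).name
    stem = PurePosixPath(base).stem or "upload"
    suffix = PurePosixPath(base).suffix.lower()
    return f"{stem}_{uuid.uuid4().hex}{suffix}"


name = safe_stored_name("../../etc/passwd.JPG")
```

A random suffix also avoids collisions when two uploads with the same name arrive within the same millisecond, which a timestamp alone cannot rule out.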

requirements copy.txt ADDED

	@@ -0,0 +1,11 @@

+fastapi==0.109.0
+uvicorn[standard]==0.27.0
+sqlmodel==0.0.16
+pillow==10.1.0
+numpy==1.26.3
+scikit-learn==1.3.2
+faiss-cpu==1.8.0
+python-multipart==0.0.6
+pydantic==2.6.1
+pydantic-settings==2.1.0
+setuptools>=68.0

requirements.txt CHANGED

@@ -1,2 +1,11 @@
-fastapi
-uvicorn[standard]

+fastapi==0.109.0
+uvicorn[standard]==0.27.0
+sqlmodel==0.0.16
+pillow==10.1.0
+numpy==1.26.3
+scikit-learn==1.3.2
+faiss-cpu==1.8.0
+python-multipart==0.0.6
+pydantic==2.6.1
+pydantic-settings==2.1.0
+setuptools>=68.0