# Cloudzy AI - Cloud Photo Management Service
A FastAPI-based cloud photo management service with AI tagging, captioning, and semantic search using FAISS.
## Features
- **Photo Upload** - Upload images with automatic metadata generation
- **AI Analysis** - Automatic tag and caption generation
- **Semantic Search** - FAISS-powered similarity search on embeddings
- **Image-to-Image Search** - Find similar photos to a reference image
- **RESTful API** - Full REST API with automatic documentation
- **Docker Support** - Production-ready Docker and Docker Compose setup
## Tech Stack
- **Backend**: FastAPI
- **Database**: SQLModel + SQLite (PostgreSQL ready)
- **Search Engine**: FAISS (Facebook AI Similarity Search)
- **Image Processing**: Pillow
- **ORM**: SQLModel
- **API Documentation**: Swagger/OpenAPI
## Prerequisites
- Python 3.10+
- Docker & Docker Compose (optional)
- 2 GB+ RAM for the FAISS index
## Installation
### Local Development
1. **Clone and set up**
```bash
cd image_embedder
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Create uploads directory**
```bash
mkdir -p uploads
```
4. **Run the server**
```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
The server starts at `http://localhost:8000`.
### Docker
```bash
# Build and run
docker compose up --build
# Run in background
docker compose up -d
# View logs
docker compose logs -f cloudzy_api
# Stop
docker compose down
```
## API Endpoints
### Upload Photo
```bash
POST /upload
Content-Type: multipart/form-data
# Returns:
{
  "id": 1,
  "filename": "photo_20231023_120000.jpg",
  "tags": ["nature", "landscape", "mountain"],
  "caption": "A beautiful nature photograph",
  "message": "Photo uploaded successfully with ID 1"
}
```
### Get Photo Metadata
```bash
GET /photo/{id}
# Returns:
{
  "id": 1,
  "filename": "photo_20231023_120000.jpg",
  "tags": ["nature", "landscape"],
  "caption": "A beautiful landscape",
  "embedding": [0.123, -0.456, ...],  # 512-dim vector
  "created_at": "2023-10-23T12:00:00"
}
```
### List All Photos
```bash
GET /photos?skip=0&limit=10
# Returns: List of photo objects with pagination
```
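Reading the whole collection means repeating the request with an increasing `skip`. The following is a minimal sketch, assuming the endpoint returns a plain JSON list and that a page shorter than `limit` marks the end of the data (neither is confirmed above). The names `http_fetch_page` and `iter_photos` are hypothetical:

```python
import json
from typing import Callable, Iterator, List
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"


def http_fetch_page(skip: int, limit: int) -> List[dict]:
    """Fetch one page from GET /photos (assumes the body is a JSON list)."""
    query = urlencode({"skip": skip, "limit": limit})
    with urlopen(f"{BASE_URL}/photos?{query}") as response:
        return json.load(response)


def iter_photos(
    fetch_page: Callable[[int, int], List[dict]] = http_fetch_page,
    limit: int = 10,
) -> Iterator[dict]:
    """Yield every photo by advancing `skip` until a short page comes back."""
    skip = 0
    while True:
        page = fetch_page(skip, limit)
        yield from page
        if len(page) < limit:  # assumed end-of-data signal
            break
        skip += limit
```

Passing the page fetcher as an argument keeps the loop testable without a running server; with the service up, `for photo in iter_photos(): ...` would use the HTTP fetcher.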
### Semantic Search
```bash
GET /search?q=mountain&top_k=5
# Returns:
{
  "query": "mountain",
  "results": [
    {
      "photo_id": 1,
      "filename": "photo_1.jpg",
      "tags": ["nature", "mountain"],
      "caption": "Mountain landscape",
      "distance": 0.123
    },
    ...
  ],
  "total_results": 5
}
```
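The `distance` field is what orders the results: a smaller value means the photo's embedding is closer to the query. The sketch below shows that ranking as an exhaustive nearest-neighbour search in plain Python, which is conceptually what a flat FAISS index such as `IndexFlatL2` does. It assumes squared L2 distance (the metric this service uses is not stated above), and `top_k_search` is a hypothetical helper, not part of the project:

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k_search(
    query: Sequence[float],
    embeddings: Dict[int, Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (photo_id, distance) pairs for the top_k closest embeddings."""
    scored = ((pid, squared_l2(query, vec)) for pid, vec in embeddings.items())
    # nsmallest returns the pairs sorted by ascending distance.
    return heapq.nsmallest(top_k, scored, key=lambda pair: pair[1])


# Toy 2-dim example; real embeddings in this project are 512-dim.
index = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [3.0, 4.0]}
nearest = top_k_search([0.9, 0.0], index, top_k=2)
print([photo_id for photo_id, _ in nearest])  # [2, 1]
```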
### Image-to-Image Search
```bash
POST /search/image-to-image?reference_photo_id=1&top_k=5
# Returns similar photos to reference photo 1
```
### Health Check
```bash
GET /health
# Returns service status and FAISS index stats
```
## API Documentation
**Interactive Docs (Swagger UI)**:
```
http://localhost:8000/docs
```
**Alternative Docs (ReDoc)**:
```
http://localhost:8000/redoc
```
## Project Structure
```
image_embedder/
├── app/
│   ├── __init__.py
│   ├── main.py              # FastAPI app entry point
│   ├── database.py          # SQLModel engine + session
│   ├── models.py            # Photo database model
│   ├── schemas.py           # Pydantic response models
│   ├── ai_utils.py          # AI generation (tags, captions, embeddings)
│   ├── search_engine.py     # FAISS index manager
│   │
│   ├── routes/
│   │   ├── __init__.py
│   │   ├── upload.py        # POST /upload endpoint
│   │   ├── photo.py         # GET /photo/:id and /photos endpoints
│   │   └── search.py        # GET /search and image-to-image endpoints
│   │
│   └── utils/
│       ├── __init__.py
│       └── file_utils.py    # File saving and management
│
├── uploads/                 # Stored images (created at runtime)
├── faiss_index.bin          # FAISS index file (created at runtime)
├── photos.db                # SQLite database (created at runtime)
│
├── requirements.txt         # Python dependencies
├── Dockerfile
├── docker-compose.yml
└── README.md
```
## Development Workflow
### Test Upload
```bash
# Use curl
curl -X POST -F "file=@/path/to/image.jpg" http://localhost:8000/upload
```
Or use Python (requires the `requests` package):
```python
import requests

with open("image.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/upload",
        files={"file": f},
    )
print(response.json())
```
### Test Search
```bash
# Query-based search
curl "http://localhost:8000/search?q=tree&top_k=5"
# Image-to-image search
curl -X POST "http://localhost:8000/search/image-to-image?reference_photo_id=1&top_k=5"
```
### View Database
```bash
# Install sqlite3 CLI and view database
sqlite3 photos.db
> .tables
> SELECT * FROM photo;
> .quit
```
## AI Features (Placeholder Phase)
Currently, AI functions use placeholder implementations:
- **Tags**: Generated from filename patterns + random selection from common tags
- **Captions**: Template-based generation from tags
- **Embeddings**: Deterministic random vectors (reproducible from filename)
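One way to get vectors that look random yet are reproducible from the filename is to seed a generator with a hash of the name. This is a sketch of the idea only, not the code in `app/ai_utils.py`; the name `placeholder_embedding`, the SHA-256 seeding, and the unit-length normalisation are all assumptions:

```python
import hashlib
import math
import random
from typing import List

EMBEDDING_DIM = 512  # matches the 512-dim vectors mentioned in this README


def placeholder_embedding(filename: str, dim: int = EMBEDDING_DIM) -> List[float]:
    """Return a pseudo-random unit vector that depends only on `filename`."""
    # Python's built-in hash() is salted per process, so derive the seed
    # from a stable digest instead.
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    rng = random.Random(seed)
    vector = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector]
```

The same filename always maps to the same vector, so search results stay stable across restarts, but the distances carry no visual meaning until a real model replaces the placeholder.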
### Upgrade Path (Production)
1. **CLIP Integration** (Recommended)
   - Zero-shot image understanding
   - Excellent for tagging and search
   - ~1-2 sec per image on GPU
2. **BLIP Integration** (Alternative)
   - Visual question answering
   - Better captions
   - ~2-3 sec per image on GPU
3. **Fine-tuned Models**
   - Train on domain-specific data
   - Improved accuracy
   - Higher latency/complexity
## Performance Considerations
- **FAISS Index**: Supports millions of embeddings
- **Database**: SQLite suitable for 100k+ photos; PostgreSQL for larger scale
- **Embeddings**: 512-dim vectors (adjustable)
- **Search**: <100ms for 100k+ embeddings on CPU
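For a rough sense of scale, the raw vector storage of an uncompressed index grows linearly with the number of photos. The estimate below assumes 4-byte float32 components and ignores any index overhead; `flat_index_bytes` is a hypothetical helper written for this illustration:

```python
def flat_index_bytes(num_vectors: int, dim: int = 512, bytes_per_value: int = 4) -> int:
    """Raw storage for `num_vectors` dense vectors, excluding index overhead."""
    return num_vectors * dim * bytes_per_value


for count in (100_000, 1_000_000):
    size_mb = flat_index_bytes(count) / 1_000_000
    print(f"{count:>9,} vectors -> {size_mb:,.1f} MB")
```

Under these assumptions, 100k photos need about 205 MB for the vectors alone and one million need about 2 GB.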
## Troubleshooting
### FAISS Installation Issues
```bash
# If faiss-cpu fails, try:
pip install faiss-cpu==1.7.4 --no-cache-dir
```
### SQLite Lock Error
```bash
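# First stop any other process using the database and restart the application.
# Last resort only: deleting the database erases all stored photo metadata.
rm photos.db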
```
### Docker Build Issues
```bash
# Rebuild without cache
docker compose build --no-cache
```
## Security Notes
- ⚠️ Currently no authentication - add for production
- ⚠️ CORS allows all origins - restrict for production
- ⚠️ File upload validation needed - add size limits
- ⚠️ Use PostgreSQL + proper secrets management for production
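As an illustration of the upload validation called for above, a check might reject files by extension and size before anything is written to disk. The allowed extensions, the 10 MiB cap, and the name `validate_upload` are assumptions chosen for this example, not values taken from the project:

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}  # assumed allow-list
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed 10 MiB cap


def validate_upload(filename: str, content: bytes) -> None:
    """Raise ValueError if the upload should be rejected."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {extension or '(none)'}")
    if not content:
        raise ValueError("empty file")
    if len(content) > MAX_UPLOAD_BYTES:
        raise ValueError(f"file exceeds {MAX_UPLOAD_BYTES} bytes")
```

In a route handler, the `ValueError` would typically be translated into a 4xx response. An extension check alone does not prove the bytes are an image, so decoding the file with Pillow is a sensible additional step.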
## Next Steps
1. ✅ Core backend working
2. ⬜ Add authentication (JWT)
3. ⬜ Implement real AI models (CLIP/BLIP)
4. ⬜ Add background job processing (Celery)
5. ⬜ Frontend dashboard
6. ⬜ Production deployment (Railway/AWS)
## License
MIT License
## Contributing
Contributions welcome! Please test thoroughly before submitting.
---
**Questions?** Check the interactive docs at `/docs` or review the code comments.