CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

This is a Hugging Face Space that serves as the complete backend for the Piclets Discovery game. It orchestrates AI services, handles Piclet generation, and manages persistent storage.

Core Concept: Each real-world object has ONE canonical Piclet! Players scan objects with photos, and the server generates Pokemon-style creatures using AI, tracking canonical discoveries and variations (e.g., "velvet pillow" is a variation of the canonical "pillow").

Architecture Philosophy: The server handles ALL AI orchestration securely. The frontend is a pure UI that makes a single API call per scan. This prevents client-side manipulation and ensures fair play.

Architecture

Storage System

HuggingFace Dataset: Fraser/piclets (public dataset repository)

Structure:

piclets/
  {normalized_object_name}.json  # e.g., pillow.json
users/
  {username}.json                 # User profiles
metadata/
  stats.json                      # Global statistics
  leaderboard.json               # Top discoverers

Object Normalization

Objects are normalized for consistent storage:

Convert to lowercase
Remove articles (the, a, an)
Handle pluralization (pillows → pillow)
Replace spaces with underscores
Remove special characters

Examples:

"The Blue Pillow" → pillow
"wooden chairs" → wooden_chair
"glasses" → glass (special case handling)
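
As an illustration of the normalization steps listed above, here is a minimal Python sketch. The function names, the plural rules, and the SPECIAL_SINGULARS table are assumptions made for this example, not the server's actual code. The sketch covers only the five listed steps; it does not strip descriptive words, so the reduction of "The Blue Pillow" to pillow is assumed to happen elsewhere (for example when attributes are split off as a variation).

```python
import re

ARTICLES = {"the", "a", "an"}
# Hypothetical special-case table; the real list is not shown in this file.
SPECIAL_SINGULARS = {"glasses": "glass"}


def singularize(word: str) -> str:
    """Very small plural handler, good enough for simple English nouns."""
    if word in SPECIAL_SINGULARS:
        return SPECIAL_SINGULARS[word]
    if word.endswith("ies") and len(word) > 4:
        return word[:-3] + "y"          # berries -> berry
    if word.endswith(("ches", "shes", "sses", "xes")):
        return word[:-2]                # boxes -> box
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]                # pillows -> pillow
    return word


def normalize_object_name(name: str) -> str:
    text = name.lower()                              # lowercase
    text = re.sub(r"[^a-z0-9\s]", "", text)          # drop special characters
    words = [w for w in text.split() if w not in ARTICLES]  # drop articles
    if words:
        words[-1] = singularize(words[-1])           # singularize the head noun
    return "_".join(words)                           # spaces -> underscores
```

With these rules, normalize_object_name("wooden chairs") gives wooden_chair and normalize_object_name("glasses") gives glass.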

Piclet Data Structure

{
  "canonical": {
    "objectName": "pillow",
    "typeId": "pillow_canonical",
    "discoveredBy": "username",
    "discoveredAt": "2024-07-26T10:30:00",
    "scanCount": 42,
    "picletData": {
      // Full Piclet instance data
    }
  },
  "variations": [
    {
      "typeId": "pillow_001",
      "attributes": ["velvet", "blue"],
      "discoveredBy": "username2",
      "discoveredAt": "2024-07-26T11:00:00",
      "scanCount": 5,
      "picletData": {
        // Full variation data
      }
    }
  ]
}
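
For readers working on the Python side, the record above could be typed roughly as follows. This is a sketch only: the class names and the next_variation_id helper are hypothetical, and the zero-padded numbering is inferred from the single "pillow_001" example rather than confirmed by the source.

```python
from typing import List, TypedDict


class PicletEntry(TypedDict, total=False):
    objectName: str        # shown on the canonical entry only
    attributes: List[str]  # shown on variation entries only
    typeId: str
    discoveredBy: str
    discoveredAt: str      # ISO 8601 timestamp
    scanCount: int
    picletData: dict       # full Piclet instance data


class ObjectRecord(TypedDict):
    canonical: PicletEntry
    variations: List[PicletEntry]


def next_variation_id(record: ObjectRecord) -> str:
    """Assumed scheme: <objectName>_<3-digit counter>, starting at 001."""
    name = record["canonical"]["objectName"]
    return f"{name}_{len(record['variations']) + 1:03d}"
```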

API Endpoints

The frontend only needs these 5 public endpoints:

1. generate_piclet (Scanner)

Complete Piclet generation workflow - the main endpoint.

Input:
- image: User's photo (File)
- hf_token: User's HuggingFace OAuth token (string)

Process:
1. Verifies hf_token → gets user info
2. Uses token to connect to JoyCaption → generates detailed image description
3. Uses token to call GPT-OSS-120B → generates Pokemon concept (object, variation, stats, description)
4. Parses concept to extract structured data
5. Uses token to call Flux-Schnell → generates Piclet image
6. Checks dataset for canonical/variation match
7. Saves to dataset with user attribution
8. Updates user profile (discoveries, rarity score)

Returns:

{
  "success": true,
  "piclet": {/* complete Piclet data */},
  "discoveryStatus": "new" | "variation" | "existing",
  "canonicalId": "pillow_canonical",
  "message": "Congratulations! You discovered the first pillow Piclet!"
}
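
One plausible way to derive the discoveryStatus value is sketched below. The meaning of each status is an assumption here: "new" when no canonical record exists yet, "variation" when a previously unseen variation is created, and "existing" when the scan matches something already stored. The function and parameter names are hypothetical.

```python
def discovery_status(canonical_exists: bool,
                     variation_label: str,
                     variation_known: bool) -> str:
    """Map the dataset lookup result to a discoveryStatus string.

    variation_label is the parsed "Variation" value, where the literal
    string "canonical" means the scan has no distinctive attribute.
    """
    if not canonical_exists:
        return "new"        # first scan of this object
    if variation_label.strip().lower() == "canonical" or variation_known:
        return "existing"   # already stored; only the scan count changes
    return "variation"      # new variation of a known object
```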

Security: Uses user's token to call AI services, consuming THEIR GPU quota (not the server's)

2. get_user_piclets (User Collection)

Get user's discovered Piclets and stats.

Input: hf_token (string)

Returns:

{
  "success": true,
  "piclets": [{/* list of discoveries */}],
  "stats": {
    "username": "...",
    "totalFinds": 42,
    "uniqueFinds": 15,
    "rarityScore": 1250
  }
}

3. get_object_details (Object Data)

Get complete object information (canonical + all variations).

Input: object_name (string, e.g., "pillow", "macbook")

Returns:

{
  "success": true,
  "objectName": "pillow",
  "canonical": {/* canonical data */},
  "variations": [{/* variation 1 */}, {/* variation 2 */}],
  "totalScans": 157,
  "variationCount": 8
}
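
The aggregate fields in this response could be computed from a stored record along these lines. This is a sketch that assumes totalScans is the canonical scan count plus all variation scan counts; the source does not state the formula, and summarize_object is a hypothetical name.

```python
def summarize_object(record: dict) -> dict:
    """Build a get_object_details-style payload from one object record."""
    canonical = record["canonical"]
    variations = record.get("variations", [])
    # Assumption: every scan is counted on exactly one entry.
    total_scans = canonical.get("scanCount", 0) + sum(
        v.get("scanCount", 0) for v in variations
    )
    return {
        "success": True,
        "objectName": canonical["objectName"],
        "canonical": canonical,
        "variations": variations,
        "totalScans": total_scans,
        "variationCount": len(variations),
    }
```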

4. get_recent_activity (Activity Feed)

Recent discoveries across all users.

Input: limit (int, default 20)
Returns: List of recent discoveries with timestamps

5. get_leaderboard (Top Users)

Top discoverers by rarity score.

Input: limit (int, default 10)
Returns: Ranked users with stats

Internal Functions (not exposed to frontend):

search_piclet(), create_canonical(), create_variation(), increment_scan_count() - Used internally by generate_piclet()

Rarity System

Scan count determines rarity:

Legendary: ≤ 5 scans
Epic: 6-20 scans
Rare: 21-50 scans
Uncommon: 51-100 scans
Common: > 100 scans
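
The tier boundaries above translate directly into a small lookup. The function name rarity_tier is hypothetical; the thresholds are the ones listed.

```python
def rarity_tier(scan_count: int) -> str:
    """Return the rarity tier for a Piclet given its scan count."""
    if scan_count <= 5:
        return "Legendary"
    if scan_count <= 20:
        return "Epic"
    if scan_count <= 50:
        return "Rare"
    if scan_count <= 100:
        return "Uncommon"
    return "Common"
```

Note that rarity falls as scan counts rise, so a Piclet's tier can change over time.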

Rarity scoring for leaderboard:

Canonical discovery: +100 points
Variation discovery: +50 points
Additional bonuses based on rarity tier

Authentication Strategy

Web UI Authentication:

Gradio auth protects web interface from casual access
Requires username="admin" and password from ADMIN_PASSWORD env var
Prevents random users from manually creating piclets via UI
Does NOT affect API access - programmatic clients bypass this

API-Level Authentication:

OAuth token verification for user attribution
Tokens verified via https://huggingface.co/oauth/userinfo
User profiles keyed by stable HF sub (user ID)
All discovery data is public (embracing open discovery)
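
A token check against the userinfo endpoint might look like the sketch below, using only the standard library. Assumptions: the token is sent as a Bearer header, the response contains the standard OpenID Connect claims sub and preferred_username, and the helper names are invented for this example. The fetch parameter exists so the function can be exercised without network access.

```python
import json
import urllib.request
from typing import Callable, Optional

USERINFO_URL = "https://huggingface.co/oauth/userinfo"


def build_userinfo_request(hf_token: str) -> urllib.request.Request:
    """Prepare the GET request carrying the user's OAuth token."""
    return urllib.request.Request(
        USERINFO_URL, headers={"Authorization": f"Bearer {hf_token}"}
    )


def fetch_json(request: urllib.request.Request) -> dict:
    """Perform the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def verify_token(
    hf_token: str,
    fetch: Callable[[urllib.request.Request], dict] = fetch_json,
) -> Optional[dict]:
    """Return {'sub', 'username'} for a valid token, else None."""
    if not hf_token:
        return None
    try:
        info = fetch(build_userinfo_request(hf_token))
    except OSError:  # URLError and HTTPError are OSError subclasses
        return None
    if "sub" not in info:
        return None
    return {"sub": info["sub"], "username": info.get("preferred_username")}
```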

Integration with Frontend

The frontend (../piclets/) uses these 5 simple API calls:

// Connect to server
const client = await window.gradioClient.Client.connect("Fraser/piclets-server");

// 1. Scanner - Generate complete Piclet (ONE CALL - server does everything!)
const scanResult = await client.predict("/generate_piclet", {
  image: imageFile,
  hf_token: userToken
});
const { success, piclet, discoveryStatus, message } = scanResult.data[0];

// 2. User Collection - Get user's Piclets + stats
const myPiclets = await client.predict("/get_user_piclets", {
  hf_token: userToken
});
const { piclets, stats } = myPiclets.data[0];

// 3. Object Details - Get object info (canonical + variations)
const objectInfo = await client.predict("/get_object_details", {
  object_name: "pillow"
});
const { canonical, variations, totalScans } = objectInfo.data[0];

// 4. Activity Feed - Get recent discoveries
const activity = await client.predict("/get_recent_activity", {
  limit: 20
});

// 5. Leaderboard - Get top users
const leaders = await client.predict("/get_leaderboard", {
  limit: 10
});

Why This Design?

Clean API: Only 5 endpoints, each with a clear purpose
Security: All AI orchestration happens server-side (can't be manipulated)
Simplicity: Frontend is pure UI, no complex orchestration logic
Fairness: Uses user's GPU quota, not server's
Reliability: Server handles retries and error recovery

Development

Local Testing

pip install -r requirements.txt
python app.py
# Access at http://localhost:7860

Deployment

Push to HuggingFace Space repository:

git add -A && git commit -m "Update" && git push

Environment Variables

HF_TOKEN: Required - HuggingFace write token for dataset operations (set in Space Secrets)
ADMIN_PASSWORD: Optional - Password for web UI access (set in Space Secrets)
DATASET_REPO: Optional - Target dataset repository (default: "Fraser/piclets")

Note: Users' hf_token (passed in API calls) is separate from server's HF_TOKEN (for dataset writes).
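
A startup check for these variables could be sketched as follows. ServerConfig and load_config are hypothetical names; only the variable names and the default dataset come from this file.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    hf_token: str                  # server write token for the dataset
    admin_password: Optional[str]  # None means no web UI password is set
    dataset_repo: str


def load_config(env: Mapping[str, str] = os.environ) -> ServerConfig:
    """Read settings from the environment, failing fast if HF_TOKEN is missing."""
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is required for dataset operations")
    return ServerConfig(
        hf_token=token,
        admin_password=env.get("ADMIN_PASSWORD") or None,
        dataset_repo=env.get("DATASET_REPO", "Fraser/piclets"),
    )
```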

Key Implementation Details

AI Service Integration

The server uses gradio_client to call external AI services with the user's token:

JoyCaption (fancyfeast/joy-caption-alpha-two): Detailed image captioning with brand/model recognition
GPT-OSS-120B (amd/gpt-oss-120b-chatbot): Concept generation and parsing
Flux-Schnell (black-forest-labs/FLUX.1-schnell): Anime-style Piclet image generation

Each service is called with the user's hf_token, consuming their GPU quota.

Concept Parsing

GPT-OSS generates structured markdown with sections:

Canonical Object (specific brand/model, not generic)
Variation (distinctive attribute or "canonical")
Object Rarity (determines tier)
Monster Name, Type, Stats
Physical Stats (height, weight)
Personality, Description
Monster Image Prompt

The parser uses regex to extract each section and clean the data.
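
The exact output format is not reproduced in this file, so the following is only a sketch of regex-based section extraction. It assumes each section starts with a markdown heading line (one or more # characters) and that cleaning means trimming whitespace and stripping emphasis markers; parse_sections and clean_value are hypothetical names.

```python
import re
from typing import Dict

# A heading line such as "# Canonical Object" or "## Monster Name".
HEADING = re.compile(r"^#+[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def clean_value(value: str) -> str:
    """Strip surrounding whitespace plus bold/italic/code markers."""
    return re.sub(r"[*`]", "", value).strip()


def parse_sections(markdown: str) -> Dict[str, str]:
    """Map lowercased heading text to the cleaned body that follows it."""
    sections: Dict[str, str] = {}
    matches = list(HEADING.finditer(markdown))
    for index, match in enumerate(matches):
        # A section body runs until the next heading or the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(markdown)
        sections[match.group(1).lower()] = clean_value(markdown[match.end():end])
    return sections
```

A caller would then read fields such as parse_sections(text)["canonical object"] and fall back to defaults when a section is missing.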

Variation Matching

Uses set intersection to find attribute overlap
An existing variation is treated as a match at 50% attribute overlap or more
Attributes are normalized and trimmed
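
The matching rule can be sketched with Python sets. The source does not say what the 50% is measured against, so this example assumes the overlap is divided by the size of the larger attribute set, and that normalization means lowercasing and trimming. All function names are hypothetical.

```python
from typing import Iterable, List, Optional, Set


def normalize_attributes(attributes: Iterable[str]) -> Set[str]:
    """Lowercase and trim each attribute, dropping empty entries."""
    return {a.strip().lower() for a in attributes if a.strip()}


def overlap_ratio(first: Iterable[str], second: Iterable[str]) -> float:
    """Shared attributes divided by the size of the larger set (assumed)."""
    left, right = normalize_attributes(first), normalize_attributes(second)
    if not left or not right:
        return 0.0
    return len(left & right) / max(len(left), len(right))


def find_matching_variation(
    scanned: Iterable[str], variations: List[dict], threshold: float = 0.5
) -> Optional[dict]:
    """Return the best-overlapping variation at or above the threshold."""
    scanned = list(scanned)
    best, best_ratio = None, 0.0
    for variation in variations:
        ratio = overlap_ratio(scanned, variation.get("attributes", []))
        if ratio >= threshold and ratio > best_ratio:
            best, best_ratio = variation, ratio
    return best
```

Under these assumptions, a scan with attributes "velvet" and "red" matches a stored variation with "velvet" and "blue" (one shared attribute out of two), while a scan with only "silk" does not.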

Caching Strategy

Local cache in cache/ directory
HuggingFace hub caching for downloads
Temporary files for uploads

Error Handling

Token verification before any operations
Graceful fallbacks for missing data
Default user profiles for new users
Try/except blocks around all operations
Detailed logging for debugging

Future Enhancements

Background Removal: Add server-side background removal (currently done on frontend)
Activity Log: Separate timeline file for better performance
Image Storage: Store Piclet images directly in dataset (currently stores URLs)
Badges/Achievements: Track discovery milestones
Trading System: Allow users to trade variations
Seasonal Events: Time-limited discoveries
Rate Limiting: Per-user rate limits to prevent abuse
Caching: Cache AI responses for identical images

Security Considerations

Token Verification: All operations verify HF OAuth tokens via https://huggingface.co/oauth/userinfo
User Attribution: Discoveries tracked by stable HF sub (user ID), not username
Fair GPU Usage: Users consume their own GPU quota, not server's
Public Data: All discovery data is public by design (embracing open discovery)
No Client Manipulation: AI orchestration happens server-side only
Input Validation: File uploads and token formats validated
No Sensitive Data: No passwords or private info stored
Future: Rate limiting per user to prevent abuse