# Hugging Face Dataset Integration
The benchmark server can automatically upload results to a Hugging Face Dataset repository for centralized storage and sharing.
## Features
- **Automatic Upload**: Results are automatically pushed to HF Dataset when benchmarks complete
- **File Structure Preservation**: Uses the same path structure as local storage: `{task}/{org}/{model}/{params}.json` (see the sketch after this list)
- **JSON Format**: Results are stored as JSON (not JSONL) for better Dataset compatibility
- **Overwrite Strategy**: Each configuration gets a single file that is overwritten with the latest result
- **Error Tracking**: Failed benchmarks are also uploaded to track issues
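For illustration, here is a minimal TypeScript sketch of how the upload path can be derived from a result's metadata. The type and function names are hypothetical, not the server's actual code:

```ts
// Hypothetical helper: derive the upload path {task}/{org}/{model}/{params}.json
// from a result's metadata.
interface BenchmarkKey {
  task: string;    // e.g. "feature-extraction"
  modelId: string; // e.g. "Xenova/all-MiniLM-L6-v2"
  params: string;  // e.g. "node_warm_cpu_fp32_b1"
}

function resultPath({ task, modelId, params }: BenchmarkKey): string {
  // Model IDs are "{org}/{model}", so splitting on "/" yields both parts.
  const [org, model] = modelId.split("/");
  return `${task}/${org}/${model}/${params}.json`;
}

// resultPath({ task: "feature-extraction", modelId: "Xenova/all-MiniLM-L6-v2",
//              params: "node_warm_cpu_fp32_b1" })
// => "feature-extraction/Xenova/all-MiniLM-L6-v2/node_warm_cpu_fp32_b1.json"
```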
## Setup
### 1. Create a Hugging Face Dataset
1. Go to https://huggingface.co/new-dataset
2. Create a new dataset (e.g., `username/transformersjs-benchmark-results`)
3. Keep it public or private based on your needs
### 2. Get Your HF Token
1. Go to https://huggingface.co/settings/tokens
2. Create a new token with `write` permissions
3. Copy the token
### 3. Configure Environment Variables
Create or update `.env` file in the `bench` directory:
```bash
# Hugging Face Dataset Configuration
HF_DATASET_REPO=whitphx/transformersjs-performance-leaderboard-results-dev
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional: Local storage directory
BENCHMARK_RESULTS_DIR=./benchmark-results
# Optional: Server port
PORT=7860
```
**Important**: Never commit `.env` to git. It's already in `.gitignore`.
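The environment variable names match the `.env` keys above; the startup check itself is sketched here for illustration, assuming upload is enabled only when both values are present:

```ts
// Illustrative sketch of the startup check (not the server's actual code).
const repo = process.env.HF_DATASET_REPO;
const token = process.env.HF_TOKEN;

// Upload is enabled only when both the repo and a write token are set.
const uploadEnabled = Boolean(repo && token);
if (uploadEnabled) {
  console.log(`🤗 HF Dataset upload enabled: ${repo}`);
} else {
  console.log("🤗 HF Dataset upload disabled (set HF_DATASET_REPO and HF_TOKEN to enable)");
}
```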
## Usage
Once configured, the server will automatically upload results:
```bash
# Start the server
npm run server
# You should see:
# 🤗 HF Dataset upload enabled: username/transformersjs-benchmark-results
```
When benchmarks complete, you'll see:
```
✅ Completed: abc-123 in 5.2s
✅ Benchmark abc-123 saved to file
✅ Uploaded to HF Dataset: feature-extraction/Xenova/all-MiniLM-L6-v2/node_warm_cpu_fp32_b1.json
```
## File Structure in HF Dataset
The dataset will have the same structure as local storage:
```
feature-extraction/
└── Xenova/
    ├── all-MiniLM-L6-v2/
    │   ├── node_warm_cpu_fp32_b1.json
    │   ├── node_warm_webgpu_fp16_b1.json
    │   └── web_warm_wasm_b1_chromium.json
    └── distilbert-base-uncased/
        └── node_warm_cpu_fp32_b1.json
text-classification/
└── Xenova/
    └── distilbert-base-uncased/
        └── node_warm_cpu_fp32_b1.json
```
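Because the layout is deterministic, any individual result can be fetched directly over HTTP via the standard resolve URL for dataset repos. A minimal sketch, with the repo and path taken from the examples in this document:

```ts
// Fetch one result file from the dataset. The resolve URL pattern is the
// standard one for Hugging Face repos; repo and path are example values.
const repo = "whitphx/transformersjs-performance-leaderboard-results-dev";
const path = "feature-extraction/Xenova/all-MiniLM-L6-v2/node_warm_cpu_fp32_b1.json";

const res = await fetch(`https://huggingface.co/datasets/${repo}/resolve/main/${path}`, {
  // The Authorization header is needed only if the dataset is private.
  headers: process.env.HF_TOKEN ? { Authorization: `Bearer ${process.env.HF_TOKEN}` } : {},
});
if (!res.ok) throw new Error(`Fetch failed: ${res.status}`);
const result = await res.json();
console.log(result.status, result.modelId);
```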
## JSON Format
Each file contains a single benchmark result (not multiple runs):
```json
{
"id": "abc-123-456",
"platform": "node",
"modelId": "Xenova/all-MiniLM-L6-v2",
"task": "feature-extraction",
"mode": "warm",
"repeats": 3,
"dtype": "fp32",
"batchSize": 1,
"device": "cpu",
"timestamp": 1234567890,
"status": "completed",
"result": {
"metrics": { ... },
"environment": { ... }
}
}
```
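For TypeScript consumers, the example above corresponds roughly to the following shape. This is a hand-transcribed sketch, not a type exported by the server; the `metrics` and `environment` fields are elided in the docs, so they are left loose here:

```ts
// Shape of a stored result, transcribed from the JSON example above.
interface StoredBenchmarkResult {
  id: string;
  platform: string;   // e.g. "node" or "web"
  modelId: string;
  task: string;
  mode: string;       // e.g. "warm"
  repeats: number;
  dtype: string;      // e.g. "fp32"
  batchSize: number;
  device: string;     // e.g. "cpu"
  timestamp: number;  // Unix epoch milliseconds
  status: "completed" | "failed";
  error?: string;     // present on failed runs (see below)
  result: {
    metrics?: Record<string, unknown>;
    environment?: Record<string, unknown>;
    error?: { type: string; message: string; stage: string };
  };
}
```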
## Behavior
### Overwriting Results
- Each benchmark configuration maps to a single file
- New results **overwrite** the existing file
- Only the **latest** result is kept per configuration
- This ensures the dataset always has current data
### Local vs Remote Storage
- **Local (JSONL)**: Keeps history of all runs (append-only)
- **Remote (JSON)**: Keeps only latest result (overwrite)
This dual approach allows:
- Local: Full history for analysis
- Remote: Clean, current results for leaderboards
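As a sketch of the local half of this dual write (names are hypothetical): each run is appended as one line to a `.jsonl` file, while the remote side serializes the same result to plain JSON and uploads it to the path shown earlier, overwriting whatever was there.

```ts
import { appendFile, mkdir } from "node:fs/promises";
import { dirname, join } from "node:path";

// Hypothetical helper: append one run to the local JSONL history.
async function appendLocalHistory(result: unknown, relativePath: string): Promise<void> {
  const base = process.env.BENCHMARK_RESULTS_DIR ?? "./benchmark-results";
  // Local history lives in .jsonl files: one line per run, append-only.
  const jsonlPath = join(base, relativePath.replace(/\.json$/, ".jsonl"));
  await mkdir(dirname(jsonlPath), { recursive: true });
  await appendFile(jsonlPath, JSON.stringify(result) + "\n");
}
```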
### Failed Benchmarks
Failed benchmarks are also uploaded to track:
- Which models/configs have issues
- Error types (memory errors, etc.)
- Environmental context
Example failed result:
```json
{
"id": "def-456-789",
"status": "failed",
"error": "Benchmark failed with code 1: ...",
"result": {
"error": {
"type": "memory_error",
"message": "Aborted(). Build with -sASSERTIONS for more info.",
"stage": "load"
},
"environment": { ... }
}
}
```
## Git Commits
Each upload creates a git commit in the dataset with:
```
Update benchmark: Xenova/all-MiniLM-L6-v2 (node/feature-extraction)
Benchmark ID: abc-123-456
Status: completed
Timestamp: 2025-10-13T06:48:57.481Z
```
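A commit like this could be produced with `@huggingface/hub`'s `uploadFile`, roughly as sketched below. The option names are from that library's API, but treat the details as an assumption and check the version you use; `result` is a placeholder value:

```ts
import { uploadFile } from "@huggingface/hub";

// Placeholder result; in the server this would be the completed benchmark.
const result = { id: "abc-123-456", status: "completed", timestamp: 1760338137481 };
const path = "feature-extraction/Xenova/all-MiniLM-L6-v2/node_warm_cpu_fp32_b1.json";

await uploadFile({
  repo: { type: "dataset", name: process.env.HF_DATASET_REPO! },
  accessToken: process.env.HF_TOKEN,
  // Uploading to an existing path overwrites the file, which is what gives
  // the "latest result per configuration" behavior described above.
  file: { path, content: new Blob([JSON.stringify(result, null, 2)]) },
  commitTitle: "Update benchmark: Xenova/all-MiniLM-L6-v2 (node/feature-extraction)",
  commitDescription: [
    `Benchmark ID: ${result.id}`,
    `Status: ${result.status}`,
    `Timestamp: ${new Date(result.timestamp).toISOString()}`,
  ].join("\n"),
});
```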
## Disabling Upload
To disable HF Dataset upload:
1. Remove `HF_TOKEN` from `.env`, or
2. Remove both `HF_DATASET_REPO` and `HF_TOKEN`
The server will show:
```
🤗 HF Dataset upload disabled (set HF_DATASET_REPO and HF_TOKEN to enable)
```
## Error Handling
If HF upload fails:
- The error is logged but doesn't fail the benchmark
- Local storage still succeeds
- You can retry manually or fix configuration
Example error:
```
❌ Failed to upload benchmark abc-123 to HF Dataset: Authentication failed
```
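The fire-and-forget pattern this implies looks roughly like the sketch below, where `uploadResultToDataset` is a hypothetical wrapper around the upload call:

```ts
// Hypothetical wrapper around the actual upload call (e.g. uploadFile).
declare function uploadResultToDataset(result: { id: string }, path: string): Promise<void>;

async function tryUpload(result: { id: string }, path: string): Promise<void> {
  try {
    await uploadResultToDataset(result, path);
    console.log(`✅ Uploaded to HF Dataset: ${path}`);
  } catch (err) {
    // Logged but not rethrown: local storage already succeeded, so the
    // benchmark still completes; retry later or fix the configuration.
    console.error(`❌ Failed to upload benchmark ${result.id} to HF Dataset:`, err);
  }
}
```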
## API Endpoint (Future)
Currently uploads happen automatically. In the future, we could add:
```bash
# Manually trigger upload of a specific result
POST /api/benchmark/:id/upload
# Re-upload all local results to HF Dataset
POST /api/benchmarks/sync
```
## Development vs Production
Use different dataset repositories for development and production:
**Development** (`.env`):
```bash
HF_DATASET_REPO=whitphx/transformersjs-performance-leaderboard-results-dev
```
**Production** (deployed environment):
```bash
HF_DATASET_REPO=whitphx/transformersjs-performance-leaderboard-results
```
This allows testing without polluting production data.