Commit
·
a23b7ec
1
Parent(s):
ec31c3e
update root html for better explains
documentation.html +40 -26
documentation.html
CHANGED
|
@@ -99,13 +99,26 @@
|
|
| 99 |
</head>
|
| 100 |
<body>
|
| 101 |
<div class="header">
|
| 102 |
-
<h1
|
| 103 |
<p class="muted"><strong>AI Music Generation API</strong> • Real-time streaming • Custom fine-tune support</p>
|
| 104 |
<span class="badge">Research Project</span>
|
| 105 |
</div>
|
| 106 |
|
| 107 |
<section id="env-vars" style="margin-top: 24px;">
|
| 108 |
-
<h3
|
| 109 |
<p>
|
| 110 |
You can boot this Space directly into your own finetune by setting the variables below in
|
| 111 |
<em>Settings → Variables and secrets → Variables</em>. If you don't set them, you can still
|
|
@@ -182,7 +195,7 @@
|
|
| 182 |
</p>
|
| 183 |
|
| 184 |
<div class="demo-placeholder">
|
| 185 |
-
<h3
|
| 186 |
<video controls preload="metadata" playsinline style="width:100%; border-radius:8px; max-width:540px; display:block; margin:0 auto">
|
| 187 |
<source src="./lil_demo_540p.mp4" type="video/mp4">
|
| 188 |
Your browser does not support the video tag.
|
|
@@ -191,19 +204,15 @@
|
|
| 191 |
</div>
|
| 192 |
|
| 193 |
<div class="section">
|
| 194 |
-
<h2>
|
| 195 |
<p>This API powers AI music generation using Google's MagentaRT, designed for real-time audio streaming using finetunes hosted on HF. Built for iOS app integration with WebSocket streaming support.</p>
|
| 196 |
-
|
| 197 |
-
<div class="info">
|
| 198 |
-
<strong>Hardware Requirements:</strong> Optimal performance requires an L40S GPU (48GB VRAM) for real-time streaming. L4 24GB almost works but will not achieve real-time performance (if someone knows an optimization that will solve this, please let me know).
|
| 199 |
-
</div>
|
| 200 |
</div>
|
| 201 |
|
| 202 |
<div class="section">
|
| 203 |
-
<h2>
|
| 204 |
<p>Connect to <code>wss://&lt;your-space&gt;/ws/jam</code> for real-time audio generation:</p>
|
| 205 |
|
| 206 |
-
<h3>
|
| 207 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
|
| 208 |
"type": "start",
|
| 209 |
"mode": "rt",
|
|
@@ -221,7 +230,7 @@
|
|
| 221 |
}
|
| 222 |
}</pre>
|
| 223 |
|
| 224 |
-
<h3>
|
| 225 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
|
| 226 |
"type": "update",
|
| 227 |
"styles": "jazz, hiphop",
|
|
@@ -233,12 +242,12 @@
|
|
| 233 |
"centroid_weights": "0.1, 0.3, 0.0"
|
| 234 |
}</pre>
|
| 235 |
|
| 236 |
-
<h3>
|
| 237 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{"type": "stop"}</pre>
|
| 238 |
</div>
|
| 239 |
|
| 240 |
<div class="section">
|
| 241 |
-
<h2>API
|
| 242 |
|
| 243 |
<div class="endpoint">
|
| 244 |
<strong>POST /generate</strong> - Generate 4–8 bars of music with input audio
|
|
@@ -274,14 +283,14 @@
|
|
| 274 |
</div>
|
| 275 |
|
| 276 |
<div class="section">
|
| 277 |
-
<h2>
|
| 278 |
<p>Train your own MagentaRT models and use them with this API and the iOS app.</p>
|
| 279 |
|
| 280 |
<div class="grid">
|
| 281 |
<div class="card">
|
| 282 |
-
<h3>1.
|
| 283 |
<p>Use the official MagentaRT fine-tuning notebook:</p>
|
| 284 |
-
<p><a href="https://github.com/magenta-realtime/notebooks/blob/main/Magenta_RT_Finetune.ipynb" target="_blank"
|
| 285 |
<p>This will create checkpoint folders like:</p>
|
| 286 |
<ul>
|
| 287 |
<li><code>checkpoint_1861001/</code></li>
|
|
@@ -291,7 +300,7 @@
|
|
| 291 |
</div>
|
| 292 |
|
| 293 |
<div class="card">
|
| 294 |
-
<h3>2.
|
| 295 |
<p>Checkpoints must be compressed as .tgz files to preserve .zarray files correctly.</p>
|
| 296 |
<div class="warning">
|
| 297 |
<strong>Important:</strong> Do not download checkpoint folders directly from Google Drive - the .zarray files won't transfer properly.
|
|
@@ -299,7 +308,7 @@
|
|
| 299 |
</div>
|
| 300 |
</div>
|
| 301 |
|
| 302 |
-
<h3>
|
| 303 |
<p>Use this in a Colab cell to properly package your checkpoints:</p>
|
| 304 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button># Mount Drive to access your trained checkpoints
|
| 305 |
from google.colab import drive
|
|
@@ -325,7 +334,7 @@ CKPT_SRC = '/content/drive/MyDrive/thepatch/checkpoint_1862001' # Adjust path
|
|
| 325 |
from google.colab import files
|
| 326 |
files.download('/content/checkpoint_1862001.tgz')</pre>
|
| 327 |
|
| 328 |
-
<h3>3.
|
| 329 |
<p>Create a model repository and upload:</p>
|
| 330 |
<ul>
|
| 331 |
<li>Your <code>.tgz</code> checkpoint files</li>
|
|
@@ -338,12 +347,12 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
|
|
| 338 |
Shows the correct file structure with .tgz files and .npy steering assets in the root directory.
|
| 339 |
</div>
|
| 340 |
|
| 341 |
-
<h3>4.
|
| 342 |
<p>In the iOS app's model selector, point to your Hugging Face repository URL. The app will automatically discover available checkpoints and allow switching between them.</p>
|
| 343 |
</div>
|
| 344 |
|
| 345 |
<div class="section">
|
| 346 |
-
<h2>
|
| 347 |
<ul>
|
| 348 |
<li><strong>Audio Format:</strong> 48 kHz stereo, ~2.0s chunks with ~40ms crossfade</li>
|
| 349 |
<li><strong>Model Sizes:</strong> Base and Large variants available</li>
|
|
@@ -358,7 +367,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
|
|
| 358 |
</div>
|
| 359 |
|
| 360 |
<div class="section">
|
| 361 |
-
<h2>
|
| 362 |
<p>This API is designed to work seamlessly with our iOS music generation app:</p>
|
| 363 |
<ul>
|
| 364 |
<li>Real-time audio streaming via WebSockets</li>
|
|
@@ -369,7 +378,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
|
|
| 369 |
</div>
|
| 370 |
|
| 371 |
<div class="section">
|
| 372 |
-
<h2>
|
| 373 |
<p>To run your own instance:</p>
|
| 374 |
<ol>
|
| 375 |
<li>Duplicate this Hugging Face Space</li>
|
|
@@ -380,7 +389,7 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
|
|
| 380 |
</div>
|
| 381 |
|
| 382 |
<div class="section">
|
| 383 |
-
<h2>
|
| 384 |
<p>This is an active research project. For questions, technical support, or collaboration:</p>
|
| 385 |
<p><strong>Email:</strong> <a href="mailto:kev@thecollabagepatch.com">kev@thecollabagepatch.com</a></p>
|
| 386 |
|
|
@@ -390,9 +399,14 @@ files.download('/content/checkpoint_1862001.tgz')</pre>
|
|
| 390 |
</div>
|
| 391 |
|
| 392 |
<div class="section">
|
| 393 |
-
<h2>
|
| 394 |
<p>Built on Google's MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for their generated outputs and ensuring compliance with applicable laws and platform policies.</p>
|
| 395 |
-
<p><a href="/docs"
|
| 396 |
</div>
|
| 397 |
|
| 398 |
<script>
|
|
|
|
| 99 |
</head>
|
| 100 |
<body>
|
| 101 |
<div class="header">
|
| 102 |
+
<h1>MagentaRT Research API</h1>
|
| 103 |
<p class="muted"><strong>AI Music Generation API</strong> • Real-time streaming • Custom fine-tune support</p>
|
| 104 |
<span class="badge">Research Project</span>
|
| 105 |
</div>
|
| 106 |
|
| 107 |
+
<div class="section">
|
| 108 |
+
<h2>what this is</h2>
|
| 109 |
+
<p>This API serves Google's <a href="https://huggingface.co/google/magenta-realtime" target="_blank">MagentaRT</a> in two distinct ways. First, as a backend for our iOS app (the untitled jamming app) where users create initial loops with Stability AI's <a href="https://huggingface.co/stabilityai/stable-audio-open-small" target="_blank">stable-audio-open-small</a> and then MagentaRT jams on top of that audio context. Second, as a standalone web interface that connects directly to MagentaRT via WebSockets without any audio context.</p>
|
| 110 |
+
|
| 111 |
+
<p>Both modes support switching between base models and custom fine-tunes hosted on Hugging Face. This is designed as a template space for duplication, letting you experiment with real-time music generation outside of Google Colab.</p>
|
| 112 |
+
|
| 113 |
+
<p>This is meant to be duplicated to your own GPU-enabled space since the iOS app is still in active development and doesn't have funding to support multiple concurrent users yet.</p>
|
| 114 |
+
|
| 115 |
+
<div class="info">
|
| 116 |
+
<strong>Hardware Requirements:</strong> Optimal performance requires an L40S GPU (48GB VRAM) for real-time streaming. L4 24GB almost works but will not achieve real-time performance (if someone knows an optimization that will solve this, please let me know).
|
| 117 |
+
</div>
|
| 118 |
+
</div>
|
| 119 |
+
|
| 120 |
<section id="env-vars" style="margin-top: 24px;">
|
| 121 |
+
<h3>environment variables (optional, but helpful)</h3>
|
| 122 |
<p>
|
| 123 |
You can boot this Space directly into your own finetune by setting the variables below in
|
| 124 |
<em>Settings → Variables and secrets → Variables</em>. If you don't set them, you can still
|
|
|
|
| 195 |
</p>
|
| 196 |
|
| 197 |
<div class="demo-placeholder">
|
| 198 |
+
<h3>app demo video</h3>
|
| 199 |
<video controls preload="metadata" playsinline style="width:100%; border-radius:8px; max-width:540px; display:block; margin:0 auto">
|
| 200 |
<source src="./lil_demo_540p.mp4" type="video/mp4">
|
| 201 |
Your browser does not support the video tag.
|
|
|
|
| 204 |
</div>
|
| 205 |
|
| 206 |
<div class="section">
|
| 207 |
+
<h2>overview</h2>
|
| 208 |
<p>This API powers AI music generation using Google's MagentaRT, designed for real-time audio streaming using finetunes hosted on HF. Built for iOS app integration with WebSocket streaming support.</p>
|
| 209 |
</div>
|
| 210 |
|
| 211 |
<div class="section">
|
| 212 |
+
<h2>quick start - WebSocket streaming</h2>
|
| 213 |
<p>Connect to <code>wss://&lt;your-space&gt;/ws/jam</code> for real-time audio generation:</p>
|
| 214 |
|
| 215 |
+
<h3>start real-time generation</h3>
|
| 216 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
|
| 217 |
"type": "start",
|
| 218 |
"mode": "rt",
|
|
|
|
| 230 |
}
|
| 231 |
}</pre>
|
| 232 |
|
| 233 |
+
<h3>update parameters live</h3>
|
| 234 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{
|
| 235 |
"type": "update",
|
| 236 |
"styles": "jazz, hiphop",
|
|
|
|
| 242 |
"centroid_weights": "0.1, 0.3, 0.0"
|
| 243 |
}</pre>
|
| 244 |
|
| 245 |
+
<h3>stop generation</h3>
|
| 246 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button>{"type": "stop"}</pre>
|
| 247 |
</div>
|
| 248 |
|
| 249 |
<div class="section">
|
| 250 |
+
<h2>API endpoints</h2>
|
| 251 |
|
| 252 |
<div class="endpoint">
|
| 253 |
<strong>POST /generate</strong> - Generate 4–8 bars of music with input audio
|
|
|
|
| 283 |
</div>
|
| 284 |
|
| 285 |
<div class="section">
|
| 286 |
+
<h2>custom fine-tuning</h2>
|
| 287 |
<p>Train your own MagentaRT models and use them with this API and the iOS app.</p>
|
| 288 |
|
| 289 |
<div class="grid">
|
| 290 |
<div class="card">
|
| 291 |
+
<h3>1. train your model</h3>
|
| 292 |
<p>Use the official MagentaRT fine-tuning notebook:</p>
|
| 293 |
+
<p><a href="https://github.com/magenta-realtime/notebooks/blob/main/Magenta_RT_Finetune.ipynb" target="_blank">MagentaRT Fine-tuning Colab</a></p>
|
| 294 |
<p>This will create checkpoint folders like:</p>
|
| 295 |
<ul>
|
| 296 |
<li><code>checkpoint_1861001/</code></li>
|
|
|
|
| 300 |
</div>
|
| 301 |
|
| 302 |
<div class="card">
|
| 303 |
+
<h3>2. package checkpoints</h3>
|
| 304 |
<p>Checkpoints must be compressed as .tgz files to preserve .zarray files correctly.</p>
|
| 305 |
<div class="warning">
|
| 306 |
<strong>Important:</strong> Do not download checkpoint folders directly from Google Drive - the .zarray files won't transfer properly.
|
|
|
|
| 308 |
</div>
|
| 309 |
</div>
|
| 310 |
|
| 311 |
+
<h3>checkpoint packaging script</h3>
|
| 312 |
<p>Use this in a Colab cell to properly package your checkpoints:</p>
|
| 313 |
<pre><button class="copy-btn" onclick="copyCode(this)">Copy</button># Mount Drive to access your trained checkpoints
|
| 314 |
from google.colab import drive
|
|
|
|
| 334 |
from google.colab import files
|
| 335 |
files.download('/content/checkpoint_1862001.tgz')</pre>
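The middle of the packaging script is elided in this diff, so the following is only a sketch of the idea it describes: archive the checkpoint folder as a `.tgz`, then confirm that the hidden `.zarray` files made it into the archive. It uses Python's standard `tarfile` module; the function names `package_checkpoint` and `count_zarray_files` are hypothetical, and the paths in the usage note are placeholders to adjust.

```python
import tarfile
from pathlib import Path


def package_checkpoint(ckpt_dir, out_dir):
    """Compress a checkpoint folder into <out_dir>/<folder name>.tgz."""
    ckpt = Path(ckpt_dir)
    if not ckpt.is_dir():
        raise FileNotFoundError(f"checkpoint folder not found: {ckpt}")
    out_path = Path(out_dir) / f"{ckpt.name}.tgz"
    with tarfile.open(out_path, "w:gz") as tar:
        # tar.add recurses into the folder and includes dotfiles such as .zarray
        tar.add(ckpt, arcname=ckpt.name)
    return out_path


def count_zarray_files(tgz_path):
    """Count the .zarray entries in the archive as a sanity check."""
    with tarfile.open(tgz_path, "r:gz") as tar:
        return sum(1 for name in tar.getnames() if Path(name).name == ".zarray")
```

In Colab this might be called as `package_checkpoint('/content/drive/MyDrive/thepatch/checkpoint_1862001', '/content')`, followed by `count_zarray_files(...)` on the result. A count of zero would suggest the metadata files were lost before packaging.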
|
| 336 |
|
| 337 |
+
<h3>3. upload to hugging face</h3>
|
| 338 |
<p>Create a model repository and upload:</p>
|
| 339 |
<ul>
|
| 340 |
<li>Your <code>.tgz</code> checkpoint files</li>
|
|
|
|
| 347 |
Shows the correct file structure with .tgz files and .npy steering assets in the root directory.
|
| 348 |
</div>
|
| 349 |
|
| 350 |
+
<h3>4. use in the app</h3>
|
| 351 |
<p>In the iOS app's model selector, point to your Hugging Face repository URL. The app will automatically discover available checkpoints and allow switching between them.</p>
|
| 352 |
</div>
|
| 353 |
|
| 354 |
<div class="section">
|
| 355 |
+
<h2>technical specifications</h2>
|
| 356 |
<ul>
|
| 357 |
<li><strong>Audio Format:</strong> 48 kHz stereo, ~2.0s chunks with ~40ms crossfade</li>
|
| 358 |
<li><strong>Model Sizes:</strong> Base and Large variants available</li>
|
|
|
|
| 367 |
</div>
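The audio format figures translate directly into frame counts: at 48 kHz, a chunk of about 2.0 s is about 96,000 frames per channel, and a crossfade of about 40 ms is about 1,920 frames. The sketch below works through that arithmetic and shows one way to blend the overlap between two chunks. The linear fade curve is an assumption made for illustration; the document does not say which curve the server uses, and the function names are hypothetical.

```python
SAMPLE_RATE = 48_000        # Hz, from the technical specifications
CHUNK_SECONDS = 2.0         # approximate chunk length
CROSSFADE_SECONDS = 0.040   # approximate crossfade length


def frames(seconds, sample_rate=SAMPLE_RATE):
    """Convert a duration in seconds to a whole number of frames."""
    return round(seconds * sample_rate)


def linear_crossfade(tail, head):
    """Blend the end of one chunk (tail) into the start of the next (head).

    Both inputs are equal-length sequences of samples for one channel.
    A linear ramp is assumed here; the real server may use another curve.
    """
    n = len(tail)
    if n != len(head):
        raise ValueError("tail and head must have the same length")
    blended = []
    for i, (a, b) in enumerate(zip(tail, head)):
        t = (i + 1) / (n + 1)  # weight of the incoming chunk, strictly between 0 and 1
        blended.append((1 - t) * a + t * b)
    return blended


if __name__ == "__main__":
    print(frames(CHUNK_SECONDS), frames(CROSSFADE_SECONDS))
```

For stereo audio the same blend would be applied to each channel separately.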
|
| 368 |
|
| 369 |
<div class="section">
|
| 370 |
+
<h2>integration with iOS app</h2>
|
| 371 |
<p>This API is designed to work seamlessly with our iOS music generation app:</p>
|
| 372 |
<ul>
|
| 373 |
<li>Real-time audio streaming via WebSockets</li>
|
|
|
|
| 378 |
</div>
|
| 379 |
|
| 380 |
<div class="section">
|
| 381 |
+
<h2>deployment</h2>
|
| 382 |
<p>To run your own instance:</p>
|
| 383 |
<ol>
|
| 384 |
<li>Duplicate this Hugging Face Space</li>
|
|
|
|
| 389 |
</div>
|
| 390 |
|
| 391 |
<div class="section">
|
| 392 |
+
<h2>support & contact</h2>
|
| 393 |
<p>This is an active research project. For questions, technical support, or collaboration:</p>
|
| 394 |
<p><strong>Email:</strong> <a href="mailto:kev@thecollabagepatch.com">kev@thecollabagepatch.com</a></p>
|
| 395 |
|
|
|
|
| 399 |
</div>
|
| 400 |
|
| 401 |
<div class="section">
|
| 402 |
+
<h2>licensing</h2>
|
| 403 |
<p>Built on Google's MagentaRT (Apache 2.0 + CC-BY 4.0). Users are responsible for their generated outputs and ensuring compliance with applicable laws and platform policies.</p>
|
| 404 |
+
<p><a href="/docs">API Reference Documentation</a></p>
|
| 405 |
+
</div>
|
| 406 |
+
|
| 407 |
+
<div class="section">
|
| 408 |
+
<h2>contributors</h2>
|
| 409 |
+
<p>Kevin Griffing and Andrew Luck</p>
|
| 410 |
</div>
|
| 411 |
|
| 412 |
<script>
|