<div style="border: 1px solid #e2e8f0; border-radius: 8px; background: white; margin: 1.5rem 0;">
    <div style="padding: 1rem; border-bottom: 1px solid #e2e8f0; background: #f8f9fa;">
        <h4 style="margin: 0 0 0.5rem 0; color: #495057;">🚀 CUDA Warmup Efficiency Benchmark</h4>
        <p style="margin: 0; font-size: 0.9em; color: #6c757d;">
            Benchmark CUDA warmup with real Transformers models and measure the performance impact of the <code>caching_allocator_warmup</code> function.
        </p>
    </div>
    
    <div style="padding: 1rem;">
        <iframe src="https://molbap-cuda-warmup-transformers.hf.space" width="100%" height="800px" frameborder="0" style="border-radius: 8px; background: white;"></iframe>
    </div>
    
    <div style="padding: 1rem; border-top: 1px solid #e2e8f0; background: #f8f9fa; font-size: 0.9em; color: #6c757d;">
        Measures the performance impact of the <code>caching_allocator_warmup</code> function at <code>transformers/src/transformers/modeling_utils.py:6186</code>. The tool loads each model twice, once with warmup disabled and once with warmup enabled, to show the difference in loading time. A minimal local sketch of the same comparison follows below.
    </div>
</div>
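
The same comparison can be reproduced outside the Space. Below is a minimal sketch, not the Space's actual code: it times `from_pretrained` once with the default warmup and once with `caching_allocator_warmup` patched to a no-op. It assumes a CUDA GPU, uses `gpt2` purely as a placeholder model ID, and assumes the loader does not consume the function's return value; patching is used because a public switch to disable the warmup is not assumed here.

```python
import time
from unittest.mock import patch

import torch
from transformers import AutoModelForCausalLM

MODEL_ID = "gpt2"  # placeholder; substitute the model you actually want to measure


def timed_load(label: str) -> None:
    """Load the model onto the GPU and report wall-clock loading time."""
    torch.cuda.empty_cache()
    torch.cuda.synchronize()
    start = time.perf_counter()
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="cuda")
    torch.cuda.synchronize()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    del model
    torch.cuda.empty_cache()


# Baseline: neutralize the warmup by patching it to a no-op
# (assumes the loader ignores its return value).
with patch("transformers.modeling_utils.caching_allocator_warmup",
           new=lambda *args, **kwargs: None):
    timed_load("warmup disabled")

# Default behavior: warmup enabled.
timed_load("warmup enabled")
```

For a fair comparison, run the two cases in separate processes, since the CUDA caching allocator retains state within a single process and the second load would otherwise benefit from the first.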