"""HTML content for the TuRTLe leaderboard."""
HEADER_HTML = """
<div align="center">
<img src='/gradio_api/file=logo_new.png' alt='TuRTLe Logo' width='220'/>
</div>
"""
NAV_BUTTONS_HTML = """
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css">
<script defer src="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/js/all.min.js"></script>
<div style="text-align: center; margin-bottom: 0px; margin-top: 0px;">
<a href="https://github.com/HPAI-BSC/TuRTLe" target="_blank" style="text-decoration: none; margin-right: 10px;">
<button style="background: #333; color: white; padding: 10px 14px; border-radius: 8px; border: none; font-size: 16px; cursor: pointer;">
GitHub Repo
</button>
</a>
<a href="http://arxiv.org/abs/2504.01986" target="_blank" style="text-decoration: none; margin-right: 10px;">
<button style="background: #b31b1b; color: white; padding: 10px 14px; border-radius: 8px; border: none; font-size: 16px; cursor: pointer;">
arXiv MLCAD 2025
</button>
</a>
<a href="mailto:hpai@bsc.es?subject=TuRTLe%20leaderboard%20new%20entry&body=Link%20to%20HuggingFace%20Model:" style="text-decoration: none;">
<button style="background: #00674F; color: white; padding: 10px 14px; border-radius: 8px; border: none; font-size: 16px; cursor: pointer;">
How to submit
</button>
</a>
<p style="margin-top: 15px;">If you have any inquiries or wish to collaborate:
<a href="mailto:hpai@bsc.es">hpai@bsc.es</a>
</p>
</div>
"""
INTRO_HTML = """
<div style=" margin-top:-10px !important;">
<p style="margin-bottom: 15px; text-align: start !important;">
Welcome to the TuRTLe Model Leaderboard! TuRTLe is a
<b>unified evaluation framework designed to systematically assess Large Language Models (LLMs) in RTL (Register-Transfer Level) generation</b>
for hardware design.
Evaluation criteria include <b>syntax correctness, functional accuracy, synthesizability, and post-synthesis quality</b>
(PPA: Power, Performance, Area). TuRTLe integrates multiple benchmarks to highlight strengths and weaknesses of available LLMs.
Use the filters below to explore different RTL benchmarks, simulators and models.
</p>
<p style="margin-top:10px; text-align:start !important;">
<span style="font-variant:small-caps; font-weight:bold;">UPDATE (OCT 2025):</span> Added <span>Hermes-4-14B</span>, <span>Qwen3-8B</span>, and <span>Seed-OSS-36B</span> to the leaderboard. Added the Other Models tab and moved previously listed models that will no longer be updated to it
</p>
<p style="margin-top:-6px; text-align:start !important;">
<span style="font-variant:small-caps; font-weight:bold;">UPDATE (SEPT 2025):</span> Added <span>gpt-oss-20b</span> and <span>gpt-oss-120b</span> to the leaderboard
</p>
<p style="margin-top:-6px; text-align:start !important;">
<span style="font-variant:small-caps; font-weight:bold;">UPDATE (JULY 2025):</span> Our TuRTLe paper was accepted to
<a href="https://mlcad.org/symposium/2025/" target="_blank">MLCAD 2025</a> in September (Santa Cruz, CA). We have also added Verilator as a new simulator alongside Icarus Verilog
</p>
<p style="margin-top: -6px; text-align: start !important;">
<span style="font-variant: small-caps; font-weight: bold;">UPDATE (JUNE 2025):</span> We have open-sourced our framework on GitHub and added 7 recent models, for a total of 40 base and instruct models and 5 RTL benchmarks
</p>
</div>
"""
LC_FOOTNOTE_HTML = """
<div id="lc-footnote" style="font-size: 13px; opacity: 0.6; margin-top: -5px; z-index:999; text-align: left;">
<span style="font-weight: 600; opacity: 1;">†</span>
<em>Line Completion</em> excludes "reasoning" models, since this task targets quick auto-completion<br/>
Additionally, for the <em>Line Completion</em> and <em>Code Completion</em> benchmarks we use the <b>Base</b> model variant (if available), and for <em>Spec-to-RTL</em> we use the <b>Instruct</b> model variant
</div>
"""
ABOUT_US_HTML = """
<div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ccc; border-radius: 10px;">
<div style="display: flex; justify-content: center; align-items: center; gap: 5%; margin-bottom: 20px;">
<img src='/gradio_api/file=hpai_logo_grad.png' alt='HPAI Group Logo' style="width: 45%;"/>
<img src='/gradio_api/file=bsc-logo.png' alt='BSC Logo' style="width: 25%;"/>
</div>
<p style="font-size: 16px; text-align: start;">
The <b>High-Performance Artificial Intelligence (HPAI)</b> group is part of the
<a href="https://bsc.es/" target="_blank">Barcelona Supercomputing Center (BSC)</a>.
This leaderboard is maintained by HPAI as part of our commitment to <b>open science</b>.
</p>
<ul style="font-size: 16px; margin-bottom: 20px; margin-top: 20px;">
<li><a href="https://hpai.bsc.es/" target="_blank">HPAI Website</a></li>
<li><a href="https://github.com/HPAI-BSC/" target="_blank">HPAI GitHub Organization Page</a></li>
<li><a href="https://huggingface.co/HPAI-BSC/" target="_blank">HPAI Hugging Face Organization Page</a></li>
</ul>
<p style="font-size: 16px; margin-top: 15px;">
Feel free to contact us:
</p>
<p style="font-size: 16px;">Email: <a href="mailto:hpai@bsc.es"><b>hpai@bsc.es</b></a></p>
</div>
"""
REFERENCES_HTML = """
<div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ccc; border-radius: 10px;">
<ul style="font-size: 16px; margin-bottom: 20px; margin-top: 20px;">
<li><a href="https://github.com/bigcode-project/bigcode-evaluation-harness" target="_blank">Code Generation LM Evaluation Harness</a></li>
<li>Williams, S. Icarus Verilog [Computer software]. <a href="https://github.com/steveicarus/iverilog" target="_blank">https://github.com/steveicarus/iverilog</a></li>
<li>Snyder, W., Wasson, P., Galbi, D., et al. Verilator [Computer software]. <a href="https://github.com/verilator/verilator" target="_blank">https://github.com/verilator/verilator</a></li>
<li>RTL-Repo: A. Allam and M. Shalan, "RTL-Repo: A benchmark for evaluating LLMs on large-scale RTL design projects," in 2024 IEEE LLM Aided Design Workshop (LAD). IEEE, 2024, pp. 1–5.</li>
<li>VeriGen: S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, and S. Garg, "VeriGen: A large language model for Verilog code generation," ACM Transactions on Design Automation of Electronic Systems, vol. 29, no. 3, pp. 1–31, 2024.</li>
<li>VerilogEval (I): M. Liu, N. Pinckney, B. Khailany, and H. Ren, "VerilogEval: Evaluating large language models for Verilog code generation," in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–8.</li>
<li>VerilogEval (II): N. Pinckney, C. Batten, M. Liu, H. Ren, and B. Khailany, "Revisiting VerilogEval: A Year of Improvements in Large-Language Models for Hardware Code Generation," ACM Transactions on Design Automation of Electronic Systems, Feb. 2025. https://doi.org/10.1145/3718088</li>
<li>RTLLM: Y. Lu, S. Liu, Q. Zhang, and Z. Xie, "RTLLM: An open-source benchmark for design RTL generation with large language model," in 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2024, pp. 722–727.</li>
</ul>
</div>
"""
OTHER_MODELS_HTML = """
<div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ccc; border-radius: 10px;">
<p style="font-size: 16px; text-align: start;">
These models were previously listed on the main leaderboard. They were evaluated with an earlier, potentially deprecated version of TuRTLe and will no longer be updated.
</p>
</div>
"""