sglang-EAGLE3-Qwen3-235B-A22B

Model Introduction

This EAGLE3 draft model (roughly 1B parameters) was trained with the SpecForge framework for the Qwen3-235B-A22B target model, using a combination of the UltraChat and ShareGPT datasets.

Benchmark Results

  • gsm8k (200 questions)
    • Output throughput: 224.168 token/s
    • Accept length: 3.538

  • mtbench (80 questions)
    • Output throughput: 241.5 token/s
    • Accept length: 3.019
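
As a rough interpretation, the accept length is the average number of tokens accepted per target-model verification pass, so it upper-bounds the per-step speedup over plain autoregressive decoding once draft-model overhead is accounted for. A minimal sketch of that arithmetic in Python, where the draft overhead fraction is a hypothetical illustrative value, not a measurement from this model card:

# Back-of-the-envelope speedup implied by the accept length.
# Each target-model verification pass accepts `accept_length` tokens on
# average; `draft_overhead` is an assumed fraction of target-model time
# spent running the draft model, chosen only for illustration.
def estimated_speedup(accept_length: float, draft_overhead: float = 0.2) -> float:
    return accept_length / (1.0 + draft_overhead)

for bench, accept_length in [("gsm8k", 3.538), ("mtbench", 3.019)]:
    print(f"{bench}: ~{estimated_speedup(accept_length):.2f}x vs. plain autoregressive decoding")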

Usage

You can serve this EAGLE3 draft model with SGLang using the following command:

python3 -m sglang.launch_server \
    --model <Qwen3-235B-A22B> \
    --speculative-algorithm EAGLE3 \
    --speculative-draft-model-path <EAGLE3-Qwen3-235B-A22B> \
    --speculative-num-steps 5 \
    --speculative-eagle-topk 8 \
    --speculative-num-draft-tokens 32 \
    --mem-fraction-static 0.75 \
    --tp 8 \
    --enable-ep-moe \
    --context-length 8192 \
    --trust-remote-code \
    --host 0.0.0.0 \
    --port 30000 \
    --dtype bfloat16
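
Once the server is up, requests go through SGLang's OpenAI-compatible API on the port configured above. A minimal client sketch, assuming the server is reachable at localhost:30000; the model name in the payload is a placeholder and should match whatever name your server reports:

import requests

# Query the OpenAI-compatible chat endpoint exposed by sglang.launch_server.
response = requests.post(
    "http://localhost:30000/v1/chat/completions",
    json={
        "model": "Qwen3-235B-A22B",  # placeholder; use the served model name
        "messages": [
            {"role": "user", "content": "Explain speculative decoding in one sentence."}
        ],
        "max_tokens": 128,
    },
    timeout=120,
)
print(response.json()["choices"][0]["message"]["content"])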