Commit History

Remove cancel generation feature as it didn't work
efa3ae6

Luigi committed on

Restore proper cancel functionality
afa1066

Luigi committed on

Test stopping criteria effectiveness
f0d2ab7

Luigi committed on

Improve cancel generation responsiveness
48e50b8

Luigi committed on

Fix cancel generation to gracefully stop ongoing response generation
a73d8f4

Luigi committed on

remove qwen3 30b-a3b and qwen3 next 80b-a3b
4af617b

Luigi committed on

Remove AOT indication from UI duration estimate
f97cbfc

Luigi committed on

Remove AOT compilation completely and enable use_cache
426163f

Luigi committed on

Fix dynamic_shapes kwargs to match inputs structure for AOT compilation
273acf8

Luigi committed on

Fix AOT compilation dynamic_shapes to match expected arg names for torch.export.export
0a99dfc

Luigi committed on

Add dynamic GPU time estimate indicator to UI
fc989b4

Luigi committed on

Improve model size detection: replace ad-hoc string parsing with reliable params_b field in MODELS dict
ab92e0d

Luigi committed on

Add qwen 80b-a3b
8cdf3e1

Luigi committed on

Set better defaults for free-tier users: Qwen3-1.7B model, 1024 max tokens, search disabled
2cae073

Luigi committed on

Adjust duration estimation for H200 performance - reduce conservative estimates
de766da

Luigi committed on

Use actual parameter count for AOT decision instead of string matching
e3e334f

Luigi committed on

Make AOT compilation conditional for models >= 2B parameters to optimize free tier usage
4500f92

Luigi committed on

Add AOT compilation optimization for ZeroGPU acceleration
a7866ff

Luigi committed on

add 4 20b+ models after enabling dynamic gpu duration
fea2910
verified

Luigi committed on

Add dynamic duration calculation for ZeroGPU acceleration
6073cc2

Luigi committed on

make qwen-4b default
d3726c6
verified

Luigi committed on

disable two models that cannot run or run too slowly on hf spaces with zerogpu
3dc7ced

Luigi committed on

Update app.py
f1fa55c
verified

Luigi committed on

Update app.py
f07d6ab
verified

Luigi committed on

Update app.py
0992852
verified

Luigi committed on

add original apriel 15b
2b25033
verified

Luigi committed on

use apriel 8bit
a4681bd
verified

Luigi committed on

run Apriel on 4bit
2cadf8a
verified

Luigi committed on

Add Apriel-1.5-15b-Thinker
3665b54
verified

Luigi committed on

Update app.py
7f654b2
verified

Luigi committed on

Update app.py
15b78c7
verified

Luigi committed on

Update app.py
7356fa6
verified

Luigi committed on

add 4 models from qwen3 family
048cfc4
verified

Luigi committed on

add qwen3 32b awq
b9efb74
verified

Luigi committed on

Update app.py
5e03586
verified

Luigi committed on

Update app.py
e5a1663
verified

Luigi committed on

Update app.py
de64679
verified

Luigi committed on

Update app.py
4418827
verified

Luigi committed on

Update app.py
f2f4310
verified

Luigi committed on

Update app.py
42db70b
verified

Luigi committed on

feat(models): add Granite-4.0-Micro and Qwen3-4B-Instruct-2507 to MODELS registry
c30a7f7
verified

Luigi committed on

feat(models): Added three new models
3c22497
verified

Luigi committed on

add 5 models from liquid ai
8eefe94

Luigi committed on

add smollm2 135m multilingual
ac20174
verified

Luigi committed on

add parser_model_ner_gemma_v0 based on gemma 3 270m it
bc1bd75
verified

Luigi committed on

Add Gemma-3-270m Taiwan
74c66e7
verified

Luigi committed on

add gemma-3-270m-it
3995aec

Luigi committed on

remove previously added breeze models (as they didn't work), add smollm 135m taiwan
b3fd72e

Luigi committed on

add breeze models
88f3bc6

Luigi committed on

add 3 sub-1B TW models from ShengweiPeng
ddfffab

Luigi committed on