Charlie81
/
ThinExperts
Safetensors
License:
mit
main
ThinExperts
Commit History
layer id init
4f3261b
Charlie81
commited on
Jun 9, 2025
args routers
4091143
Charlie81
commited on
Jun 9, 2025
corrected depth routing
dff49bf
Charlie81
commited on
Jun 9, 2025
2 new depth routing types
e868dbf
Charlie81
commited on
Jun 9, 2025
add different routing types
c085dea
Charlie81
commited on
Jun 9, 2025
set to multinomial top k
3aa53b4
Charlie81
commited on
Jun 9, 2025
change back to topk
4d16af6
Charlie81
commited on
Jun 9, 2025
remove experts override
858f5a5
Charlie81
commited on
Jun 9, 2025
multinomial2
72726ed
Charlie81
commited on
Jun 9, 2025
4multi
f1227e7
Charlie81
commited on
Jun 9, 2025
multinomial8
67ed347
Charlie81
commited on
Jun 9, 2025
ex1
4803c83
Charlie81
commited on
Jun 9, 2025
2 ex
0396d8f
Charlie81
commited on
Jun 9, 2025
4exp
63a872f
Charlie81
commited on
Jun 9, 2025
8exp
c784f88
Charlie81
commited on
Jun 9, 2025
16exp
df8b796
Charlie81
commited on
Jun 9, 2025
32 exp
6066d79
Charlie81
commited on
Jun 9, 2025
experts per token to 64
41d8418
Charlie81
commited on
Jun 9, 2025
fix imports
8dd44dc
Charlie81
commited on
Jun 9, 2025
safetensors index
5b40103
Charlie81
commited on
Jun 8, 2025
change name casings
dc61de6
Charlie81
commited on
Jun 8, 2025
all thing
518168c
Charlie81
commited on
Jun 8, 2025
classname
dce42d4
Charlie81
commited on
Jun 8, 2025
remove docstrings
38cf6f0
Charlie81
commited on
Jun 8, 2025
fix issues
7c1f019
Charlie81
commited on
Jun 8, 2025
total repo overhaul
39baffa
Charlie81
commited on
Jun 8, 2025
match transformers sparse block
2daadcc
Charlie81
commited on
Jun 7, 2025
Revert "refactor sparse"
7bf23fe
Charlie81
commited on
Jun 7, 2025
refactor sparse
170c7d7
Charlie81
commited on
Jun 7, 2025
debug
be9d959
Charlie81
commited on
Jun 7, 2025
revert variable assignment hidden states shape
3325c29
Charlie81
commited on
Jun 7, 2025
debug router shape
d121ac8
Charlie81
commited on
Jun 7, 2025
re-add debugs
d2be9a4
Charlie81
commited on
Jun 7, 2025
remove printfs
4db3f58
Charlie81
commited on
Jun 7, 2025
consistent dense moe routers
6e3bd26
Charlie81
commited on
Jun 7, 2025
add debugs and attempt fix
dcf107e
Charlie81
commited on
Jun 7, 2025
add debug
02557de
Charlie81
commited on
Jun 7, 2025
debug statements
dddc08e
Charlie81
commited on
Jun 7, 2025
remove load balancing comment
757a1cb
Charlie81
commited on
Jun 7, 2025
load balancig loss to modeling
aa9c31c
Charlie81
commited on
Jun 7, 2025
claude attempt 2
19f9418
Charlie81
commited on
Jun 7, 2025
attempt claude fixes
c03d64a
Charlie81
commited on
Jun 7, 2025
remove commented out
cc26fce
Charlie81
commited on
Jun 7, 2025
attempt attn forward fix
90dbea8
Charlie81
commited on
Jun 7, 2025
undo sparse router removal
2b07c4b
Charlie81
commited on
Jun 7, 2025
allign with transformers code
6481d24
Charlie81
commited on
Jun 7, 2025
rotary forward
404cc11
Charlie81
commited on
Jun 7, 2025
comment out generation mixin
36b7e2b
Charlie81
commited on
Jun 7, 2025
modeling rope
affa284
Charlie81
commited on
Jun 7, 2025
olmoe rotary embedding
96145ba
Charlie81
commited on
Jun 7, 2025