Commit History

layer id init
4f3261b

Charlie81 committed on

args routers
4091143

Charlie81 committed on

corrected depth routing
dff49bf

Charlie81 committed on

2 new depth routing types
e868dbf

Charlie81 committed on

add different routing types
c085dea

Charlie81 committed on

set to multinomial top k
3aa53b4

Charlie81 committed on

change back to topk
4d16af6

Charlie81 committed on

remove experts override
858f5a5

Charlie81 committed on

multinomial2
72726ed

Charlie81 committed on

multinomial8
67ed347

Charlie81 committed on

experts per token to 64
41d8418

Charlie81 committed on

fix imports
8dd44dc

Charlie81 committed on

safetensors index
5b40103

Charlie81 committed on

change name casings
dc61de6

Charlie81 committed on

all thing
518168c

Charlie81 committed on

classname
dce42d4

Charlie81 committed on

remove docstrings
38cf6f0

Charlie81 committed on

fix issues
7c1f019

Charlie81 committed on

total repo overhaul
39baffa

Charlie81 committed on

match transformers sparse block
2daadcc

Charlie81 committed on

Revert "refactor sparse"
7bf23fe

Charlie81 committed on

refactor sparse
170c7d7

Charlie81 committed on

revert variable assignment hidden states shape
3325c29

Charlie81 committed on

debug router shape
d121ac8

Charlie81 committed on

re-add debugs
d2be9a4

Charlie81 committed on

remove printfs
4db3f58

Charlie81 committed on

consistent dense moe routers
6e3bd26

Charlie81 committed on

add debugs and attempt fix
dcf107e

Charlie81 committed on

add debug
02557de

Charlie81 committed on

debug statements
dddc08e

Charlie81 committed on

remove load balancing comment
757a1cb

Charlie81 committed on

load balancig loss to modeling
aa9c31c

Charlie81 committed on

claude attempt 2
19f9418

Charlie81 committed on

attempt claude fixes
c03d64a

Charlie81 committed on

remove commented out
cc26fce

Charlie81 committed on

attempt attn forward fix
90dbea8

Charlie81 committed on

undo sparse router removal
2b07c4b

Charlie81 committed on

allign with transformers code
6481d24

Charlie81 committed on

rotary forward
404cc11

Charlie81 committed on

comment out generation mixin
36b7e2b

Charlie81 committed on

modeling rope
affa284

Charlie81 committed on

olmoe rotary embedding
96145ba

Charlie81 committed on