tommulder
perf(attn): default to SDPA; gracefully fallback when flash_attn missing; use dtype arg
420a04f