---
title: B2NL v6.2.1 - Byte-to-Natural Language Tokenizer
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.19.2
app_file: app.py
pinned: true
license: apache-2.0
models:
- ggunio/B2NL-IntelligentTokenizer-v6.2.1
---
# B2NL v6.2.1 - Byte-to-Natural Language Tokenizer
**Compress and reconstruct text with token boundaries**
⚠️ **IMPORTANT: Currently in AUTOREGRESSIVE MODE**
- Current: ~500ms inference (Teacher Forcing training)
- Coming Soon (November 2025): Non-autoregressive training (<50ms)
## What's New in v6.2.1
- Support for **204 languages** (up from 6)
- **16:1 fixed compression** ratio
- **Multi-Query Attention** (8x memory reduction)
- Model: [ggunio/B2NL-IntelligentTokenizer-v6.2.1](https://huggingface.co/ggunio/B2NL-IntelligentTokenizer-v6.2.1)
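
The figures in the list above can be sanity-checked with a little arithmetic. The sketch below is not the model's API. It assumes that "16:1" means 16 UTF-8 bytes per compressed token, with a final partial chunk counted as one full token, and that the 8x figure comes from 8 query heads sharing a single key/value head. The function names `compressed_token_count` and `kv_cache_ratio` are hypothetical.

```python
import math

# Assumption: "16:1" means 16 UTF-8 bytes are packed into one compressed token.
BYTES_PER_TOKEN = 16


def compressed_token_count(text: str, bytes_per_token: int = BYTES_PER_TOKEN) -> int:
    """Count compressed tokens for `text`, rounding a final partial chunk up."""
    n_bytes = len(text.encode("utf-8"))
    return math.ceil(n_bytes / bytes_per_token)


def kv_cache_ratio(num_query_heads: int, num_kv_heads: int = 1) -> float:
    """Ratio of key/value cache sizes: one K/V head per query head vs. shared K/V heads."""
    return num_query_heads / num_kv_heads


if __name__ == "__main__":
    sample = "Hello, world! 안녕하세요."
    n_bytes = len(sample.encode("utf-8"))
    print(n_bytes, "bytes ->", compressed_token_count(sample), "tokens")
    # Assumption: 8 query heads sharing one K/V head.
    print("KV cache reduction:", kv_cache_ratio(8))
```

Because the count works on bytes rather than characters, scripts that need several bytes per character in UTF-8 (such as Korean) use up the 16-byte budget in fewer characters than ASCII text does.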
## Author
**Jinhyun Woo**
- GitHub: [Woojiggun/intelligent-tokenizer](https://github.com/Woojiggun/intelligent-tokenizer)
- Paper: [Zenodo](https://zenodo.org/records/17116281)