mlabonne

Update README.md

27b199a verified 13 days ago

metadata

library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
language:
  - en
pipeline_tag: text-generation
tags:
  - liquid
  - lfm2
  - edge
base_model: LiquidAI/LFM2-350M

LFM2-350M-Math

Based on LFM2-350M, LFM2-350M-Math is a tiny reasoning model designed for tackling tricky math problems.

You can find more information about other task-specific models in this blog post.

📄 Model details

Generation parameters: We strongly recommend sampling with temperature=0.6, top_p=0.95, min_p=0.1, and repetition_penalty=1.05. These are sampling settings, so they have no effect if decoding is fully greedy.
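As a minimal sketch, the recommended settings could be passed to `generate()` in transformers as shown here. The repository id `LiquidAI/LFM2-350M-Math`, the `max_new_tokens` default, and the helper name `generate_answer` are assumptions for illustration and are not stated in this model card. `do_sample=True` is set because temperature, top_p, and min_p only apply when sampling is enabled.

```python
# Sampling settings recommended in this model card.
GENERATION_KWARGS = {
    "do_sample": True,  # required for temperature/top_p/min_p to take effect
    "temperature": 0.6,
    "top_p": 0.95,
    "min_p": 0.1,
    "repetition_penalty": 1.05,
}

MODEL_ID = "LiquidAI/LFM2-350M-Math"  # assumed repository id


def generate_answer(question: str, max_new_tokens: int = 2048) -> str:
    """Answer one question in a single turn (downloads the model on first use)."""
    # Imported lazily so the settings above can be inspected without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # No system prompt and a single user turn, as recommended.
    messages = [{"role": "user", "content": question}]
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    )
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **GENERATION_KWARGS
    )
    # Keep only the newly generated tokens, including the reasoning trace.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=False)
```

Calling `generate_answer("What is 12 * 13?")` would return the model's reasoning followed by its answer. Reasoning traces can be long, so the `max_new_tokens` budget may need tuning for a given device.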

System prompt: We recommend not using any system prompt.

Supported languages: English only.

Chat template: LFM2 uses a ChatML-like chat template as follows:

<|startoftext|><|im_start|>user
Find the sum of all integer bases $b>9$ for which $17_{b}$ is a divisor of $97_{b}$.<|im_end|>
<|im_start|>assistant
<|cot_start|>First, we need to convert $17_{b}$ and $97_{b}$ into base 10. [...]<|im_end|>

You can apply it automatically with the tokenizer's .apply_chat_template() method from Hugging Face transformers.

⚠️ The model is intended for single-turn conversations.
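To make the template above concrete, here is a hand-written sketch of the prompt for a single user turn. The function name `render_single_turn_prompt` is hypothetical, and the sketch assumes the prompt ends right after the assistant header, with the model emitting `<|cot_start|>` itself. The template shipped with the tokenizer remains the authoritative version.

```python
def render_single_turn_prompt(user_message: str) -> str:
    """Build the ChatML-like prompt for one user turn, with no system prompt."""
    return (
        "<|startoftext|>"
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"  # assumed stopping point; the model continues here
    )


prompt = render_single_turn_prompt("What is 2 + 2?")
print(prompt)
```

In practice, prefer `.apply_chat_template()` over hand-built strings so the prompt stays in sync with the tokenizer's template.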

📈 Performance

Reasoning enables models to better structure their thought process, explore multiple solution strategies, and self-verify their final responses. Augmenting tiny models with extensive test-time compute in this way even allows them to solve challenging competition-level math problems. Our benchmark evaluations demonstrate that LFM2-350M-Math is highly capable for its size.

As we are excited about edge deployment, our goal is to limit memory consumption and latency. Our post-training recipe leverages reinforcement learning to explicitly bring down response verbosity where it is not desirable. To this end, we combine explicit reasoning budgets with difficulty-aware advantage re-weighting. Please refer to our separate blog post for a detailed post-training recipe.

🏃 How to run

Hugging Face: LFM2-350M-Math
llama.cpp: LFM2-350M-Math-GGUF
LEAP: LEAP model library

You can use the following Colab notebooks for easy inference and fine-tuning:

Notebook	Description	Link
Inference	Run the model with Hugging Face's transformers library.
SFT (TRL)	Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using TRL.
DPO (TRL)	Preference alignment with Direct Preference Optimization (DPO) using TRL.
SFT (Axolotl)	Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using Axolotl.
SFT (Unsloth)	Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using Unsloth.

📬 Contact

If you are interested in custom solutions with edge deployment, please contact our sales team.