metadata
library_name: transformers
license: other
license_name: lfm1.0
license_link: LICENSE
language:
  - en
pipeline_tag: text-generation
tags:
  - liquid
  - lfm2
  - edge
base_model: LiquidAI/LFM2-350M
Liquid AI

LFM2-350M-Math

Based on LFM2-350M, LFM2-350M-Math is a tiny reasoning model designed for tackling tricky math problems.

You can find more information about other task-specific models in this blog post.

📄 Model details

Generation parameters: We strongly recommend sampling with temperature=0.6, top_p=0.95, min_p=0.1, and repetition_penalty=1.05.

System prompt: We recommend not using any system prompt.

Supported languages: English only.

Chat template: LFM2 uses a ChatML-like chat template as follows:

<|startoftext|><|im_start|>user
Find the sum of all integer bases $b>9$ for which $17_{b}$ is a divisor of $97_{b}$.<|im_end|>
<|im_start|>assistant
<|cot_start|>First, we need to convert $17_{b}$ and $97_{b}$ into base 10. [...]<|im_end|>

You can apply it automatically with the tokenizer's .apply_chat_template() method from Hugging Face transformers.
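To make the layout concrete, here is a minimal pure-Python sketch that renders a single user turn in the format shown above. It is an illustration only: the authoritative template ships with the tokenizer, the helper name `render_single_turn` is hypothetical, and ending the prompt on the assistant header is an assumption about where generation begins.

```python
def render_single_turn(user_message: str) -> str:
    """Render one user turn in the ChatML-like layout from the model card.

    Illustration only: the real template is stored with the tokenizer.
    Stopping right after the assistant header is an assumption.
    """
    return (
        "<|startoftext|>"
        "<|im_start|>user\n"
        f"{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Example: build the prompt string for a short question.
prompt = render_single_turn("What is 7 * 8?")
print(prompt)
```

In practice, prefer `.apply_chat_template()` so the prompt always matches the template bundled with the model.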

⚠️ The model is intended for single-turn conversations.

📈 Performance

Reasoning enables models to better structure their thought process, explore multiple solution strategies, and self-verify their final responses. Augmenting tiny models with extensive test-time compute in this way even allows them to solve challenging competition-level math problems. Our benchmark evaluations show that LFM2-350M-Math is highly capable for its size.

[Figure: Response accuracy]

As we are excited about edge deployment, our goal is to limit memory consumption and latency. Our post-training recipe leverages reinforcement learning to explicitly bring down response verbosity where it is not desirable. To this end, we combine explicit reasoning budgets with difficulty-aware advantage re-weighting. Please refer to our separate blog post for a detailed post-training recipe.

[Figure: Response length]

πŸƒ How to run

You can use the following Colab notebooks for easy inference and fine-tuning:

| Notebook | Description | Link |
|---|---|---|
| Inference | Run the model with Hugging Face's transformers library. | Colab link |
| SFT (TRL) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using TRL. | Colab link |
| DPO (TRL) | Preference alignment with Direct Preference Optimization (DPO) using TRL. | Colab link |
| SFT (Axolotl) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using Axolotl. | Colab link |
| SFT (Unsloth) | Supervised Fine-Tuning (SFT) notebook with a LoRA adapter using Unsloth. | Colab link |
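For local inference outside the notebooks, the following is a minimal sketch using transformers with the settings recommended under Model details: a single user turn, no system prompt, and the listed generation parameters. Several things here are assumptions rather than details from this card: the repository id `LiquidAI/LFM2-350M-Math`, the `max_new_tokens` value, the helper names `build_messages` and `solve`, and `do_sample=True`, which transformers needs for temperature, top_p, and min_p to take effect.

```python
# Generation settings quoted from the "Generation parameters" recommendation.
# do_sample=True is an assumption: without it, transformers ignores
# temperature, top_p, and min_p.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.6,
    "top_p": 0.95,
    "min_p": 0.1,
    "repetition_penalty": 1.05,
}


def build_messages(question: str) -> list[dict[str, str]]:
    """Return a single user turn with no system prompt."""
    return [{"role": "user", "content": question}]


def solve(
    question: str,
    model_id: str = "LiquidAI/LFM2-350M-Math",  # assumed repository id
    max_new_tokens: int = 2048,  # assumed budget, not taken from the card
) -> str:
    """Generate a response to one math question."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Apply the chat template bundled with the tokenizer.
    inputs = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    )
    output = model.generate(
        **inputs, max_new_tokens=max_new_tokens, **GENERATION_KWARGS
    )
    # Decode only the newly generated tokens, keeping special tokens visible.
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=False)
```

Calling `solve("...")` with a question string downloads the weights on first use and returns the raw completion, including any special tokens the model emits.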

📬 Contact

If you are interested in custom solutions with edge deployment, please contact our sales team.