Update README.md
base_model:
- Qwen/Qwen2.5-7B-Instruct
---
# MachineLearningLM
## Model Summary
Can LLMs learn from 1,000 in-context examples?

Introducing **MachineLearningLM** 🧪📊, a model continually pretrained on millions of synthetic tabular ML tasks to enable robust many-shot in-context learning.
- 📈 **Scales from 8 to 1,024 in-context examples**
- 📈 **~15% improvement** on unseen tabular tasks compared to o3-mini / GPT-5-mini / Qwen-2.5-7B
- 🌲 **Random-Forest–level robustness**
- 🧠 **MMLU score: 75.4%**

📄 Read the paper: https://arxiv.org/abs/2509.06806

<img src="https://github.githubassets.com/favicons/favicon.svg" width="16" height="16"> GitHub: https://github.com/HaoAreYuDong/MachineLearningLM
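As a quick illustration of many-shot in-context learning with this checkpoint, the sketch below prompts the model with a handful of labeled tabular rows followed by a query row using Hugging Face Transformers. The repo id, prompt layout, and feature names are illustrative placeholders rather than the exact format used in training; see the GitHub repository and the paper for the canonical prompt construction.

```python
# Minimal many-shot in-context learning sketch. The repo id, prompt layout,
# and feature names below are placeholders -- see the GitHub repo for the
# exact prompt format used by MachineLearningLM.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-model-repo-id>"  # replace with this model's Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

# Toy tabular task: a few labeled rows as in-context examples, then a query row.
shots = [
    "age=39, income=45000, balance=1200 -> label=0",
    "age=51, income=88000, balance=5300 -> label=1",
    "age=27, income=32000, balance=800 -> label=0",
]
query = "age=46, income=79000, balance=4100 -> label="
prompt = (
    "Given the labeled rows, predict the label for the final row.\n"
    + "\n".join(shots)
    + "\n"
    + query
)

messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=8)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```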
## Evaluation and Validation
We have developed an automated evaluation framework: simply configure the parameters to run validation and evaluation. The code is open-sourced on our GitHub.
### **Quick Start**
```bash
pip install -r requirement.txt
python ./src/evaluation/model_pred/dl_model_pred.py \
  --input_dir ./demo_input.jsonl \
  --output_dir ./demo_output.jsonl \
  --model_name hf_repo/model_name
```
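If you want to score the demo output programmatically, a small post-processing sketch along the lines below may help. The JSONL field names `prediction` and `label` are assumptions about the output schema, not documented behavior; inspect the `demo_output.jsonl` produced by the script above and adjust accordingly.

```python
# Hypothetical post-processing sketch: compute accuracy over the demo output.
# The field names "prediction" and "label" are assumed, not documented --
# check the actual JSONL produced by dl_model_pred.py and adjust.
import json

correct = total = 0
with open("./demo_output.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        total += 1
        correct += int(record["prediction"] == record["label"])

print(f"accuracy: {correct / total:.3f} ({correct}/{total})")
```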
For more usage details, please visit our GitHub.