---
datasets:
- XenArcAI/MathX-5M
base_model:
- google/gemma-3-1b-it
pipeline_tag: text-generation
---

# Model Card: Parveshiiii/M1-MathX

## Model Details
- **Model Name:** Parveshiiii/M1-MathX  
- **Base Architecture:** Gemma 3 (1B parameters, fine-tuned from `google/gemma-3-1b-it`)  
- **Model Type:** Causal Language Model (text-generation)  
- **Training Framework:** Hugging Face Transformers  
- **Precision:** fp16  
- **Attention Mechanism:** Hybrid sliding-window and full attention layers  
- **Tokenizer:** Gemma tokenizer (vocab size 262,144)
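
The details above can be checked directly from the published config and tokenizer. A minimal sketch; the `sliding_window` and `torch_dtype` field names are assumptions about how the Gemma 3 config exposes them, so `getattr` is used to degrade gracefully if they differ:

```python
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("Parveshiiii/M1-MathX")
tokenizer = AutoTokenizer.from_pretrained("Parveshiiii/M1-MathX")

print(len(tokenizer))                           # vocabulary size (~262,144 for the Gemma tokenizer)
print(getattr(config, "sliding_window", None))  # window size of the local-attention layers, if exposed
print(getattr(config, "torch_dtype", None))     # dtype the checkpoint was saved in (fp16)
```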

## Usage
```python
from transformers import pipeline, TextStreamer

# Load the model through the text-generation pipeline
pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")

messages = [
    {"role": "user", "content": "Who are you?"},
]

# Stream the response to stdout token by token as it is generated
streamer = TextStreamer(pipe.tokenizer)
pipe(messages, streamer=streamer, max_new_tokens=10000)
```
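
For finer control over precision and decoding than the pipeline provides, the model can also be loaded directly. A minimal sketch, assuming a CUDA-capable device; the fp16 dtype matches the precision listed in Model Details, and the math prompt is purely illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Parveshiiii/M1-MathX"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the fp16 precision of the checkpoint
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Solve step by step: if 3x + 7 = 25, what is x?"},
]

# Format the conversation with Gemma's chat template, then generate
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```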
## Intended Use
- Designed for mathematical reasoning tasks, including problem solving, equation manipulation, and step-by-step derivations.  
- Suitable for educational contexts, math tutoring, and research experiments in reasoning alignment.  
- Not intended for general-purpose conversation or sensitive domains outside mathematics.

## Training Data
- **Dataset:** MathX (curated mathematical reasoning dataset)  
- **Samples Used:** ~300  
- **Training Steps:** 50  
- **Method:** GRPO (Group Relative Policy Optimization) fine-tuning  
- **Objective:** Reinforcement-style alignment for improved reasoning clarity and correctness.
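
For reference, the general shape of a GRPO fine-tuning run with the TRL library looks roughly like the sketch below. This is an illustration, not the exact training script used for this checkpoint; the toy prompts, reward function, and hyperparameters are assumptions (GRPO expects a `prompt` column and one or more reward functions that score sampled completions):

```python
from datasets import Dataset
from trl import GRPOConfig, GRPOTrainer

# Illustrative only: a tiny in-memory prompt set; the real run drew ~300 samples from MathX
dataset = Dataset.from_dict({
    "prompt": [
        "Solve step by step: 12 * 7 = ?",
        "Solve for x: 2x + 3 = 11.",
        "What is the derivative of x^2 + 3x?",
        "Simplify: (2/3) + (1/6).",
    ]
})

def correctness_reward(completions, **kwargs):
    # Placeholder reward: favor completions that show worked steps.
    # A real run would score answer correctness against the reference solution.
    return [1.0 if "=" in c else 0.0 for c in completions]

config = GRPOConfig(
    output_dir="m1-mathx-grpo",
    max_steps=50,                    # matches the training steps reported above
    per_device_train_batch_size=4,
    num_generations=4,               # completions sampled per prompt for the group-relative baseline
)

trainer = GRPOTrainer(
    model="google/gemma-3-1b-it",
    reward_funcs=correctness_reward,
    args=config,
    train_dataset=dataset,
)
trainer.train()
```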

## Performance
- Produces coherent step-by-step solutions on small-scale math problems and symbolic reasoning tasks in informal testing.  
- Early spot checks suggest improved accuracy over the base Gemma 3 1B model on math-specific prompts, but no benchmark numbers have been published yet.  
- Requires formal evaluation on GSM8K, MATH, and other benchmarks for a quantitative comparison.
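
A quick way to get a first quantitative signal before a full benchmark run is to spot-check exact-match accuracy on a small slice of GSM8K, whose reference answers end with `#### <number>`. A minimal sketch; the prompt wording and answer extraction are simplistic assumptions, not an official evaluation protocol:

```python
import re
from datasets import load_dataset
from transformers import pipeline

pipe = pipeline("text-generation", model="Parveshiiii/M1-MathX")
data = load_dataset("openai/gsm8k", "main", split="test[:20]")  # small slice for a spot check

def last_number(text):
    # Grab the final number in a string (commas stripped) as the predicted answer
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

correct = 0
for example in data:
    gold = example["answer"].split("####")[-1].strip()
    messages = [{"role": "user", "content": example["question"] + "\nSolve step by step."}]
    out = pipe(messages, max_new_tokens=512, do_sample=False)[0]["generated_text"]
    # With chat-style input the pipeline returns the message list; take the assistant reply
    reply = out[-1]["content"] if isinstance(out, list) else out
    if last_number(reply) == last_number(gold):
        correct += 1

print(f"Exact-match on {len(data)} GSM8K items: {correct / len(data):.2%}")
```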

## Limitations
- Small dataset and limited training steps mean coverage is narrow.  
- May overfit to MathX patterns and fail on broader or more complex problems.  
- Not guaranteed to generalize outside mathematical reasoning.  
- As a 1B model, capacity is limited compared to larger LLMs.

## Ethical Considerations
- Intended for safe educational use.  
- Should not be deployed in high-stakes environments without further validation.  
- Outputs may contain errors; human oversight is required.

## Citation
If you use this model, please cite as:
```
@misc{Parvesh2025M1MathX,
  author = {Parvesh Rawal},
  title = {Parveshiiii/M1-MathX: A Gemma-1B model fine-tuned on MathX with GRPO},
  year = {2025},
  howpublished = {\url{https://huggingface.co/Parveshiiii/M1-MathX}}
}
```

---