Nirav-Madhani/gemma3-270m-grpo-math
Tags: Text Generation, Transformers, Safetensors, English, gemma3_text, gemma3, trl, grpo, rlhf, sft, math, reasoning, chain-of-thought, experimental, colab, kaggle, text-generation-inference
License: gemma
Files and versions
Branch: main, 1.11 GB, 1 contributor, History: 3 commits
Latest commit: Create README.md (8905eb3, verified) by Nirav-Madhani, 2 months ago
| File | Scan | Size | Storage | Last commit | Updated |
|---|---|---|---|---|---|
| .gitattributes | Safe | 1.57 kB | | Upload rl_checkpoint snapshot | 2 months ago |
| README.md | Safe | 6.1 kB | | Create README.md | 2 months ago |
| config.json | Safe | 1.35 kB | | Upload rl_checkpoint snapshot | 2 months ago |
| generation_config.json | Safe | 128 Bytes | | Upload rl_checkpoint snapshot | 2 months ago |
| model.safetensors | | 1.07 GB | xet | Upload rl_checkpoint snapshot | 2 months ago |
| special_tokens_map.json | Safe | 662 Bytes | | Upload rl_checkpoint snapshot | 2 months ago |
| tokenizer.json | Safe | 33.4 MB | xet | Upload rl_checkpoint snapshot | 2 months ago |
| tokenizer_config.json | Safe | 1.16 MB | | Upload rl_checkpoint snapshot | 2 months ago |