VellumMini-0.1-Qwen3-14B

Just a sneak peek of what I'm cooking in a little project called Vellum. This model was made to evaluate the quality of the CreativeGPT dataset and how well Qwen3 trains on it. CreativeGPT is just one of many datasets the final model will be trained on (and the final model will also use a different base model).

This got pretty good results compared to the regular instruct model in my testing, so I thought I would share. I trained for 3 epochs, but the checkpoints at 2 and 3 epochs were both overbaked. This checkpoint, at 1 epoch, performed best.

I'm pretty surprised by how decent this came out, since Qwen models aren't that great at writing, especially at this size.

Usage

Use with thinking/chain-of-thought disabled, and use the ChatML prompt format.

Qwen's suggested sampler settings are recommended:

Temperature: 0.7

Top_P: 0.8

Top_K: 20

Min_P: 0
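
For concreteness, here's a minimal Transformers sketch applying these settings (the prompt and generation length are just placeholders; `enable_thinking=False` is Qwen3's chat-template switch for disabling thinking):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "lemon07r/VellumMini-0.1-Qwen3-14B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write the opening scene of a mystery novel."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,  # disable thinking/chain-of-thought, as recommended above
)
inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

# Qwen's suggested sampler settings from the list above
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    min_p=0.0,
)
print(tokenizer.decode(output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```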

Quants

GGUFs

iMatrix

These are recommended.

Static
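
If you're running one of the GGUFs, the same settings carry over. Here's a rough sketch assuming llama-cpp-python (the quant file name, context length, and prompt are placeholders):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="VellumMini-0.1-Qwen3-14B.Q4_K_M.gguf",  # hypothetical quant file name
    n_ctx=8192,
    chat_format="chatml",  # the card calls for the ChatML prompt format
)

result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short scene set in a lighthouse."}],
    max_tokens=512,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    min_p=0.0,
)
print(result["choices"][0]["message"]["content"])
```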

Special Thanks

Big thanks to everyone over at the KoboldAI Discord. The members there have helped me a ton with various things over the long while I've been there.

Training Details

Parent Model

https://huggingface.co/Qwen/Qwen3-14B

Training Method

Full fine-tune (SFT)

Dataset(s)

https://huggingface.co/datasets/N8Programs/CreativeGPT

Training Hyperparameters

Batch size: 4

Learning rate: 0.00001

Number of epochs: 3

Warmup ratio: 0.05

Weight decay: 0.02

Max gradient norm: 1

Packing: No
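
The card doesn't say which training framework was used; as a rough illustration only, here's how the hyperparameters above would map onto TRL's SFTTrainer (the output directory, BF16 flag, and per-device batch interpretation are assumptions):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# CreativeGPT, the dataset this checkpoint was trained on (see Dataset(s) above).
# Assumes it loads in a conversational format SFTTrainer handles out of the box.
dataset = load_dataset("N8Programs/CreativeGPT", split="train")

config = SFTConfig(
    output_dir="vellummini-sft",    # hypothetical
    per_device_train_batch_size=4,  # "Batch size: 4" (assumed per-device)
    learning_rate=1e-5,             # 0.00001
    num_train_epochs=3,
    warmup_ratio=0.05,
    weight_decay=0.02,
    max_grad_norm=1.0,
    packing=False,
    bf16=True,                      # the released weights are BF16
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-14B",  # parent model; full fine-tune, so no PEFT config
    args=config,
    train_dataset=dataset,
)
trainer.train()
```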

Training Results

[Screenshot of training results]
