Issues with Fine Tuning

#37

by rirv938 - opened 8 days ago


Hi. Great work on the models. The Qwen team always produces great models.

I am running into an issue when fine-tuning this model with the Transformers Trainer. With either DeepSpeed ZeRO stage 3 or FSDP, training hangs on the first training step.

I may be configuring something wrong for training, although I use the same script for many other models. An example fine-tuning script would be useful here, similar to the one provided for gpt-oss: https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers

Thanks for any help here.

Robert.
