[2025-10-24 23:55:28] ========================================
[2025-10-24 23:55:28] Job Name: testing__pvv2_lora
[2025-10-24 23:55:28] Hostname: gl007.hpc.nyu.edu
[2025-10-24 23:55:28] Number of nodes: 1
[2025-10-24 23:55:28] GPUs per node: 2
[2025-10-24 23:55:28] Start Time: Fri Oct 24 11:55:28 PM EDT 2025
[2025-10-24 23:55:28] Log file: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/logs/pipeline.log
[2025-10-24 23:55:28] ========================================
[2025-10-24 23:55:28] Sourcing secrets from: /scratch/zrs2020/LlamaFactoryHelper/secrets.env
[2025-10-24 23:55:30]
[2025-10-24 23:55:30] ========================================
[2025-10-24 23:55:30] Configuration Paths
[2025-10-24 23:55:30] ========================================
[2025-10-24 23:55:30] Train Config: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/configs/train_config.yaml
[2025-10-24 23:55:30] Merge Config: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/configs/merge_config.yaml
[2025-10-24 23:55:30] Dataset Info: /scratch/zrs2020/LlamaFactoryHelper/LLaMA-Factory/data/dataset_info.json
[2025-10-24 23:55:30] Output Dir: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints
[2025-10-24 23:55:30] Export Dir: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged
[2025-10-24 23:55:30] HF Repo ID: TAUR-dev/testing__pvv2_lora
[2025-10-24 23:55:30]
[make-effective-cfg] tokenized_path: /scratch/zrs2020/.cache/hf_cache/home/llamafactory/tokenized/TAUR_dev_D_SFT_C_ours_cd3arg_10responses_reflections10_formats_C_full_fb94f2a3
[make-effective-cfg] wrote: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/logs/train_config.effective.yaml
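[note] A minimal sketch (not the helper's actual implementation) of what the make-effective-cfg step could do: read the base train config, pin `tokenized_path` so LLaMA-Factory reuses the cached tokenized dataset, and write the effective copy used for launch. The paths are taken from this log; the helper's real logic may differ.

import yaml

base_cfg = "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/configs/train_config.yaml"
effective_cfg = "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/logs/train_config.effective.yaml"
tokenized_path = "/scratch/zrs2020/.cache/hf_cache/home/llamafactory/tokenized/TAUR_dev_D_SFT_C_ours_cd3arg_10responses_reflections10_formats_C_full_fb94f2a3"

with open(base_cfg) as f:
    cfg = yaml.safe_load(f)

# LLaMA-Factory saves/loads the pre-tokenized dataset at this path (see STAGE 1 below)
cfg["tokenized_path"] = tokenized_path

with open(effective_cfg, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)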
[2025-10-24 23:55:30]
[2025-10-24 23:55:30] ========================================
[2025-10-24 23:55:30] STAGE 0: Downloading Dataset
[2025-10-24 23:55:30] Dataset: TAUR-dev/D-SFT_C-ours_cd3arg_10responses_reflections10_formats-C_full
[2025-10-24 23:55:30] Start Time: Fri Oct 24 11:55:30 PM EDT 2025
[2025-10-24 23:55:30] ========================================
[dataset-download] Loading dataset from: TAUR-dev/D-SFT_C-ours_cd3arg_10responses_reflections10_formats-C_full
[dataset-download] Dataset loaded successfully
[dataset-download] Dataset info: DatasetDict({
train: Dataset({
features: ['conversations', 'sft_template_type_idx'],
num_rows: 29130
})
})
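[note] Loading the dataset directly reproduces the DatasetDict printed above; a minimal sketch using only the public `datasets` API.

from datasets import load_dataset

ds = load_dataset("TAUR-dev/D-SFT_C-ours_cd3arg_10responses_reflections10_formats-C_full")
print(ds)                         # DatasetDict with a single 'train' split
print(ds["train"].column_names)   # ['conversations', 'sft_template_type_idx']
print(ds["train"].num_rows)       # 29130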
[2025-10-24 23:55:32]
[2025-10-24 23:55:32] ========================================
[2025-10-24 23:55:32] Dataset download completed
[2025-10-24 23:55:32] End Time: Fri Oct 24 11:55:32 PM EDT 2025
[2025-10-24 23:55:32] ========================================
[2025-10-24 23:55:32]
[2025-10-24 23:55:32] ========================================
[2025-10-24 23:55:32] STAGE 1: Training Model
[2025-10-24 23:55:32] Start Time: Fri Oct 24 11:55:32 PM EDT 2025
[2025-10-24 23:55:32] ========================================
[2025-10-24 23:55:32] Job: testing__pvv2_lora
[2025-10-24 23:55:32] Nodes: 1 | GPUs/node: 2
[2025-10-24 23:55:32] Master: 127.0.0.1:29500
[2025-10-24 23:55:32] LLaMA-Factory: /scratch/zrs2020/LlamaFactoryHelper/LLaMA-Factory
[2025-10-24 23:55:32] Train cfg (effective): /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/logs/train_config.effective.yaml
[2025-10-24 23:55:32] HF cache: /scratch/zrs2020/.cache/hf_cache/home/datasets
[2025-10-24 23:55:32] Launcher: torchrun
[2025-10-24 23:55:32]
[2025-10-24 23:55:32] Single-node training (2 GPU(s))
[2025-10-24 23:55:32] Executing command: llamafactory-cli train /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/logs/train_config.effective.yaml
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
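[note] The FutureWarning above recommends replacing TRANSFORMERS_CACHE with HF_HOME; a minimal sketch of the environment setup, assuming the cache root visible later in this log. It must run before transformers is imported.

import os

# HF_HOME covers the hub, datasets and other HF caches; TRANSFORMERS_CACHE is deprecated.
os.environ.pop("TRANSFORMERS_CACHE", None)
os.environ["HF_HOME"] = "/scratch/zrs2020/.cache/hf_cache/home"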
[INFO|2025-10-24 23:55:40] llamafactory.launcher:143 >> Initializing 2 distributed tasks at: 127.0.0.1:29500
W1024 23:55:41.864000 3022854 site-packages/torch/distributed/run.py:803]
W1024 23:55:41.864000 3022854 site-packages/torch/distributed/run.py:803] *****************************************
W1024 23:55:41.864000 3022854 site-packages/torch/distributed/run.py:803] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W1024 23:55:41.864000 3022854 site-packages/torch/distributed/run.py:803] *****************************************
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
[W1024 23:55:50.757874363 ProcessGroupNCCL.cpp:924] Warning: TORCH_NCCL_AVOID_RECORD_STREAMS is the default now, this environment variable is thus deprecated. (function operator())
[W1024 23:55:50.757887679 ProcessGroupNCCL.cpp:924] Warning: TORCH_NCCL_AVOID_RECORD_STREAMS is the default now, this environment variable is thus deprecated. (function operator())
[INFO|2025-10-24 23:55:50] llamafactory.hparams.parser:143 >> Set `ddp_find_unused_parameters` to False in DDP training since LoRA is enabled.
[INFO|2025-10-24 23:55:50] llamafactory.hparams.parser:423 >> Process rank: 0, world size: 2, device: cuda:0, distributed training: True, compute dtype: torch.bfloat16
[INFO|2025-10-24 23:55:50] llamafactory.hparams.parser:423 >> Process rank: 1, world size: 2, device: cuda:1, distributed training: True, compute dtype: torch.bfloat16
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,441 >> loading file vocab.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/vocab.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,442 >> loading file merges.txt from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/merges.txt
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,442 >> loading file tokenizer.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,442 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,442 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,442 >> loading file tokenizer_config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer_config.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,442 >> loading file chat_template.jinja from cache at None
[INFO|tokenization_utils_base.py:2364] 2025-10-24 23:55:50,609 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:765] 2025-10-24 23:55:50,826 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:55:50,828 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
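[note] A quick consistency check of the attention geometry in the Qwen2Config above (plain arithmetic, no library calls).

hidden_size  = 1536
num_heads    = 12
num_kv_heads = 2

head_dim = hidden_size // num_heads    # 128
kv_dim   = num_kv_heads * head_dim     # 256 = output width of k_proj/v_proj under GQA
group    = num_heads // num_kv_heads   # 6 query heads share each KV head

print(head_dim, kv_dim, group)         # 128 256 6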
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file vocab.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/vocab.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file merges.txt from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/merges.txt
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file tokenizer.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file tokenizer_config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer_config.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:55:50,899 >> loading file chat_template.jinja from cache at None
[INFO|tokenization_utils_base.py:2364] 2025-10-24 23:55:51,063 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[WARNING|2025-10-24 23:55:51] llamafactory.data.loader:148 >> Loading dataset from disk will ignore other data arguments.
[INFO|2025-10-24 23:55:51] llamafactory.data.loader:143 >> Loaded tokenized dataset from /scratch/zrs2020/.cache/hf_cache/home/llamafactory/tokenized/TAUR_dev_D_SFT_C_ours_cd3arg_10responses_reflections10_formats_C_full_fb94f2a3.
[INFO|configuration_utils.py:765] 2025-10-24 23:55:51,138 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:55:51,138 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
[INFO|2025-10-24 23:55:51] llamafactory.model.model_utils.kv_cache:143 >> KV cache is disabled during training.
[WARNING|logging.py:328] 2025-10-24 23:55:51,492 >> `torch_dtype` is deprecated! Use `dtype` instead!
[INFO|modeling_utils.py:1172] 2025-10-24 23:55:51,493 >> loading weights file model.safetensors from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/model.safetensors
[INFO|modeling_utils.py:2341] 2025-10-24 23:55:51,494 >> Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:986] 2025-10-24 23:55:51,495 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645,
"use_cache": false
}
`torch_dtype` is deprecated! Use `dtype` instead!
[INFO|configuration_utils.py:941] 2025-10-24 23:55:52,421 >> loading configuration file generation_config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/generation_config.json
[INFO|configuration_utils.py:986] 2025-10-24 23:55:52,421 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.1,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8
}
[INFO|dynamic_module_utils.py:423] 2025-10-24 23:55:52,453 >> Could not locate the custom_generate/generate.py inside Qwen/Qwen2.5-1.5B-Instruct.
[INFO|2025-10-24 23:55:52] llamafactory.model.model_utils.checkpointing:143 >> Gradient checkpointing enabled.
[INFO|2025-10-24 23:55:52] llamafactory.model.model_utils.attention:143 >> Using torch SDPA for faster training and inference.
[INFO|2025-10-24 23:55:52] llamafactory.model.adapter:143 >> Upcasting trainable params to float32.
[INFO|2025-10-24 23:55:52] llamafactory.model.adapter:143 >> Fine-tuning method: LoRA
[INFO|2025-10-24 23:55:52] llamafactory.model.model_utils.misc:143 >> Found linear modules: o_proj,gate_proj,q_proj,down_proj,v_proj,k_proj,up_proj
[INFO|2025-10-24 23:55:52] llamafactory.model.loader:143 >> trainable params: 9,232,384 || all params: 1,552,946,688 || trainable%: 0.5945
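[note] The trainable-parameter count above is consistent with LoRA rank 8 applied to all seven linear projections of all 28 layers; the rank itself is not printed in this log, so treat it as an assumption. A worked check in plain Python:

r = 8                                  # assumed LoRA rank
hidden, inter, kv = 1536, 8960, 256    # from the Qwen2Config above (kv = 2 KV heads * 128 head_dim)

# each adapted projection adds r * (d_in + d_out) parameters (A and B matrices)
per_layer = sum(r * (d_in + d_out) for d_in, d_out in [
    (hidden, hidden),   # q_proj
    (hidden, kv),       # k_proj
    (hidden, kv),       # v_proj
    (hidden, hidden),   # o_proj
    (hidden, inter),    # gate_proj
    (hidden, inter),    # up_proj
    (inter, hidden),    # down_proj
])

trainable = 28 * per_layer
print(trainable)                                   # 9232384
print(round(100 * trainable / 1_552_946_688, 4))   # 0.5945 (trainable %)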
[WARNING|trainer.py:906] 2025-10-24 23:55:52,738 >> The model is already on multiple devices. Skipping the move to device specified in `args`.
[INFO|trainer.py:699] 2025-10-24 23:55:52,740 >> max_steps is given, it will override any value given in num_train_epochs
[INFO|trainer.py:749] 2025-10-24 23:55:52,740 >> Using auto half precision backend
[WARNING|trainer.py:982] 2025-10-24 23:55:52,742 >> The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None, 'pad_token_id': 151643}.
The model is already on multiple devices. Skipping the move to device specified in `args`.
The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'bos_token_id': None, 'pad_token_id': 151643}.
NCCL version 2.27.5+cuda12.9
[INFO|trainer.py:2519] 2025-10-24 23:55:53,120 >> ***** Running training *****
[INFO|trainer.py:2520] 2025-10-24 23:55:53,120 >> Num examples = 29,130
[INFO|trainer.py:2521] 2025-10-24 23:55:53,120 >> Num Epochs = 1
[INFO|trainer.py:2522] 2025-10-24 23:55:53,120 >> Instantaneous batch size per device = 1
[INFO|trainer.py:2525] 2025-10-24 23:55:53,120 >> Total train batch size (w. parallel, distributed & accumulation) = 2
[INFO|trainer.py:2526] 2025-10-24 23:55:53,120 >> Gradient Accumulation steps = 1
[INFO|trainer.py:2527] 2025-10-24 23:55:53,120 >> Total optimization steps = 10
[INFO|trainer.py:2528] 2025-10-24 23:55:53,122 >> Number of trainable parameters = 9,232,384
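[note] A quick check that the run header above is internally consistent (plain arithmetic).

per_device_bs, gpus, grad_accum, steps, num_examples = 1, 2, 1, 10, 29_130

total_bs = per_device_bs * gpus * grad_accum          # 2, matches "Total train batch size"
seen     = total_bs * steps                           # 20 examples consumed in 10 steps
print(total_bs, seen, round(seen / num_examples, 4))  # 2 20 0.0007 (the epoch fraction reported later)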
[INFO|integration_utils.py:867] 2025-10-24 23:55:53,220 >> Automatic Weights & Biases logging enabled, to disable set os.environ["WANDB_DISABLED"] = "true"
wandb: Currently logged in as: zsprague (ut_nlp_deduce) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin
wandb: Tracking run with wandb version 0.22.2
wandb: Run data is saved locally in /scratch/zrs2020/LlamaFactoryHelper/wandb/run-20251024_235553-oqx8ngeo
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run testing__pvv2_lora
wandb: View project at https://wandb.ai/ut_nlp_deduce/llamafactory
wandb: View run at https://wandb.ai/ut_nlp_deduce/llamafactory/runs/oqx8ngeo
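[note] As the integration message above notes, W&B reporting can be switched off (or kept local) before launching training; a minimal sketch.

import os

os.environ["WANDB_DISABLED"] = "true"   # disable the reporter entirely, or
os.environ["WANDB_MODE"] = "offline"    # log locally and sync later with `wandb sync`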
 50%| | 5/10 [00:03<00:03, 1.49it/s]
[INFO|trainer.py:4309] 2025-10-24 23:55:57,737 >> Saving model checkpoint to /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-5
[INFO|configuration_utils.py:765] 2025-10-24 23:55:57,839 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:55:57,840 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
[INFO|tokenization_utils_base.py:2421] 2025-10-24 23:55:58,067 >> chat template saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-5/chat_template.jinja
[INFO|tokenization_utils_base.py:2590] 2025-10-24 23:55:58,072 >> tokenizer config file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-5/tokenizer_config.json
[INFO|tokenization_utils_base.py:2599] 2025-10-24 23:55:58,076 >> Special tokens file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-5/special_tokens_map.json
100%|| 10/10 [00:08<00:00, 1.23it/s]
{'loss': 0.7188, 'grad_norm': 0.2177160233259201, 'learning_rate': 3.015368960704584e-08, 'epoch': 0.0}
[INFO|trainer.py:4309] 2025-10-24 23:56:02,371 >> Saving model checkpoint to /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10
[INFO|configuration_utils.py:765] 2025-10-24 23:56:02,490 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:56:02,491 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
[INFO|tokenization_utils_base.py:2421] 2025-10-24 23:56:02,701 >> chat template saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10/chat_template.jinja
[INFO|tokenization_utils_base.py:2590] 2025-10-24 23:56:02,706 >> tokenizer config file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10/tokenizer_config.json
[INFO|tokenization_utils_base.py:2599] 2025-10-24 23:56:02,710 >> Special tokens file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10/special_tokens_map.json
[INFO|trainer.py:2810] 2025-10-24 23:56:03,258 >>
Training completed. Do not forget to share your model on huggingface.co/models =)
{'train_runtime': 10.137, 'train_samples_per_second': 1.973, 'train_steps_per_second': 0.986, 'train_loss': 0.718793535232544, 'epoch': 0.0}
100%|| 10/10 [00:09<00:00, 1.10it/s]
[INFO|trainer.py:4309] 2025-10-24 23:56:03,267 >> Saving model checkpoint to /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints
[INFO|configuration_utils.py:765] 2025-10-24 23:56:03,356 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:56:03,357 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
[INFO|tokenization_utils_base.py:2421] 2025-10-24 23:56:03,588 >> chat template saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/chat_template.jinja
[INFO|tokenization_utils_base.py:2590] 2025-10-24 23:56:03,592 >> tokenizer config file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/tokenizer_config.json
[INFO|tokenization_utils_base.py:2599] 2025-10-24 23:56:03,596 >> Special tokens file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/special_tokens_map.json
***** train metrics *****
epoch = 0.0007
total_flos = 414519GF
train_loss = 0.7188
train_runtime = 0:00:10.13
train_samples_per_second = 1.973
train_steps_per_second = 0.986
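[note] The throughput figures above follow directly from the runtime; a worked check.

runtime_s, steps, samples = 10.137, 10, 20
print(round(samples / runtime_s, 3))   # 1.973 train_samples_per_second
print(round(steps / runtime_s, 3))     # 0.986 train_steps_per_second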
[INFO|modelcard.py:456] 2025-10-24 23:56:03,838 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}}
[W1024 23:56:04.029787829 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
wandb:
wandb: View run testing__pvv2_lora at: https://wandb.ai/ut_nlp_deduce/llamafactory/runs/oqx8ngeo
wandb: Find logs at: wandb/run-20251024_235553-oqx8ngeo/logs
[W1024 23:56:05.730735839 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
[W1024 23:56:05.132682733 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
[W1024 23:56:05.555229777 AllocatorConfig.cpp:28] Warning: PYTORCH_CUDA_ALLOC_CONF is deprecated, use PYTORCH_ALLOC_CONF instead (function operator())
[2025-10-24 23:56:06]
[2025-10-24 23:56:06] ========================================
[2025-10-24 23:56:06] Training completed successfully
[2025-10-24 23:56:06] End Time: Fri Oct 24 11:56:06 PM EDT 2025
[2025-10-24 23:56:06] ========================================
[2025-10-24 23:56:06]
[2025-10-24 23:56:06] ========================================
[2025-10-24 23:56:06] STAGE 2: Merging/Exporting Model
[2025-10-24 23:56:06] Start Time: Fri Oct 24 11:56:06 PM EDT 2025
[2025-10-24 23:56:06] ========================================
[2025-10-24 23:56:06] Looking for checkpoints in: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints
[2025-10-24 23:56:06] Analyzing checkpoints to find the one from current training run...
[2025-10-24 23:56:06] - checkpoint-10: trainer_state.json modified at Fri Oct 24 11:56:03 PM EDT 2025
[2025-10-24 23:56:06] - checkpoint-5: trainer_state.json modified at Fri Oct 24 11:55:58 PM EDT 2025
[2025-10-24 23:56:06]
[2025-10-24 23:56:06] Selected checkpoint: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10
[2025-10-24 23:56:06] This checkpoint has the most recently updated trainer_state.json
[2025-10-24 23:56:06] Checkpoint details:
[2025-10-24 23:56:06] Path: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10
[2025-10-24 23:56:06] Last modified: 2025-10-24 23:56:03.255712120 -0400
[2025-10-24 23:56:06] Training step: 10
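[note] A minimal sketch of the checkpoint-selection rule described above (the checkpoint with the newest trainer_state.json wins); the pipeline's actual script may differ.

import os, glob

ckpt_root = "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints"

candidates = [
    d for d in glob.glob(os.path.join(ckpt_root, "checkpoint-*"))
    if os.path.isfile(os.path.join(d, "trainer_state.json"))
]
latest = max(candidates, key=lambda d: os.path.getmtime(os.path.join(d, "trainer_state.json")))
print(latest)   # .../checkpoints/checkpoint-10 for this run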
[2025-10-24 23:56:06] Updating merge config to point to checkpoint...
Successfully updated merge config
[2025-10-24 23:56:06] Updated merge config to use: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10
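[note] Updating the merge config to point at the selected checkpoint amounts to rewriting one YAML key; a minimal sketch assuming PyYAML (the pipeline's actual helper is not shown in this log).

import yaml

merge_cfg = "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/configs/merge_config.yaml"
checkpoint = "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10"

with open(merge_cfg) as f:
    cfg = yaml.safe_load(f)
cfg["adapter_name_or_path"] = checkpoint
with open(merge_cfg, "w") as f:
    yaml.safe_dump(cfg, f, sort_keys=False)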
[2025-10-24 23:56:06]
[2025-10-24 23:56:06] Merge config contents:
[2025-10-24 23:56:06] template: qwen
[2025-10-24 23:56:06] trust_remote_code: true
[2025-10-24 23:56:06] export_dir: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged
[2025-10-24 23:56:06] model_name_or_path: Qwen/Qwen2.5-1.5B-Instruct
[2025-10-24 23:56:06] adapter_name_or_path: /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10
[2025-10-24 23:56:06]
[2025-10-24 23:56:06] Executing command: llamafactory-cli export /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/configs/merge_config.yaml
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/transformers/utils/hub.py:110: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
warnings.warn(
/scratch/zrs2020/miniconda/miniconda3/envs/llamafactory/lib/python3.12/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,985 >> loading file vocab.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/vocab.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,986 >> loading file merges.txt from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/merges.txt
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,986 >> loading file tokenizer.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,986 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,986 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,986 >> loading file tokenizer_config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer_config.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:13,986 >> loading file chat_template.jinja from cache at None
[INFO|tokenization_utils_base.py:2364] 2025-10-24 23:56:14,157 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:765] 2025-10-24 23:56:14,372 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:56:14,374 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file vocab.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/vocab.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file merges.txt from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/merges.txt
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file tokenizer.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file added_tokens.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file special_tokens_map.json from cache at None
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file tokenizer_config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/tokenizer_config.json
[INFO|tokenization_utils_base.py:2095] 2025-10-24 23:56:14,443 >> loading file chat_template.jinja from cache at None
[INFO|tokenization_utils_base.py:2364] 2025-10-24 23:56:14,608 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[INFO|configuration_utils.py:765] 2025-10-24 23:56:14,663 >> loading configuration file config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/config.json
[INFO|configuration_utils.py:839] 2025-10-24 23:56:14,663 >> Model config Qwen2Config {
"architectures": [
"Qwen2ForCausalLM"
],
"attention_dropout": 0.0,
"bos_token_id": 151643,
"dtype": "bfloat16",
"eos_token_id": 151645,
"hidden_act": "silu",
"hidden_size": 1536,
"initializer_range": 0.02,
"intermediate_size": 8960,
"layer_types": [
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention",
"full_attention"
],
"max_position_embeddings": 32768,
"max_window_layers": 21,
"model_type": "qwen2",
"num_attention_heads": 12,
"num_hidden_layers": 28,
"num_key_value_heads": 2,
"rms_norm_eps": 1e-06,
"rope_scaling": null,
"rope_theta": 1000000.0,
"sliding_window": null,
"tie_word_embeddings": true,
"transformers_version": "4.57.1",
"use_cache": true,
"use_sliding_window": false,
"vocab_size": 151936
}
[WARNING|logging.py:328] 2025-10-24 23:56:14,663 >> `torch_dtype` is deprecated! Use `dtype` instead!
[INFO|2025-10-24 23:56:14] llamafactory.model.model_utils.kv_cache:143 >> KV cache is enabled for faster generation.
[WARNING|logging.py:328] 2025-10-24 23:56:15,013 >> `torch_dtype` is deprecated! Use `dtype` instead!
[INFO|modeling_utils.py:1172] 2025-10-24 23:56:15,014 >> loading weights file model.safetensors from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/model.safetensors
[INFO|modeling_utils.py:2341] 2025-10-24 23:56:15,015 >> Instantiating Qwen2ForCausalLM model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:986] 2025-10-24 23:56:15,016 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"eos_token_id": 151645
}
[INFO|configuration_utils.py:941] 2025-10-24 23:56:15,118 >> loading configuration file generation_config.json from cache at /scratch/zrs2020/.cache/hf_cache/home/hub/models--Qwen--Qwen2.5-1.5B-Instruct/snapshots/989aa7980e4cf806f80c7fef2b1adb7bc71aa306/generation_config.json
[INFO|configuration_utils.py:986] 2025-10-24 23:56:15,119 >> Generate config GenerationConfig {
"bos_token_id": 151643,
"do_sample": true,
"eos_token_id": [
151645,
151643
],
"pad_token_id": 151643,
"repetition_penalty": 1.1,
"temperature": 0.7,
"top_k": 20,
"top_p": 0.8
}
[INFO|dynamic_module_utils.py:423] 2025-10-24 23:56:15,148 >> Could not locate the custom_generate/generate.py inside Qwen/Qwen2.5-1.5B-Instruct.
[INFO|2025-10-24 23:56:15] llamafactory.model.model_utils.attention:143 >> Using torch SDPA for faster training and inference.
[INFO|2025-10-24 23:56:17] llamafactory.model.adapter:143 >> Merged 1 adapter(s).
[INFO|2025-10-24 23:56:17] llamafactory.model.adapter:143 >> Loaded adapter(s): /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10
[INFO|2025-10-24 23:56:17] llamafactory.model.loader:143 >> all params: 1,543,714,304
[INFO|2025-10-24 23:56:17] llamafactory.train.tuner:143 >> Convert model dtype to: torch.bfloat16.
[INFO|configuration_utils.py:491] 2025-10-24 23:56:17,909 >> Configuration saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/config.json
[INFO|configuration_utils.py:757] 2025-10-24 23:56:17,914 >> Configuration saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/generation_config.json
[INFO|modeling_utils.py:4181] 2025-10-24 23:56:21,705 >> Model weights saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/model.safetensors
[INFO|tokenization_utils_base.py:2421] 2025-10-24 23:56:21,725 >> chat template saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/chat_template.jinja
[INFO|tokenization_utils_base.py:2590] 2025-10-24 23:56:21,745 >> tokenizer config file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/tokenizer_config.json
[INFO|tokenization_utils_base.py:2599] 2025-10-24 23:56:21,765 >> Special tokens file saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/special_tokens_map.json
[INFO|2025-10-24 23:56:21] llamafactory.train.tuner:143 >> Ollama modelfile saved in /scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged/Modelfile
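[note] Conceptually, the export stage above does what this short PEFT snippet does: load the base model, apply the LoRA adapter, merge it into the weights, and save a standalone model. This is a sketch of the equivalent steps, not LLaMA-Factory's internal code; `dtype=` follows the deprecation notice for `torch_dtype` seen earlier in this log.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct", dtype=torch.bfloat16)
model = PeftModel.from_pretrained(
    base,
    "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/checkpoints/checkpoint-10",
)
model = model.merge_and_unload()   # fold the LoRA deltas into the base weights

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
export_dir = "/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged"
model.save_pretrained(export_dir)
tok.save_pretrained(export_dir)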
[2025-10-24 23:56:22]
[2025-10-24 23:56:22] ========================================
[2025-10-24 23:56:22] Merge/Export completed successfully
[2025-10-24 23:56:22] End Time: Fri Oct 24 11:56:22 PM EDT 2025
[2025-10-24 23:56:22] ========================================
[2025-10-24 23:56:22]
[2025-10-24 23:56:22] ========================================
[2025-10-24 23:56:22] Preparing Training Artifacts
[2025-10-24 23:56:22] ========================================
[2025-10-24 23:56:22] Copying configuration files...
[2025-10-24 23:56:22] Copying and cleaning training logs...
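[note] This section of the log ends before any Hub upload, but the "HF Repo ID" printed at the top suggests the merged model and artifacts are pushed afterwards; a minimal sketch using huggingface_hub, assuming a token is available in the environment (e.g. via secrets.env).

from huggingface_hub import HfApi

api = HfApi()
api.create_repo("TAUR-dev/testing__pvv2_lora", repo_type="model", exist_ok=True)
api.upload_folder(
    folder_path="/scratch/zrs2020/LlamaFactoryHelper/experiments/testing__pvv2_lora/merged",
    repo_id="TAUR-dev/testing__pvv2_lora",
    repo_type="model",
)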