Hugging Face
Nguyen Vy
ntthuyvy73
0 followers · 1 following
AI & ML interests
None yet
Recent Activity
upvoted a paper 16 days ago
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search
published a model 29 days ago
ntthuyvy73/Qwen3-4B_SFT-MCQ-v1
published a model about 1 month ago
ntthuyvy73/Qwen3-4B-RLHF-GRPO_v7_lora_merge
Organizations
ntthuyvy73's models (20)
ntthuyvy73/Qwen3-4B_SFT-MCQ-v1 • Updated 29 days ago
ntthuyvy73/Qwen3-4B-RLHF-GRPO_v7_lora_merge • Updated Nov 14
ntthuyvy73/Qwen3-4B-RLHF-DPO_v7_lora_merge • Updated Nov 14
ntthuyvy73/Qwen3-4B-RLHF-GRPO_v7 • 4B • Updated Nov 13 • 21
ntthuyvy73/Qwen3-4B-RLHF-DPO_v7 • Updated Nov 13
ntthuyvy73/Qwen3-4B_RLHF-SFT-v7 • Text Generation • 4B • Updated Nov 11 • 12
ntthuyvy73/Qwen3-4B-RLHF-SFT_v6 • Text Generation • 4B • Updated Nov 10 • 5
ntthuyvy73/Qwen3-1.7B_RLHF_SFT_full • 2B • Updated Nov 10 • 4
ntthuyvy73/Qwen3-1.7B_RLHF_SFT • Updated Nov 10
ntthuyvy73/Qwen3-4B-RLHF-SFT_v4 • Text Generation • 4B • Updated Nov 9 • 4
ntthuyvy73/Qwen3-4B-RLHF-DPO_v3_reasoning • Text Generation • 4B • Updated Nov 8 • 5
ntthuyvy73/Qwen3-4B-RLHF-SFT_v3_reasoning • Text Generation • 4B • Updated Nov 8 • 4
ntthuyvy73/Qwen3-4B-RLHF-DPO_v2 • 4B • Updated Nov 7 • 5
ntthuyvy73/Qwen3-4B-RLHF-SFT_v2 • Text Generation • 4B • Updated Nov 7 • 4
ntthuyvy73/Qwen3-4B-RLHF-DPO • Text Generation • Updated Nov 5 • 2
ntthuyvy73/Qwen3-4B-RLHF-SFT • Text Generation • 4B • Updated Nov 4 • 5
ntthuyvy73/Qwen3-4B-base-CPT-DTC-full • Text Generation • 4B • Updated Nov 3 • 5
ntthuyvy73/Qwen3-1.7B-RLHF-DPO • Text Generation • Updated Nov 3
ntthuyvy73/Qwen3-1.7B-RLHF-GRPO • Text Generation • Updated Nov 3
ntthuyvy73/Qwen3-1.7B-base-CPT-DTC-full • Text Generation • 2B • Updated Oct 30 • 5