shuoxing/llama3-8b-full-pretrain-mix-low-tweet-1m-en-reproduce-bs8 Text Generation • 266k • Updated about 4 hours ago
shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce-bs8 Text Generation • 266k • Updated 1 day ago • 20
shuoxing/llama3-8b-full-pretrain-junk-tweet-1m-en-reproduce Text Generation • 8B • Updated 2 days ago • 91
MLLM Reasoning, Rewarding, and Understanding Collection Papers on the reasoning, rewarding, and understanding of the MLLMs and LLMs • 28 items • Updated 7 days ago • 1
shuoxing/llama3-8b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs128 Text Generation • 266k • Updated 14 days ago • 37
shuoxing/llama3-8b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs128 Text Generation • 266k • Updated 14 days ago • 41
shuoxing/llama3-8b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs128 Text Generation • 266k • Updated 14 days ago • 39
shuoxing/llama3-8b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs128 Text Generation • 266k • Updated 14 days ago • 32