phuongntc/qwen3_06b_grpo_noSFT_multievalsumviet2_nopenalty • Text Generation • Updated 18 days ago • 16
phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_fix1000 • Text Generation • 0.6B • Updated 21 days ago • 17
phuongntc/qwen3_0.6b_ppo_penalty_multievalsumviet2_final • Text Generation • 0.6B • Updated 22 days ago • 21