AI & ML interests
None defined yet.
Models 29
AIPlans/Qwen3-0.6B-ReMax
Reinforcement Learning • 0.6B • Updated • 6 • 2
AIPlans/Qwen3-0.6B-GRPO-RM_NVIDIA
Text Generation • 0.6B • Updated • 8
AIPlans/Qwen3-0.6B-GRPO_Epoch2
Text Generation • 0.6B • Updated • 2
AIPlans/Qwen3-0.6B-GRPO_Epoch1
Text Generation • 0.6B • Updated • 3
AIPlans/Qwen3-0.6B-GRPO
Updated
AIPlans/Qwen3-0.6B-IPO
Reinforcement Learning • 0.6B • Updated • 34 • 1
AIPlans/qwen3-0.6b-base-PPO-hs2
Updated
AIPlans/Qwen3-0.6B-DPO_Epoch_1
Text Generation • 0.6B • Updated • 3
AIPlans/Qwen3-0.6B-PPO
Updated
AIPlans/Qwen3-0.6B-PPO1
Updated
Datasets 17
AIPlans/Helpsteer2-helpfulness-prompts
Viewer • Updated • 7.22k • 26
AIPlans/helpsteer2-helpfulness-preference-cleaned
Viewer • Updated • 6.99k • 22
AIPlans/trackio-experiments
Updated • 5
AIPlans/ultrafeedback_binarized_chinese
Viewer • Updated • 14k • 19
AIPlans/ultrafeedback_binarized
Viewer • Updated • 14k • 20
AIPlans/FilteredPKU-SafeRLHF_chinese
Viewer • Updated • 12k • 11
AIPlans/FilteredPKU-SafeRLHF
Viewer • Updated • 12k • 9
AIPlans/SafetyBench_WithLabels_Better_chinese
Viewer • Updated • 546 • 88
AIPlans/SafetyBench_WithLabels
Viewer • Updated • 546 • 93
AIPlans/ToxiGen_chinese
Viewer • Updated • 1k • 91