aidando73/simplerl-v8-checkpoints
Updated
aidando73/simplerl-Qwen2.5-Math-7B-v5-checkpoint40
aidando73/simplerl-v5-checkpoints
Updated
aidando73/simplerl-v6-checkpoints
Updated
aidando73/simplerl-v4-checkpoints
Updated
aidando73/simplerl-single-grpo-v1-checkpoints
Updated
aidando73/Qwen-2.5-7B-Simple-RL-v9
Text Generation
• 8B • Updated
• 1
aidando73/Qwen-2.5-7B-Simple-RL-v8
Text Generation
• 8B • Updated
• 1
aidando73/Qwen-2.5-7B-Simple-RL-v7
Text Generation
• 8B • Updated
• 3
aidando73/Qwen-2.5-7B-Simple-RL-v6
Text Generation
• 8B • Updated
• 1
aidando73/Qwen-2.5-7B-Simple-RL-v5
Text Generation
• 8B • Updated
• 5
aidando73/Qwen-2.5-7B-Simple-RL-v4
Updated
aidando73/Qwen-2.5-7B-Simple-RL-v3
Text Generation
• 8B • Updated
• 1
aidando73/Qwen-2.5-7B-Simple-RL-v2
Text Generation
• 8B • Updated
• 1
aidando73/Qwen-2.5-7B-Simple-RL-v1
Text Generation
• 8B • Updated
aidando73/grpo-big-math-rl-v2
Updated
aidando73/llama-3.1-8b-grpo-big-math-rl-v3-checkpoints
Updated
aidando73/llama-3.1-8b-grpo-big-math-rl-v2-checkpoints
Updated
aidando73/Qwen2-0.5B-GRPO-summarize-2025-03-17-20750
Text Generation
• 0.5B • Updated
• 5
aidando73/Qwen2-0.5B-summarize-SFT-2025-03-17-43773
Text Generation
• 0.5B • Updated
aidando73/Qwen2-0.5B-GRPO-20750
Text Generation
• 0.5B • Updated
• 2
aidando73/llama-3.1-8b-grpo-checkpoints
Updated
aidando73/Qwen2-0.5B-summarize-SFT-2025-03-17
Updated
aidando73/llama-3.1-8b-grpo-33000-merged
Text Generation
• 8B • Updated
• 1
aidando73/Qwen2-0.5B-GRPO-8250
Text Generation
• 0.5B • Updated
• 5
aidando73/llama-3.1-8b-grpo-19500-merged
Text Generation
• 8B • Updated
aidando73/llama-3.1-8b-grpo-10500-merged
Text Generation
• 8B • Updated
aidando73/llama-3.1-8b-4bit-merged
Text Generation
• 8B • Updated
• 1
aidando73/qwen2.5-3b-4bit-merged
Text Generation
• 3B • Updated
• 1
aidando73/llama-3.1-8b-grpo-4bit-merged
Text Generation
• 8B • Updated
• 1