Recent Activity
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-8192-rtl-cliphigh-hf-1.5B-2_deepscaler_-390
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-l4096-cliphigh-hf-1.5B-4_deepscaler_-220 • 2B • 4
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-cliphigh-hf-1.5B-4_deepscaler_-390 • 2B • 5
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-2048-rtl-cliphigh-hf-1.5B-4_deepscaler_-340 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-4096-rtl-cliphigh-hf-1.5B-4_deepscaler_-140 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-8192-rtl-cliphigh-hf-1.5B-4_deepscaler_-390 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-l4096-cliphigh-hf-1.5B-4_deepscaler_-320 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-cliphigh-hf-1.5B-4_deepscaler_-460 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-l1024-cliphigh-hf-1.5B-4_deepscaler_-430 • 2B • 7
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-l4096-cliphigh-hf-1.5B-4_deepscaler_-220 • 2B • 5
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-cliphigh-hf-1.5B-4_deepscaler_-390 • 2B • 7
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-l1024-cliphigh-hf-1.5B-4_deepscaler_-460 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-rtl-dynamic-m-e-l4096-cliphigh-hf-32B-8_deepscaler_-40 • 33B • 3
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-norm-length-0.05-hf-1.5B-2_deepscaler_-370 • 2B • 5
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-norm-length-0.1-hf-1.5B-2_deepscaler_-480 • 2B • 3
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-2048-rtl-cliphigh-hf-1.5B-4_deepscaler_
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-8192-rtl-cliphigh-hf-1.5B-4_deepscaler_
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-8192-clip-cliphigh-hf-1.5B-2_deepscaler_
RL4Reasoning/test_max_steps
RL4Reasoning/test_max_steps-20
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-2048-clip-cliphigh-hf-1.5B-4_deepscaler_-590 • 2B • 4
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-4096-clip-cliphigh-hf-1.5B-4_deepscaler_-280 • 2B • 4
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-2048-clip-cliphigh-hf-1.5B-2_deepscaler_-590
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-norm-length-0.2-hf-1.5B-2_deepscaler_-380 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-norm-length-0.4-hf-1.5B-2_deepscaler_-280 • 2B • 7
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-8192-clip-cliphigh-hf-1.5B-2_deepscaler_-490 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-4096-clip-cliphigh-hf-1.5B-2_deepscaler_-280
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-8192-clip-cliphigh-hf-1.5B-2_deepscaler_-190 • 2B • 5
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-2048-rtl-cliphigh-hf-1.5B-4_deepscaler_-300 • 2B • 6
RL4Reasoning/verl-grpo-lr-deepscaler-bsz128-16384-4096-rtl-cliphigh-hf-1.5B-4_deepscaler_-420 • 2B • 6