mytestdpo (mytestdpo)

HanningZhang

authored a paper 8 months ago

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5, 2025 • 25

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_olympiadbench

Viewer • Updated Mar 19, 2025 • 675 • 6

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_olympiadbench

Viewer • Updated Mar 19, 2025 • 675 • 6

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_minerva_math

Viewer • Updated Mar 19, 2025 • 272 • 4

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_minerva_math

Viewer • Updated Mar 19, 2025 • 272 • 4

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_amc23

Viewer • Updated Mar 19, 2025 • 40 • 8

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_amc23

Viewer • Updated Mar 19, 2025 • 40 • 8

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_math500

Viewer • Updated Mar 19, 2025 • 500 • 5

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_math500

Viewer • Updated Mar 19, 2025 • 500 • 5

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_aime24

Viewer • Updated Mar 19, 2025 • 30 • 5

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step160_aime24

Viewer • Updated Mar 19, 2025 • 30 • 5

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_olympiadbench

Viewer • Updated Mar 19, 2025 • 675 • 7

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_olympiadbench

Viewer • Updated Mar 19, 2025 • 675 • 7

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_minerva_math

Viewer • Updated Mar 19, 2025 • 272 • 5

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_minerva_math

Viewer • Updated Mar 19, 2025 • 272 • 5

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_amc23

Viewer • Updated Mar 19, 2025 • 40 • 5

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_amc23

Viewer • Updated Mar 19, 2025 • 40 • 5

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_math500

Viewer • Updated Mar 19, 2025 • 500 • 5

1231czx

published a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_math500

Viewer • Updated Mar 19, 2025 • 500 • 5

1231czx

updated a dataset 10 months ago

mytestdpo/qwmathbase_raw_raft_step40_aime24

Viewer • Updated Mar 19, 2025 • 30 • 6

Team members 4
