Yinxu Pan

cppowboy

https://github.com/Cppowboy

AI & ML interests

RL for LLM, Code&Math Reasoning, Function Calling, Code Interpreter, Vision-Language Pretraining

Recent Activity

commented on a paper about 2 hours ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

upvoted a paper about 2 hours ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

liked a dataset about 4 hours ago

inclusionAI/AReaL-tau2-data

commented on a paper about 2 hours ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published 20 days ago • 216

New activity in Qwen/Qwen3.5-9B about 5 hours ago

No tool call return when calling qwen3.5 9b

#8 opened about 5 hours ago by

New activity in ernie-research/MEnvData-SWE 6 days ago

Are these Docker images publicly available on Docker Hub?

#1 opened 6 days ago by

New activity in zai-org/GLM-4.7-Flash about 1 month ago

glm4-moe-lite is not supported

#25 opened about 1 month ago by

New activity in IQuestLab/IQuest-Coder-V1-40B-Instruct about 2 months ago

Wasn't the SWE score above 80? Why has it dropped to 76?

#6 opened about 2 months ago by

New activity in nebius/SWE-rebench 4 months ago

How can I find all instance_ids that come with a Docker image?

#10 opened 4 months ago by

New activity in hkust-nlp/WebExplorer-QA 6 months ago

Will the full train dataset be open sourced in the future?

#2 opened 6 months ago by

New activity in r2e-edits/SweSmith-RL-Dataset 6 months ago

Are these docker images publicly available?

#2 opened 6 months ago by

New activity in SWE-bench/SWE-smith 6 months ago

Hello, why are the FAIL_TO_PASS files missing from the image?

#6 opened 7 months ago by

New activity in nebius/SWE-rebench 6 months ago

Could this dataset be repurposed for LLM training?

#7 opened 6 months ago by

New activity in deepseek-ai/DeepSeek-R1-0528-Qwen3-8B 9 months ago

Failed to reproduce evaluation result on AIME24

#18 opened 9 months ago by

commented on a paper 9 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

New activity in nvidia/OpenCodeReasoning-2 10 months ago

question is empty

#1 opened 10 months ago by

New activity in virtuoussy/Qwen2.5-7B-Instruct-RLVR 10 months ago

badcase

#4 opened 10 months ago by

commented on a paper over 1 year ago

Patience Is The Key to Large Language Model Reasoning

Paper • 2411.13082 • Published Nov 20, 2024 • 7

New activity in openbmb/MiniCPM3-4B over 1 year ago

remove minicpm tokenizer

#29 opened over 1 year ago by

Fix function calling parameter string quoting

#24 opened over 1 year ago by

Add full tool calling support to chat template

#20 opened over 1 year ago by

function calling dataset

#6 opened over 1 year ago by

New activity in apple/DCLM-7B over 1 year ago

Unable to load model “apple/DCLM-7B” - KeyError: ‘openlm’

#7 opened over 1 year ago by