jiaxin-ai(Sii)

julyai

AI & ML interests

None yet

Recent Activity

upvoted a paper 6 days ago

The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios

Paper • 2601.08173 • Published 9 days ago • 7

upvoted a paper 7 days ago

ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

Paper • 2601.06487 • Published 11 days ago • 48

updated 2 datasets 3 months ago

julyai/ProJudge-173k

Viewer • Updated Oct 21, 2025 • 173k • 81

julyai/ProJudgeBench

Viewer • Updated Oct 21, 2025 • 2.4k • 7

upvoted a paper 7 months ago

Sekai: A Video Dataset towards World Exploration

Paper • 2506.15675 • Published Jun 18, 2025 • 66

upvoted 2 papers 10 months ago

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Paper • 2503.06553 • Published Mar 9, 2025 • 7

ARMOR v0.1: Empowering Autoregressive Multimodal Understanding Model with Interleaved Multimodal Generation via Asymmetric Synergy

Paper • 2503.06542 • Published Mar 9, 2025 • 7

upvoted a paper 11 months ago

MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning

Paper • 2503.07365 • Published Mar 10, 2025 • 61

published 2 datasets 11 months ago

julyai/ProJudge-173k

Viewer • Updated Oct 21, 2025 • 173k • 81

julyai/ProJudgeBench

Viewer • Updated Oct 21, 2025 • 2.4k • 7

updated a collection about 1 year ago

critique

Collection

9 items • Updated Dec 14, 2024

commented 2 papers about 1 year ago

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Paper • 2411.00836 • Published Oct 29, 2024 • 15

