Ai2 Open Coding Agents - Django, Sphinx, Sympy Data
AI & ML interests
Building breakthrough AI to solve the world's biggest problems.
Papers
TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics
How2Everything: Mining the Web for How-To Procedures to Evaluate and Improve LLMs
Organization Card
spaces 13
AstaBench Leaderboard 🥇 • View benchmark leaderboards • pinned • Running • 20
Reward Bench Leaderboard 📐 • Explore RewardBench model rankings and scores • pinned • Running • 422
HREF Leaderboard 📐 • Browse and search HREF leaderboard data • pinned • Running • 2
Zebra Logic Bench 🦓 • Show leaderboard and explore model puzzle results • pinned • Running • 91
SUPER Leaderboard 🤖 • Display a static leaderboard from a JSON file • pinned • Running • 3
ZeroEval Leaderboard 📊 • Embed ZeroEval for evaluation • pinned • Running • 53
models 852
allenai/olmo-3.2-tokenizer-think-dev • Updated • 3
allenai/olmo-3-tokenizer-instruct-dev • Updated • 1
allenai/Olmo-3-1025-7B • Text Generation • 7B • Updated • 60.1k • 54
allenai/Flex-pes2o-2x7B-1T • Text Generation • 12B • Updated • 173 • 2
allenai/Flex-news-2x7B-1T • Text Generation • 12B • Updated • 167 • 2
allenai/Flex-creative-2x7B-1T • Text Generation • 12B • Updated • 278 • 5
allenai/Flex-public-7B-1T • Text Generation • 7B • Updated • 269 • 5
allenai/Flex-code-2x7B-1T • Text Generation • 12B • Updated • 394 • 2
allenai/Flex-math-2x7B-1T • Text Generation • 12B • Updated • 368 • 3
allenai/olmo-3-tokenizer-instruct-release • Updated • 1
datasets 370
allenai/SimpleToM • Viewer • Updated • 4.59k • 143 • 10
allenai/asta-user-interactions • Viewer • Updated • 14.1M • 1 • 1
allenai/dolma3_pool_staging • Viewer • Updated • 1 • 14 • 1
allenai/prescience • Viewer • Updated • 839k • 53 • 9
allenai/dolma3_pool • Preview • Updated • 128k • 32
allenai/dolma3_longmino_mix-100B-1125 • Preview • Updated • 17.6k • 12
allenai/dolma3_dolmino_mix-100B-1125 • Preview • Updated • 227k • 19
allenai/asta-summary-citation-counts • Viewer • Updated • 47M • 398 • 8
allenai/olmix • Preview • Updated • 282 • 38
allenai/Dolci-Instruct-DPO • Viewer • Updated • 260k • 1.54k • 7