Evaluation tool to assess the cultural relevance of images for user-defined culture labels
Papers
Benchmark Test-Time Scaling of General LLM Agents
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
Paper • 2403.08715 • Published • 21
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Paper • 2310.11667 • Published • 4
cmu-lti/sotopia
Updated • 238 • 5
cmu-lti/sotopia-pi
Viewer • Updated • 33.4k • 371 • 8
datasets 12
cmu-lti/machine-translation-for-vision
Viewer • Updated • 696 • 128 • 1
cmu-lti/stateful
Viewer • Updated • 500 • 72
cmu-lti/caire-specific
Viewer • Updated • 68 • 13
cmu-lti/interactive-swe
Viewer • Updated • 500 • 58
cmu-lti/caire-universal
Viewer • Updated • 400 • 12
cmu-lti/caire-index-ckpts
Updated • 7
cmu-lti/AI-LieDar
Updated • 9
cmu-lti/agents_vs_script
Viewer • Updated • 20.3k • 22 • 3
cmu-lti/sotopia
Updated • 238 • 5
cmu-lti/sotopia-pi
Viewer • Updated • 33.4k • 371 • 8