Your Bench
Organization Card
YourBench is an open-source framework for generating zero-shot benchmarks from your own documents. It helps you test language models on custom domains using automated pipelines for ingestion, summarization, and question generation.
- 📚 Build benchmarks from PDFs, HTML, or text files
- 🧠 Generate both single-hop and multi-hop questions
- 🔍 Evaluate top models and deploy leaderboards instantly
- 🛠️ Fully configurable via a single YAML file
Built with 🤗 by the OpenEvals team — GitHub
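To make the single-hop versus multi-hop distinction concrete, here is a minimal sketch in plain Python. It is not YourBench's implementation: the names `chunk_text`, `build_tasks`, and `QuestionTask` are hypothetical, the word-count chunking is an assumption, and no model is called. It only shows how a single-hop question can be tied to one chunk of a document while a multi-hop question is tied to several chunks at once.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass
class QuestionTask:
    """Hypothetical container describing one question to be written."""
    kind: str          # "single-hop" or "multi-hop"
    chunk_ids: tuple   # indices of the chunks the question must draw on
    context: str       # text that would be handed to a question-writing model


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def build_tasks(chunks: list[str], hops: int = 2) -> list[QuestionTask]:
    """Create one single-hop task per chunk and one multi-hop task
    for every combination of `hops` distinct chunks."""
    tasks = [QuestionTask("single-hop", (i,), c) for i, c in enumerate(chunks)]
    for ids in combinations(range(len(chunks)), hops):
        context = "\n---\n".join(chunks[i] for i in ids)
        tasks.append(QuestionTask("multi-hop", ids, context))
    return tasks


# Placeholder document of 120 words, standing in for an ingested file.
document = " ".join(f"word{i}" for i in range(120))
chunks = chunk_text(document, max_words=50)
tasks = build_tasks(chunks, hops=2)

for task in tasks:
    print(task.kind, task.chunk_ids)
```

In a real pipeline the `context` field would be passed to a language model along with a question-writing prompt. Pairing every combination of chunks grows quickly with document length, so a practical system would likely sample combinations instead of enumerating them all.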
- yourbench/yourbench_reproduction_o4mini_biology
- yourbench/yourbench_reproduction_o4mini_business
- yourbench/yourbench_reproduction_o4mini_chemistry
- yourbench/yourbench_reproduction_o4mini_computerscience
spaces 7
- YourBench: Generate custom evaluations from your data easily!
- Essential Web Medical: Select and annotate high-quality web documents
- View Essentialweb Cleaned
- Reachy Trivia: Trivia Questions For The Reachy Mini and Reachy Team!
- Essential Web Annotation: Annotating Essential Web!
- Visualize Expert Level Filter: Browse and inspect classified documents from a dataset
models 0
None public yet
datasets 84
- yourbench/childrens_books_questions
- yourbench/mckinsey_great_trade_global_report
- yourbench/aws_bedrock_documentation_demo
- yourbench/yourbench-custom-prompts-example-gpt-4.1
- yourbench/yourbench-custom-prompts-example-oss-120b
- yourbench/yourbench-custom-prompts-example
- yourbench/yourbench-simple-example
- yourbench/mckinsey_state_of_ai_doc_understanding
- yourbench/highpass-medfilter-v2
- yourbench/highpassfilter-medical-documents-o4-mini