Usefulness Judge
Fine-tuned judges to evaluate how useful a response is to a prompt
miulab/Qwen3-1.7B-Usefulness • Text Generation • 2B • Updated Dec 15, 2025 • 154 • 1
miulab/Qwen3-4B-Usefulness • Text Generation • 4B • Updated Dec 15, 2025 • 100 • 1
miulab/Qwen3-8B-Usefulness • Text Generation • 8B • Updated 2 days ago • 17

DogeRM
Models trained/used in the paper "DogeRM: Equipping Reward Models with Domain Knowledge through Model Merging" (https://arxiv.org/abs/2407.01470)
miulab/llama2-7b-oss-instruct • Text Generation • 7B • Updated Oct 3, 2024 • 3
miulab/llama2-7b-alpaca-sft-10k • Text Generation • 7B • Updated Oct 3, 2024 • 37
miulab/llama2-7b-magicoder-evol-instruct • Text Generation • 7B • Updated Oct 3, 2024 • 1
miulab/llama2-7b-ultrafeedback-rm • Text Classification • 7B • Updated Oct 3, 2024 • 4 • 1