Urro
urroxyz
AI & ML interests
None yet
Recent Activity
commented on a paper about 9 hours ago
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
upvoted a paper about 9 hours ago
Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning
upvoted a paper about 9 hours ago
Where Did This Sentence Come From? Tracing Provenance in LLM Reasoning Distillation
Organizations
HUMAN-WRITTEN & LEGALLY-SOURCED
Datasets written by humans and/or reverse-engineered from text with deterministic algorithms. No illegal scraping or unethical synthesis.
ETHICALLY-DECENT & LEGALLY-ADJACENT
Depending on your definitions, these models may not be strictly "ethical" or "legal", yet they are 100% more ethical and legal than GPT or Claude.
- ibm-granite/granite-4.0-h-small
  Text Generation • 32B • Updated • 40.8k • 289
- ibm-granite/granite-3.3-8b-instruct
  Text Generation • 8B • Updated • 50.1k • 145
- ibm-granite/granite-3.0-8b-instruct
  Text Generation • 8B • Updated • 13.3k • 204
- alea-institute/kl3m-003-1.7b
  Text Generation • 2B • Updated • 50 • 3
ATTENTIVE ASR MODELS FOR ONNX
ONNX conversions of ASR models with attentions enabled for output. Especially useful for word-level timestamp extraction.
TINY MODELS WITH BIG INTELLIGENCE
- Nanbeige/Nanbeige4-3B-Thinking-2511
  Text Generation • 4B • Updated • 2.89k • 174
- ServiceNow-AI/Apriel-1.6-15b-Thinker
  Image-Text-to-Text • 15B • Updated • 2.9k • 265
- ByteDance/Ouro-1.4B-Thinking
  Text Generation • Updated • 1.67k • 27
- ByteDance/Ouro-2.6B-Thinking
  Text Generation • Updated • 154 • 71