RelayLLM: Efficient Reasoning via Collaborative Decoding Paper • 2601.05167 • Published 2 days ago • 23
Jamba Reasoning 3B Collection AI21's top-performing reasoning model that packs leading scores on intelligence benchmarks and highly-efficient processing into a compact 3B build • 2 items • Updated Oct 8, 2025 • 6
Cerebras REAP Collection Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method • 21 items • Updated about 18 hours ago • 80
Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits Paper • 2512.20578 • Published 18 days ago • 68
Unsloth Dynamic 2.0 Quants Collection New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 67 items • Updated 11 days ago • 296
Article M2.1: Multilingual and Multi-Task Coding with Strong Generalization 6 days ago • 27
MiroThinker-v1.5 Collection MiroMind’s Flagship Search Agent Model • 4 items • Updated 4 days ago • 19
Article Tricks from OpenAI gpt-oss YOU 🫵 can use with transformers +5 Sep 11, 2025 • 177
DEER: Draft with Diffusion, Verify with Autoregressive Models Paper • 2512.15176 • Published 24 days ago • 42
From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence Paper • 2511.18538 • Published Nov 23, 2025 • 283
Kimi Linear: An Expressive, Efficient Attention Architecture Paper • 2510.26692 • Published Oct 30, 2025 • 119