HelpingAI

company

Verified

https://helpingai.co/

helping_ai

HelpingAI

helping-ai

AI & ML interests

Helping AI to become AGI

KingNish

posted an update 28 days ago

Muon vs MuonClip vs Muon+AdamW

Muon has gone from an experiment to a mainstream optimizer, but does it hold up for fine‑tuning? We ran head‑to‑head tests on Qwen3‑4B (10k+ high‑quality instruction rows) to find out.

Short story: pure Muon converged fastest at the start, but its gradient-norm spikes made training unstable. MuonClip (Kimi K2's clipping) stabilizes long pretraining runs, yet in our small-scale fine-tune it underperformed, with lower token accuracy and slower convergence. The winner was the hybrid: Muon for 2D parameters (weight matrices) and AdamW for 1D parameters (biases, norm gains). It delivered the best balance of stability and final performance, and it even beat vanilla AdamW.
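
A minimal sketch of the routing rule behind the hybrid setup, assuming parameters are split purely by tensor rank. The function name `split_for_hybrid`, the `force_adamw` override, and the toy shapes are hypothetical and not taken from the blog post; the override reflects a convention seen in some Muon setups of keeping embeddings on AdamW, which the post does not confirm either way. The sketch only decides which optimizer each parameter would go to; it does not implement Muon itself.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def split_for_hybrid(
    shapes: Dict[str, Shape],
    force_adamw: Tuple[str, ...] = (),
) -> Tuple[List[str], List[str]]:
    """Route each parameter name to the Muon or AdamW group by tensor rank."""
    muon: List[str] = []
    adamw: List[str] = []
    for name, shape in shapes.items():
        # Hypothetical override: name substrings that always go to AdamW.
        forced = any(tag in name for tag in force_adamw)
        if len(shape) == 2 and not forced:
            muon.append(name)   # 2D weight matrices
        else:
            adamw.append(name)  # everything else, plus any overrides
    return muon, adamw


# Illustrative shapes only; these are not the real Qwen3-4B dimensions.
toy_shapes = {
    "embed_tokens.weight": (1000, 64),
    "layers.0.attn.q_proj.weight": (64, 64),
    "layers.0.attn.q_proj.bias": (64,),
    "layers.0.mlp.up_proj.weight": (256, 64),
    "layers.0.input_norm.weight": (64,),
}

muon_names, adamw_names = split_for_hybrid(toy_shapes, force_adamw=("embed",))
print("Muon :", muon_names)
print("AdamW:", adamw_names)
```

In a real training script, the two name lists would be used to build two optimizers (or two parameter groups), one per update rule, and both would be stepped on every iteration.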

Takeaway: for small-scale fine-tuning, hybrid = practical and reliable.

Next step: scale to larger models and datasets to see whether Muon's spikes become catastrophic or whether clipping wins out.

Full Blog Link: https://huggingface.co/blog/KingNish/optimizer-part1

KingNish

posted an update about 1 month ago

I tested Muon vs MuonClip vs Muon+AdamW for fine-tuning LLMs.
I just published a blog post on it. Read it here 👉 https://huggingface.co/blog/KingNish/optimizer-part1

KingNish

updated a dataset about 1 month ago

HelpingAI/Dhanishtha-2.0-SUPERTHINKER

Viewer • Updated Dec 7, 2025 • 11.7k • 148 • 24

Abhaykoul

updated a dataset about 1 month ago

HelpingAI/KS-WIKI

Viewer • Updated Dec 4, 2025 • 1.24M • 12 • 1

Abhaykoul

published a dataset about 1 month ago

HelpingAI/KS-WIKI

Viewer • Updated Dec 4, 2025 • 1.24M • 12 • 1

KingNish

updated a Space 2 months ago

README

👀

KingNish

published a Space 2 months ago

README

👀

Abhaykoul

updated 3 collections 4 months ago

Abhaykoul

updated 2 datasets 4 months ago

HelpingAI/Dhanishtha-2.0-SUPERTHINKER

Viewer • Updated Dec 7, 2025 • 11.7k • 148 • 24

HelpingAI/Intermediate-Thinking-130k

Viewer • Updated Sep 22, 2025 • 135k • 46 • 46

Abhaykoul

updated a model 4 months ago

HelpingAI/hai3.1-checkpoint-0002

Text Generation • 16B • Updated Sep 15, 2025 • 27 • 8

Abhaykoul

posted an update 4 months ago

🚀 Ever dreamed of training your own Large Language Model from scratch? What if I told you it doesn't require a supercomputer or a PhD in ML? 🤯

Introducing LLM Trainer - the educational framework that makes LLM training accessible to EVERYONE! Whether you're on a CPU-only laptop or scaling to distributed GPUs, we've got you covered. 💻➡️🖥️

Why LLM Trainer? Because existing tools are either too simplistic (hiding the magic) or too complex (requiring expert knowledge). We bridge the gap with:

🎓 Educational transparency - every component built from scratch with clear code
💻 CPU-first approach - start training immediately, no GPU needed
🔧 Full customization - modify anything you want
📈 Seamless scaling - from laptop to cluster without code changes
🤝 HuggingFace integration - works with existing models & tokenizers

Key highlights:
✅ Built-in tokenizers (BPE, WordPiece, HF wrappers)
✅ Complete Transformer implementation from scratch
✅ Optimized for CPU training
✅ Advanced features: mixed precision, gradient checkpointing, multiple generation strategies
✅ Comprehensive monitoring & metrics

Perfect for:
- Students learning transformers
- Researchers prototyping new ideas
- Developers building domain-specific models

Ready to train your first LLM? It's easier than you think!

🔗 Check it out: https://github.com/HelpingAI/llm-trainer
📚 Docs: Getting Started Guide
💬 Join the community: GitHub Discussions

#AI #MachineLearning #LLM #DeepLearning #OpenSource #Python #HuggingFace #NLP

Special thanks to the HuggingFace and PyTorch teams for the amazing ecosystem! 🙏