In a Training Loop 🔄
8
15
23
Khoi Truong
Kiy-K
5 followers
·
38 following
https://kiy-k.github.io/My-Portfolio-K/
Kiy-K
AI & ML interests
Building Computer Use for AI agents
Recent Activity
reacted to nyuuzyou's post with 🔥 1 day ago
🌐 NNTP Discussion Archives - 387M Messages from Public Newsgroups - https://huggingface.co/datasets/nyuuzyou/nntp-text-387m
Here's something different from the code datasets: 20+ years of public discussion archives from NNTP newsgroups. Clean Parquet format, but this time it's conversations instead of code.
Key Stats:
- 386,629,949 messages from 159,345 newsgroups
- 191 GB compressed Parquet storage
- Spans 2002-2026
- Multilingual: English, German, French, Italian, Dutch, Polish, Russian, and others
- Email addresses redacted for privacy
The data is messy in the way real discussions are messy. Spam wasn't filtered out - you get the advertisements, the arguments, the off-topic threads, all of it. If you want sanitized text, this isn't it. If you want to see how people actually talked online before Discord and Reddit took over, here you go.
Processing kept it simple: convert everything to UTF-8, remove exact duplicates, strip binary attachments, redact emails. Legacy character encodings were a nightmare - had to handle Windows-1252, ISO-8859 variants, KOI8-R, Shift-JIS, GBK, and others just to get readable text. At least it was fun to do, and I think the result turned out pretty well. I hope someone else will also be able to have fun or gain something useful from this project.
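The processing steps named in the post (convert to UTF-8, remove exact duplicates, redact emails) could look roughly like the sketch below. This is not the dataset author's code: the fallback encoding order, the email pattern, the `<email>` placeholder, and the function names `decode_message`, `redact_emails`, and `dedupe_exact` are all assumptions made for illustration, and stripping binary attachments is left out.

```python
import hashlib
import re
from typing import Iterable, Iterator, Optional

# Assumed fallback order. The post lists these encodings but does not say
# how the right one was picked for each message.
FALLBACK_ENCODINGS = ("utf-8", "cp1252", "koi8_r", "shift_jis", "gbk")

# Deliberately simple email pattern (an assumption, not the author's rule).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def decode_message(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw bytes, trying the declared charset first, then fallbacks."""
    candidates = ([declared] if declared else []) + list(FALLBACK_ENCODINGS)
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (UnicodeDecodeError, LookupError):
            # Wrong or unknown charset: try the next candidate.
            continue
    # Last resort: keep the message, marking undecodable bytes with U+FFFD.
    return raw.decode("utf-8", errors="replace")


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub("<email>", text)


def dedupe_exact(messages: Iterable[str]) -> Iterator[str]:
    """Yield each distinct message once, keeping first-seen order."""
    seen = set()
    for text in messages:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        if digest not in seen:
            seen.add(digest)
            yield text


if __name__ == "__main__":
    raw = "Привет, pishi na ivan@example.org".encode("koi8_r")
    print(redact_emails(decode_message(raw, declared="koi8-r")))
```

Storing a hash instead of the full text keeps the memory cost per message fixed, which matters at the scale of hundreds of millions of messages. Note that a permissive single-byte codec such as `cp1252` accepts almost any byte sequence, so the fallback list only helps when the declared charset is missing or wrong, and it can still produce mojibake.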
upvoted a collection 10 days ago: TranslateGemma
liked a Space 10 days ago: google/ehr-navigator-agent-with-medgemma
Organizations
Kiy-K's datasets (74)
Kiy-K/pretraining-corpus • Viewer • Updated Dec 5, 2025 • 41.2k • 18 • 2
Kiy-K/smoltrace-leaderboard • Viewer • Updated Nov 25, 2025 • 3 • 27
Kiy-K/smoltrace-metrics-20251125_134447 • Viewer • Updated Nov 25, 2025 • 212 • 21
Kiy-K/smoltrace-traces-20251125_134447 • Viewer • Updated Nov 25, 2025 • 15 • 6
Kiy-K/smoltrace-results-20251125_134447 • Viewer • Updated Nov 25, 2025 • 15 • 24
Kiy-K/smoltrace-metrics-20251125_134933 • Viewer • Updated Nov 25, 2025 • 68 • 3
Kiy-K/smoltrace-traces-20251125_134933 • Viewer • Updated Nov 25, 2025 • 15 • 3
Kiy-K/smoltrace-results-20251125_134933 • Viewer • Updated Nov 25, 2025 • 15 • 17
Kiy-K/smoltrace-metrics-20251125_131628 • Viewer • Updated Nov 25, 2025 • 248 • 12
Kiy-K/smoltrace-traces-20251125_131628 • Viewer • Updated Nov 25, 2025 • 15 • 3
Kiy-K/smoltrace-results-20251125_131628 • Viewer • Updated Nov 25, 2025 • 15 • 4
Kiy-K/smoltrace-metrics-20251125_131633 • Viewer • Updated Nov 25, 2025 • 40 • 14
Kiy-K/smoltrace-traces-20251125_131633 • Viewer • Updated Nov 25, 2025 • 15 • 3
Kiy-K/smoltrace-results-20251125_131633 • Viewer • Updated Nov 25, 2025 • 15 • 6
Kiy-K/fyodor-data • Viewer • Updated Nov 24, 2025 • 1.39k • 2
Kiy-K/smoltrace-metrics-20251124_004104 • Viewer • Updated Nov 24, 2025 • 40 • 2
Kiy-K/smoltrace-traces-20251124_004104 • Viewer • Updated Nov 24, 2025 • 15 • 5
Kiy-K/smoltrace-results-20251124_004104 • Viewer • Updated Nov 24, 2025 • 15 • 2
Kiy-K/smoltrace-metrics-20251123_054612 • Viewer • Updated Nov 23, 2025 • 272 • 2
Kiy-K/smoltrace-traces-20251123_054612 • Viewer • Updated Nov 23, 2025 • 15 • 2
Kiy-K/smoltrace-results-20251123_054612 • Viewer • Updated Nov 23, 2025 • 15 • 2
Kiy-K/smoltrace-metrics-20251123_051709 • Viewer • Updated Nov 23, 2025 • 306 • 3
Kiy-K/smoltrace-traces-20251123_051709 • Viewer • Updated Nov 23, 2025 • 15 • 2
Kiy-K/smoltrace-results-20251123_051709 • Viewer • Updated Nov 23, 2025 • 15 • 1
Kiy-K/smoltrace-metrics-20251123_051002 • Viewer • Updated Nov 23, 2025 • 162 • 2
Kiy-K/smoltrace-traces-20251123_051002 • Viewer • Updated Nov 23, 2025 • 15 • 4
Kiy-K/smoltrace-results-20251123_051002 • Viewer • Updated Nov 23, 2025 • 15 • 1
Kiy-K/code-snippets • Viewer • Updated Nov 13, 2025 • 45 • 3
Kiy-K/codebench-verified • Viewer • Updated Nov 6, 2025 • 35.3k • 5
Kiy-K/fyodor-personality-PRO • Viewer • Updated Nov 4, 2025 • 25k • 28