nyuuzyou
286 followers · 33 following
https://ducks.party/donate
Recent Activity
posted an update 1 day ago
🌐 NNTP Discussion Archives - 387M Messages from Public Newsgroups
https://huggingface.co/datasets/nyuuzyou/nntp-text-387m

Here's something different from the code datasets: 20+ years of public discussion archives from NNTP newsgroups. Clean Parquet format, but this time it's conversations instead of code.

Key Stats:
- 386,629,949 messages from 159,345 newsgroups
- 191 GB compressed Parquet storage
- Spans 2002-2026
- Multilingual: English, German, French, Italian, Dutch, Polish, Russian, and others
- Email addresses redacted for privacy

The data is messy in the way real discussions are messy. Spam wasn't filtered out - you get the advertisements, the arguments, the off-topic threads, all of it. If you want sanitized text, this isn't it. If you want to see how people actually talked online before Discord and Reddit took over, here you go.

Processing kept it simple: convert everything to UTF-8, remove exact duplicates, strip binary attachments, redact emails. Legacy character encodings were a nightmare - had to handle Windows-1252, ISO-8859 variants, KOI8-R, Shift-JIS, GBK, and others just to get readable text.

At least it was fun to do, and I think the result turned out pretty well. I hope someone else will also be able to have fun or gain something useful from this project.
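For readers curious what the processing steps in the post might look like in practice, here is a minimal Python sketch of three of them: decoding legacy encodings to text, redacting email addresses, and dropping exact duplicates. It is not the author's actual pipeline. The function names, the fallback order of encodings, the email pattern, and the `<EMAIL>` placeholder are all assumptions made for illustration, and attachment stripping is left out.

```python
import hashlib
import re
from typing import Iterable, Iterator, Optional

# Assumed fallback order; the post lists these encodings but not how they were tried.
CANDIDATE_ENCODINGS = ["utf-8", "shift_jis", "gbk", "koi8-r", "cp1252"]

# Deliberately simple pattern (an assumption); real redaction may be stricter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def to_text(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode raw bytes, trying the declared charset first, then fallbacks."""
    candidates = ([declared] if declared else []) + CANDIDATE_ENCODINGS
    for enc in candidates:
        try:
            return raw.decode(enc)
        except (UnicodeDecodeError, LookupError):
            # Wrong guess or unknown charset name: try the next candidate.
            continue
    # Last resort: latin-1 maps every byte to a character, so it cannot fail.
    return raw.decode("latin-1")


def redact_emails(text: str, placeholder: str = "<EMAIL>") -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_RE.sub(placeholder, text)


def dedupe_exact(messages: Iterable[str]) -> Iterator[str]:
    """Yield each distinct message once, keeping first-seen order."""
    seen = set()
    for msg in messages:
        digest = hashlib.sha256(msg.encode("utf-8")).digest()
        if digest not in seen:
            seen.add(digest)
            yield msg
```

Storing a hash per message instead of the message itself keeps memory use bounded per item, which matters at the scale of hundreds of millions of messages, although a real run would likely need an on-disk or sharded set rather than an in-memory one.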
new activity 3 days ago: nyuuzyou/nntp-text-387m: [bot] Conversion to Parquet
updated a dataset 3 days ago: nyuuzyou/nntp-text-387m
nyuuzyou's models (23)
nyuuzyou/TowerVision-9B-GGUF • Image-Text-to-Text • 9B • Updated 20 days ago • 871
nyuuzyou/TowerVision-2B-GGUF • Image-Text-to-Text • 3B • Updated Dec 2, 2025 • 431
nyuuzyou/EuroVLM-9B-Preview-GGUF • 9B • Updated Dec 2, 2025 • 78 • 1
nyuuzyou/EuroMoE-2.6B-A0.6B-Instruct-Preview-GGUF • 3B • Updated Dec 2, 2025 • 283 • 3
nyuuzyou/EuroLLM-22B-Preview-GGUF • 23B • Updated Dec 2, 2025 • 119
nyuuzyou/EuroLLM-22B-Instruct-Preview-GGUF • 23B • Updated Dec 2, 2025 • 29
nyuuzyou/EuroMoE-2.6B-A0.6B-Preview-GGUF • 3B • Updated Dec 2, 2025 • 111 • 1
nyuuzyou/Dhanishtha-2.0-preview-0725-GGUF • 15B • Updated Dec 2, 2025 • 57
nyuuzyou/EuroVLM-1.7B-Preview-GGUF • 2B • Updated Dec 2, 2025 • 63
nyuuzyou/SmolLM2-1.7B-Eagle-GGUF • Text Generation • 2B • Updated Dec 2, 2025 • 36
nyuuzyou/SmolLM2-360M-Eagle-GGUF • Text Generation • 0.4B • Updated Dec 2, 2025 • 23
nyuuzyou/SmolLM2-135M-Eagle-GGUF • Text Generation • 0.1B • Updated Dec 2, 2025 • 122 • 1
nyuuzyou/Orpheus-3B-ASMR • Text-to-Speech • 3B • Updated May 26, 2025 • 1 • 2
nyuuzyou/Orpheus-3B-ASMR-LoRA • Text-to-Speech • Updated May 26, 2025
nyuuzyou/AircraftFLUX-LoRA • Text-to-Image • Updated May 26, 2025 • 1 • 4
nyuuzyou/Planespotting-YOLO11 • Object Detection • Updated May 17, 2025 • 4 • 1
nyuuzyou/Qwen2.5-0.5B-Bluesky-Instruct • Text Generation • 0.5B • Updated Apr 28, 2025 • 14 • 3
nyuuzyou/Qwen2.5-0.5B-Bluesky • Text Generation • 0.5B • Updated Apr 27, 2025 • 4
nyuuzyou/SmolLM2-1.7B-Eagle • Text Generation • 2B • Updated Apr 18, 2025 • 1
nyuuzyou/SmolLM2-360M-Eagle • Text Generation • 0.4B • Updated Apr 18, 2025 • 2
nyuuzyou/SmolLM2-135M-Eagle • Text Generation • 0.1B • Updated Apr 18, 2025 • 1 • 3
nyuuzyou/stickers • Image Classification • Updated Aug 20, 2023 • 4
nyuuzyou/AnimeHeads • Object Detection • Updated Apr 16, 2023 • 9