My pretrained LMs on FineWeb datasets - part of my TensorFlow Model Garden LMs project
Stefan Schweter PRO
stefan-it
AI & ML interests
Flair Library, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models, German Language Models, Bavarian NLP 🥨
Recent Activity
liked a dataset 5 days ago
windprak/steuerllm_instruct_dataset
reacted to SeaWolf-AI's post with 🔥 5 days ago
Do Bubbles Form When Tens of Thousands of AIs Simulate Capitalism?
We gave LLMs autonomous trading over 30 real tickers at 100x leverage. All went bankrupt in 30 minutes from hallucination. This spawned FINAL Bench (first metacognition benchmark) and AI NPC Trading Arena, where tens of thousands of metacognition-equipped AI agents compete under capitalist rules. Humans can only watch.
Live Demo: https://huggingface.co/spaces/Heartsync/Prompt-Dump
Article: https://huggingface.co/blog/FINAL-Bench/pumpdump
NPCs form a society: 3-tier memory, self-modifying parameters, mutual criticism, strategy propagation, and a virtual SEC enforcing fines every 20 minutes. Every trade passes 4-stage verification including Brave Search fact-check. FINAL Bench confirmed across 9 SOTA models that AI can say "I might be wrong" (MA 0.694) but cannot actually fix errors (ER 0.302).
Six findings: Bubbles form naturally through knowledge transfer and swarm herding. Identical NPCs diverge irreversibly from their first three trades. Metacognition blocks individual hallucination but not collective herding; this is the key finding. Information asymmetry solidifies hierarchy. Fraud and regulation co-evolve. Criticism improves returns.
Individual intelligence does not guarantee collective intelligence.
Dataset & Paper:
https://huggingface.co/datasets/FINAL-Bench/Metacognitive
liked a dataset 5 days ago
castorini/NanoKnow-Fineweb-Edu-Index
Organizations
Fine-Tuned Historical NER Models (hmTEAMS)
Fine-Tuned NER Models on Historical NER Datasets (HIPE-2022) with Flair and hmTEAMS as backbone LM
Fine-Tuned Historical NER Models (hmByT5)
Fine-Tuned NER Models on Historical NER Datasets (HIPE-2022) with Flair and hmByT5 as backbone LM
- hmbyt5-preliminary/flair-hipe-2022-ajmc-de
  Token Classification • Updated • 2
- hmbyt5-preliminary/flair-hipe-2022-ajmc-en
  Token Classification • Updated
- hmbyt5-preliminary/flair-hipe-2022-ajmc-fr
  Token Classification • Updated • 1
- hmbyt5-preliminary/flair-hipe-2022-newseye-de
  Token Classification • Updated • 1
Fine-Tuned Historical NER Models (hmBERT Tiny)
Fine-Tuned NER Models on Historical NER Datasets (HIPE-2022) with Flair and hmBERT Tiny as backbone LM
🇹🇷 Turkish Language Models
My pretrained Language Models for Turkish
🇬🇪 Georgian NER Models
My fine-tuned NER models for Georgian
- stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-1
  Token Classification • Updated • 18
- stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-2
  Token Classification • Updated
- stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-3
  Token Classification • Updated
- stefan-it/autotrain-flair-georgian-ner-xlm_r_large-bs4-e10-lr5e-06-4
  Token Classification • Updated • 3
🧹 Fine-Tuned CleanCoNLL Models
My fine-tuned Flair NER models on CleanCoNLL dataset (with different seeds)
Historical Multilingual Language Models
A Collection of Historical Multilingual Language Models
- dbmdz/bert-base-historic-multilingual-cased
  Fill-Mask • 0.1B • Updated • 456 • 8
- dbmdz/bert-base-historic-multilingual-64k-td-cased
  Fill-Mask • 0.1B • Updated • 13 • 2
- hmbyt5-preliminary/byt5-small-historic-multilingual-span20-flax
  Updated • 1
- hmteams/teams-base-historic-multilingual-discriminator
  0.1B • Updated • 5
Fine-Tuned Historical NER Models (hmBERT)
Fine-Tuned NER Models on Historical NER Datasets (HIPE-2022) with Flair and hmBERT as backbone LM
Fine-Tuned Flair Models on German MobIE Dataset
Fine-Tuned Flair Models on German MobIE Dataset using 🤗 AutoTrain SpaceRunner
- stefan-it/autotrain-flair-mobie-gbert_base-bs16-e10-lr5e-05-2
  Token Classification • Updated • 3
- stefan-it/autotrain-flair-mobie-gbert_base-bs16-e10-lr3e-05-3
  Token Classification • Updated • 6
- stefan-it/autotrain-flair-mobie-gbert_base-bs16-e10-lr5e-05-5
  Token Classification • Updated • 2
- stefan-it/autotrain-flair-mobie-gbert_base-bs16-e10-lr5e-05-3
  Token Classification • Updated • 2 • 1
Fine-Tuned Historical NER Models (hmBERT 64k)
Fine-Tuned NER Models on Historical NER Datasets (HIPE-2022) with Flair and hmBERT 64k as backbone LM
Microsoft Papers with no code/data release
Collection of Microsoft Papers with no code/data release
- MEGA: Multilingual Evaluation of Generative AI
  Paper • 2303.12528 • Published
- MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks
  Paper • 2311.07463 • Published • 15
- Kosmos-2.5: A Multimodal Literate Model
  Paper • 2309.11419 • Published • 56
- A Unified View of Masked Image Modeling
  Paper • 2210.10615 • Published
Fine-Tuned CO-Funer Models
My fine-tuned Flair models on CO-FUN NER Dataset
xLSTM Language Models
My trained xLSTM LMs (under development)
FineWeb-LMs
My pretrained LMs on FineWeb datasets - part of my TensorFlow Model Garden LMs project