EXL3 quants of Step-3.5-Flash

⚠️ Requires ExLlamaV3 v0.0.23 (or v0.0.22 dev branch)

Base bitrates:

2.00 bits per weight (coming soon)
3.00 bits per weight
4.00 bits per weight

Optimized:

3.05 bits per weight
(more coming soon)
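
As a rough guide to what these bitrates mean on disk, the weight storage for a quant is approximately the parameter count times the bits per weight, divided by eight. The sketch below is illustrative only: `estimate_size_gib` is a hypothetical helper and is not part of ExLlamaV3. It assumes every parameter is stored at the nominal bitrate, which ignores tensors kept at other precisions and any file metadata. The parameter count in the usage line is a placeholder, not the size of Step-3.5-Flash.

```python
def estimate_size_gib(n_params: float, bpw: float) -> float:
    """Approximate weight storage in GiB for n_params weights at bpw bits each.

    Assumption: all weights use the same nominal bitrate; real quantized
    checkpoints may keep some tensors at a different precision.
    """
    total_bits = n_params * bpw
    total_bytes = total_bits / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    placeholder_params = 10e9  # placeholder value, NOT the real parameter count
    for bpw in (2.00, 3.00, 3.05, 4.00):
        print(f"{bpw:.2f} bpw: ~{estimate_size_gib(placeholder_params, bpw):.2f} GiB")
```

Substituting the model's actual parameter count gives a first-order estimate of how much storage and VRAM the weights alone require at each bitrate.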

| Quant | Ppl¹ | KL-div |
|---|---|---|
| 2.00 bpw | TBD | TBD |
| 3.00 bpw | 1.521 | 0.142 |
| 3.05 bpw | 1.478 | 0.118 |
| 4.00 bpw | 1.379 | 0.053 |
| Original | 1.336 | |

¹ (10 rows of wikitext2)
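
For readers unfamiliar with the two metrics, here is a minimal sketch of how perplexity and KL divergence are conventionally computed from per-token logits. It is not the script used to produce the figures above, and this card does not state how they were measured. The function names are hypothetical, and the KL direction (original model as the reference distribution, quantized model as the approximation) is an assumption.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def perplexity(logits_per_token, target_ids):
    """exp of the mean negative log-likelihood of the target tokens."""
    nll = 0.0
    for logits, target in zip(logits_per_token, target_ids):
        nll -= log_softmax(logits)[target]
    return math.exp(nll / len(target_ids))


def mean_kl_div(ref_logits_per_token, quant_logits_per_token):
    """Mean over tokens of KL(ref || quant).

    Assumption: the unquantized model is the reference distribution.
    """
    total = 0.0
    for ref, quant in zip(ref_logits_per_token, quant_logits_per_token):
        log_p = log_softmax(ref)
        log_q = log_softmax(quant)
        total += sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return total / len(ref_logits_per_token)
```

Lower is better for both: perplexity close to the original's and KL divergence close to zero indicate that the quantized model's output distribution stays near that of the unquantized model.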

