Content

This model area holds the public parts of GGUF models converted with Skipper (T3) or Mate (M8) technology. Future versions will also follow the nautical theme.

The T3 (and M8) project is a Proto Open Source project: it does NOT publish its code but applies the benefits ONLY to OSI models and some select Open Weights models. The goal is to strengthen the True Open Source model family; open-sourcing the code would also benefit proprietary models. There are currently demos on Hugging Face Spaces to try out model behavior under such extreme compression. Further variants for faster inference and local inference will follow.

Demo Spaces

  • Regular compression
    • Granite4family: all Granite4 models (small, tiny, micro, nano 1b and nano 350m)
  • T3 OSI compression
    • TOM@zero: demo of the next-generation 2 bpw compression (Skipper, aka T3) with high-quality Open Source (OSI) models:
      • Olmo3
      • Smol3
      • Apertus
  • T3 Open Weights compression
    • Granite4extreme: Granite 4 small hybrid 32b compressed to below 9 GB at fp16 quality (see the size check after this list)
    • Qwen3.5: tbd
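
A quick sanity check on the Granite4extreme size, assuming the ~2.2 bpw upper end of the T3 range listed under Versions below (an assumption, not a published per-model figure): 32 × 10⁹ weights × 2.2 bits ÷ 8 bits/byte ≈ 8.8 GB, consistent with "below 9 GB".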

Challenge: high-quality models in 1/2/4/8/.. GB sizes

  • Phone: 4 GB
  • Home: 8 GB
  • Game: 16 GB
  • Pro: 32 GB
  • Zero: 64-71 GB
  • Server: 128 GB+

| Quality vs. Size | Casual | Premium | Advanced | Frontier |
| --- | --- | --- | --- | --- |
| 64-71 GB | SOTA | SOTA | SOTA | BETA |
| 32 GB | SOTA | SOTA | SOTA+ | RESEARCH |
| 16 GB | SOTA | SOTA+ | BETA | - |
| 8 GB | SOTA | BETA | BETA | - |
| 4 GB | SOTA | RESEARCH | - | - |
| 2 GB | RESEARCH | - | - | - |
| 1 GB | - | - | - | - |

  • SOTA: K quants
  • SOTA+: UD quants
  • BETA: REAP + UD
  • RESEARCH: M8 and better
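
The matrix reads as a lookup from a memory budget to the achievable status per model tier. A minimal sketch in Python: only the matrix values come from the table above; the helper name best_tier and the None encoding for "-" are illustrative assumptions.

```python
# Quality-vs-size matrix from the table above.
# Keys are the lower bound of each size class in GB; None means "not available".
QUALITY_MATRIX = {
    64: {"Casual": "SOTA", "Premium": "SOTA", "Advanced": "SOTA", "Frontier": "BETA"},
    32: {"Casual": "SOTA", "Premium": "SOTA", "Advanced": "SOTA+", "Frontier": "RESEARCH"},
    16: {"Casual": "SOTA", "Premium": "SOTA+", "Advanced": "BETA", "Frontier": None},
    8: {"Casual": "SOTA", "Premium": "BETA", "Advanced": "BETA", "Frontier": None},
    4: {"Casual": "SOTA", "Premium": "RESEARCH", "Advanced": None, "Frontier": None},
    2: {"Casual": "RESEARCH", "Premium": None, "Advanced": None, "Frontier": None},
    1: {"Casual": None, "Premium": None, "Advanced": None, "Frontier": None},
}

def best_tier(budget_gb: float) -> dict:
    """Status row for the largest size class that fits a given memory budget."""
    fitting = [gb for gb in QUALITY_MATRIX if gb <= budget_gb]
    return QUALITY_MATRIX[max(fitting)] if fitting else {}

# Example: a 16 GB "Game" machine.
print(best_tier(16))  # {'Casual': 'SOTA', 'Premium': 'SOTA+', 'Advanced': 'BETA', 'Frontier': None}
```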

The tiers above are based on ELO (https://lmarena.ai/leaderboard/text).

Versions

| Version | Codename | File prefix | Typical bpw range | New feature |
| --- | --- | --- | --- | --- |
| 1.0 | Skipper | T3 and T2 | 0.8 .. 2.2 | introduce new compression method |
| 1.5 | Mate | M8 | 0.4 .. 2 | compression improvements |
| 2.0 | Cheng | Cx | 0.3 .. 2 | speed improvements |
| 2.5 | Cheng++ | Cy | 0.1 .. 2 | reduce compute requirements |
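
The file prefixes double as a version marker on published files. A minimal sketch of decoding them in Python; the prefix-to-codename mapping comes from the table, but the example filename format is a hypothetical assumption.

```python
# Map the file prefixes from the version table to (codename, version).
PREFIXES = {
    "T3": ("Skipper", "1.0"),
    "T2": ("Skipper", "1.0"),
    "M8": ("Mate", "1.5"),
    "Cx": ("Cheng", "2.0"),
    "Cy": ("Cheng++", "2.5"),
}

def identify(filename: str) -> tuple[str, str] | None:
    """Return (codename, version) for a model file starting with a known prefix."""
    for prefix, info in PREFIXES.items():
        if filename.startswith(prefix):
            return info
    return None

# Hypothetical filename, for illustration only:
print(identify("T3UD-granite4-small.gguf"))  # ('Skipper', '1.0')
```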

V1 reduces model size significantly at the same subjective quality, but leaves compute requirements high.

V2 will scale down compute requirements and support cheap NPUs.

Expected bpw (bits per weight)

Actual bpw values are higher for small models and lower for larger models. As with JPEG and video encoding, higher input quality opens more opportunity for compression. The size arithmetic behind these figures is sketched after the table below.

| Base | Mode | % | bpw @ 30b |
| --- | --- | --- | --- |
| Q5_K | T3UD | 95 | 2 .. 2.2 |
| Q4_K | T2UD | 90 | 1.4 .. 1.6 |
| Q2_K | T2UD2 | 75 | 1 .. 1.2 |
| Q2_K | T2UD1 | 60 | 0.8 |
| Q2_K | M8HQ | 75 | 0.8 |
| Q2_K | M8LQ | 60 | 0.4 .. 0.6 |
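
The table boils down to size ≈ parameters × bpw ÷ 8 bytes. A minimal sketch in Python; the function name and example figures are illustrative assumptions, and real GGUF files add some metadata overhead on top of the weights.

```python
# Rough GGUF size estimate from parameter count and bits per weight (bpw).
# Real files are slightly larger: metadata and some unquantized tensors.

def estimated_size_gb(n_params: float, bpw: float) -> float:
    """parameters * bits-per-weight / 8 bits per byte, in gigabytes."""
    return n_params * bpw / 8 / 1e9

# A 30B model at the T3UD upper bound of 2.2 bpw:
print(f"{estimated_size_gb(30e9, 2.2):.2f} GB")  # 8.25 GB

# The same model at 0.8 bpw (T2UD1 / M8HQ):
print(f"{estimated_size_gb(30e9, 0.8):.2f} GB")  # 3.00 GB
```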