# Gemma 2B Instruct GGUF

This repository contains Q4 and Q8 quantized GGUF files for google/gemma.

## Performance
| Variant | Device | Throughput |
|---|---|---|
| Q4 | M1 Pro 10-core GPU | 90 tok/s |
| Q4 | Snapdragon 778G CPU | 10 tok/s |
| Q4 | RTX 2070S | 40 tok/s |
| Q8 | M1 Pro 10-core GPU | 54 tok/s |
| Q8 | Snapdragon 778G CPU | 6 tok/s |
| Q8 | RTX 2070S | 25 tok/s |
| F16 | M1 Pro 10-core GPU | 30 tok/s |
| F16 | Snapdragon 778G CPU | <1 tok/s |
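The throughput gap between the variants tracks their memory footprint. As a rough illustration, the sketch below estimates on-disk size per variant; the parameter count (~2.5B) and the effective bits-per-weight figures (~4.5 for a Q4_0-style scheme, ~8.5 for Q8_0, 16 for F16) are assumptions for illustration, not values taken from this card.

```python
# Rough GGUF size estimate for a ~2.5B-parameter model.
# Bits-per-weight values are approximate: block-wise quant formats
# store a shared scale per block, so Q4/Q8 cost slightly more than
# 4/8 bits per weight.
PARAMS = 2.5e9  # assumed parameter count for Gemma 2B
BITS_PER_WEIGHT = {"Q4": 4.5, "Q8": 8.5, "F16": 16.0}

def approx_size_gib(variant: str) -> float:
    """Approximate on-disk size in GiB for a quantization variant."""
    bits = BITS_PER_WEIGHT[variant]
    return PARAMS * bits / 8 / 2**30

for v in ("Q4", "Q8", "F16"):
    print(f"{v}: ~{approx_size_gib(v):.1f} GiB")
```

Halving the bits per weight roughly halves the bytes each token generation must stream through memory, which is consistent with Q4 running close to twice as fast as Q8 on the same device in the table above.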