SmartQuant
- optional: extract quant distribution from an existing GGUF
  - default: standard llama.cpp distribution
- optional: adapt quant distribution based on rules or imatrix
- optional: adapt quant distribution to an almost compatible model
- apply updated quants to
  - the model used for extraction: only requantize changed tensors
    - example: apply minor quant optimizations
  - a compatible model: apply known quant distribution
    - example: apply MoE quant distribution to REAP model
  - an almost compatible model: needs previous adaptation step
    - example: apply dense quant distribution to REAP model
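
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SmartQuant's actual implementation: a quant distribution is modeled as a plain mapping from tensor name to quant type name, and the helper names `adapt`, `changed_tensors` and `transfer`, the regex rule format and the fallback type are all hypothetical. The tensor names and quant types are sample values in llama.cpp style, not extracted from a real GGUF.

```python
import re

# Assumption: a quant distribution is a mapping tensor name -> quant type name.
QuantDist = dict[str, str]


def adapt(dist: QuantDist, rules: list[tuple[str, str]]) -> QuantDist:
    """Return a copy of dist where the first matching regex rule sets the type."""
    adapted = {}
    for name, qtype in dist.items():
        for pattern, new_type in rules:
            if re.search(pattern, name):
                qtype = new_type
                break
        adapted[name] = qtype
    return adapted


def changed_tensors(old: QuantDist, new: QuantDist) -> dict[str, tuple[str, str]]:
    """Map each tensor whose type differs to (old type, new type)."""
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


def transfer(dist: QuantDist, target_names: list[str], fallback: str) -> QuantDist:
    """Carry dist over to another model; tensors unknown to dist get the fallback."""
    return {name: dist.get(name, fallback) for name in target_names}


# Hypothetical distribution, as if extracted from an existing GGUF.
extracted = {
    "token_embd.weight": "Q4_K",
    "blk.0.attn_q.weight": "Q4_K",
    "blk.0.ffn_down.weight": "Q4_K",
    "output.weight": "Q6_K",
}

# Hypothetical rules: (regex on tensor name, new quant type).
rules = [(r"ffn_down", "Q6_K"), (r"^output\.", "Q8_0")]

updated = adapt(extracted, rules)

# Only these tensors would need to be requantized in the source model.
print(changed_tensors(extracted, updated))

# An almost compatible model has a tensor the distribution does not know.
target = ["token_embd.weight", "blk.0.ffn_down_exps.weight", "output.weight"]
print(transfer(updated, target, fallback="Q4_K"))
```

Diffing the old and new distribution is what keeps the first case cheap: unchanged tensors can be copied as they are. For an almost compatible model, the fallback in `transfer` stands in for the adaptation step, which in practice would need per-architecture rules rather than a single default type.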