
SmartQuant

  • optional: extract the quant distribution (the per-tensor quantization types) from an existing GGUF
    • default: standard llama.cpp distribution
  • optional: adapt quant distribution based on rules or imatrix
    • default: skip
  • optional: adapt quant distribution to an almost-compatible model
    • default: skip
  • apply updated quants to
    • the model used for extraction: requantize only the changed tensors
      • example: apply minor quant optimizations
    • a compatible model: apply known quant distribution
      • example: apply MoE quant distribution to REAP model
    • an almost-compatible model: requires the preceding adaptation step
      • example: apply dense quant distribution to REAP model
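
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SmartQuant implementation: a quant distribution is assumed to be a plain mapping from tensor name to quant type name, and the helpers `adapt_by_rules`, `changed_tensors`, and `adapt_to_model` are hypothetical names introduced here. The tensor names and quant types in the example are illustrative values, and the fallback-to-default strategy for an almost-compatible model is one possible choice among many.

```python
import fnmatch

# Assumption: a quant distribution is a mapping of tensor name -> quant type name.
QuantMap = dict[str, str]


def adapt_by_rules(dist: QuantMap, rules: list[tuple[str, str]]) -> QuantMap:
    """Apply (glob pattern, quant type) rules in order; a later matching rule wins."""
    out = dict(dist)
    for name in dist:
        for pattern, qtype in rules:
            if fnmatch.fnmatchcase(name, pattern):
                out[name] = qtype
    return out


def changed_tensors(old: QuantMap, new: QuantMap) -> QuantMap:
    """Return only the tensors whose quant type differs (the ones to requantize)."""
    return {name: q for name, q in new.items() if old.get(name) != q}


def adapt_to_model(dist: QuantMap, target_names: list[str], default: str) -> QuantMap:
    """Map a distribution onto a model with a different tensor set.

    Hypothetical strategy: reuse the type for tensors with matching names,
    fall back to `default` for tensors the source distribution does not know.
    """
    return {name: dist.get(name, default) for name in target_names}


# Illustrative distribution, standing in for one extracted from a GGUF.
base = {
    "token_embd.weight": "Q4_K",
    "blk.0.attn_q.weight": "Q4_K",
    "blk.0.ffn_down.weight": "Q4_K",
    "output.weight": "Q6_K",
}

# Rule-based adaptation: raise the precision of all ffn_down tensors.
updated = adapt_by_rules(base, [("blk.*.ffn_down.weight", "Q6_K")])

# Same model: only the changed tensors need to be requantized.
to_requant = changed_tensors(base, updated)

# Almost-compatible model: one tensor name is unknown to the source distribution.
target = ["token_embd.weight", "blk.0.attn_q.weight", "blk.0.ffn_gate_exps.weight"]
mapped = adapt_to_model(updated, target, default="Q4_K")

print(to_requant)
print(mapped)
```

Keeping the distribution as a plain name-to-type mapping makes the three application cases differ only in which helper runs before the quantizer: none for a compatible model, `changed_tensors` for the extraction model, and `adapt_to_model` for an almost-compatible one.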