---
license: apache-2.0
library_name: transformers
tags:
  - qwen
  - vision-language
  - awq
  - int4
  - vllm
base_model: Qwen/Qwen3-VL-8B-Instruct
---

Quantization code adapted from: https://github.com/vllm-project/llm-compressor/blob/main/examples/awq/qwen3-vl-30b-a3b-Instruct-example.py

# Qwen3-VL-8B-Instruct-AWQ

AWQ (W4A16) quantized version of Qwen/Qwen3-VL-8B-Instruct.
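As a rough illustration of what W4A16 with `group_size=128` means for weight storage, the arithmetic below estimates effective bits per weight. This is a sketch, not a measured figure: it assumes fp16 scales and 4-bit zero points stored once per group, and ignores packing/alignment overhead and the unquantized vision tower.

```python
# Effective-bits estimate for AWQ W4A16 with group_size=128.
# Assumptions (not from this model card): one fp16 scale (16 bits) and
# one 4-bit zero point per group of 128 weights.
GROUP_SIZE = 128
WEIGHT_BITS = 4
SCALE_BITS = 16
ZERO_POINT_BITS = 4

def effective_bits_per_weight() -> float:
    """Bits stored per weight once per-group metadata is amortized."""
    per_group_bits = GROUP_SIZE * WEIGHT_BITS + SCALE_BITS + ZERO_POINT_BITS
    return per_group_bits / GROUP_SIZE

def approx_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage in GB for n_params weights at the given width."""
    return n_params * bits_per_weight / 8 / 1e9

bits = effective_bits_per_weight()
print(f"effective bits/weight: {bits:.3f}")       # ~4.16
print(f"fp16, 8e9 params: {approx_weight_gb(8e9, 16):.1f} GB")
print(f"awq,  8e9 params: {approx_weight_gb(8e9, bits):.1f} GB")
```

So the quantized linear weights come out near 4.16 bits each, roughly a 3.8x reduction versus fp16 for the layers that are converted.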

- Quantization: AWQ, 4-bit, `group_size=128`, `zero_point=true`, `version="gemm"`
- `modules_to_not_convert: ["visual"]` (the vision tower is kept in full precision)
- Prepared with the LLM Compressor one-shot AWQ flow, using the recipe `AWQModifier(targets="Linear", scheme="W4A16", ignore=[r"re:model.visual.", r"re:visual."], duo_scaling=True)`; `lm_head` was dropped from the ignore list, so it is quantized as well.
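The recipe above can be sketched as a runnable quantization script, following the structure of the linked llm-compressor example. Everything beyond the recipe itself — the calibration dataset, sample count, sequence length, and output path — is an assumption for illustration, not a value taken from this card.

```python
# Sketch of the one-shot AWQ flow (patterned after the linked llm-compressor
# example). Calibration settings below are assumptions, not card values.
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier
from transformers import AutoModelForImageTextToText, AutoProcessor

MODEL_ID = "Qwen/Qwen3-VL-8B-Instruct"
SAVE_DIR = "Qwen3-VL-8B-Instruct-AWQ"  # assumed output path

model = AutoModelForImageTextToText.from_pretrained(MODEL_ID, torch_dtype="auto")
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Recipe from the card: quantize all Linear layers to W4A16, skip the vision
# tower; lm_head is intentionally absent from ignore, so it gets quantized.
recipe = AWQModifier(
    targets="Linear",
    scheme="W4A16",
    ignore=[r"re:model.visual.", r"re:visual."],
    duo_scaling=True,
)

oneshot(
    model=model,
    dataset="flickr30k",          # calibration dataset: an assumption
    recipe=recipe,
    max_seq_length=2048,          # assumption
    num_calibration_samples=256,  # assumption
)

model.save_pretrained(SAVE_DIR, save_compressed=True)
processor.save_pretrained(SAVE_DIR)
```

Quantizing `lm_head` trades a little output-head fidelity for extra memory savings; if that matters for your use case, adding `"lm_head"` back to `ignore` is the usual alternative.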