Please upload nvfp4, fp8, gguf versions

#2
by shivshankar - opened

Please upload nvfp4, fp8, gguf versions

The FP8 version is critical; it's what lets the transformer model itself fit into VRAM.
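For anyone wondering why FP8 matters so much here, the back-of-envelope math is just bytes per parameter. The parameter count below is a hypothetical example, not this model's actual size:

```python
# Back-of-envelope VRAM math for the transformer weights alone.
# The 14B parameter count is a made-up example, not this model's real size.
def weight_gib(n_params, bytes_per_param):
    """Weight storage in GiB for a given precision."""
    return n_params * bytes_per_param / 2**30

n = 14e9  # hypothetical 14B-parameter transformer
print(f"bf16: {weight_gib(n, 2):.1f} GiB")  # ~26.1 GiB, overflows a 24 GB card
print(f"fp8:  {weight_gib(n, 1):.1f} GiB")  # ~13.0 GiB, fits with room for activations
```

Halving bytes per weight is what brings a model of that scale under a consumer card's VRAM budget; activations, the text encoder, and the VAE still need headroom on top.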

I just did two T2V videos with the Sulphur LoRA on the distilled GGUF (distill 1.1), and the videos are coming out pretty dang impressive. Speed is about the same even though the LoRA is 10 GB.

fp8, gguf 😥

Can this be loaded in FP8? Or no, because it's a checkpoint? I have an RTX 5090 and can't fit this into VRAM.

Running the full model on my poor 5060 Ti with 16 GB VRAM and 32 GB RAM. It works, it just takes a bit longer than with FP8, haha.

I'll have them converted to FP8.

ohh will you upload to your github?

On Huggingface - takes some time but I'll get it done today
ETA is in a few hours and I'll post the link here

ohh will you upload to your github?

Huggingface

@Winnougan Make sure to credit the one who made the tool you are using to quantize models too.

I have done GGUF conversions, but they aren't Comfy-compatible, as it seems Sulphur has changed the model dimensions. I could patch Comfy and submit a PR, but do we know whether that's intentional/already planned?
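For anyone wanting to check the dimension change themselves: a .safetensors file starts with an 8-byte little-endian header length followed by a JSON header, so you can diff tensor shapes against the old model without loading any weights. A small self-contained sketch (the tensor name is a made-up example, not this model's actual key):

```python
# Sketch: read a .safetensors header to inspect tensor shapes without
# loading weights — useful for spotting the dimension change that breaks
# ComfyUI's loader. File layout: 8-byte little-endian header length,
# then a JSON header mapping tensor names to {"dtype", "shape", "data_offsets"}.
import json
import struct

def read_safetensors_shapes(path):
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {name: meta["shape"]
            for name, meta in header.items()
            if name != "__metadata__"}

# Build a tiny valid file to demonstrate (one F32 tensor of shape [2, 3];
# the key name is hypothetical):
header = {"blocks.0.attn.weight": {"dtype": "F32", "shape": [2, 3],
                                   "data_offsets": [0, 24]}}
blob = json.dumps(header).encode()
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 24)

print(read_safetensors_shapes("demo.safetensors"))
```

Running the same function over the old and new checkpoints and diffing the two dicts would show exactly which layers changed shape, which is the information a Comfy patch (or the GGUF converter) would need.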
