How much VRAM is needed for the full context length?
#31
by
Aly87
- opened
with no quants.
I wonder how you all run this — there are 2.63M downloads. Where do you get the GPUs?
Runpod or other cloud providers?
Runpod A100 80GB is 2USD/hr
So using Chinese providers directly is cheaper.
What do you mean by using Chinese providers directly? Through their chat interface, or an API, or what? I want to fine-tune the model and “own” the weights, not just chat with it.
I mean using AliPay to pay them directly for API access.
@Aly87
According to https://huggingface.co/spaces/oobabooga/accurate-gguf-vram-calculator
512 context:
Expected VRAM usage: 155802 MiB
Safe estimate: 156379 MiB - 95% chance the VRAM is at most this.
full 131072 context:
Expected VRAM usage: 178796 MiB
Safe estimate: 179373 MiB - 95% chance the VRAM is at most this.
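For rough planning between those two context lengths, you can linearly interpolate the KV-cache growth from the calculator's own numbers. A minimal sketch (assumption: VRAM usage scales roughly linearly with context length, which holds for KV-cache-dominated growth; the function name and data points are just taken from the figures quoted above):

```python
def estimate_vram_mib(context_len, base_ctx=512, base_mib=155802,
                      full_ctx=131072, full_mib=178796):
    """Linearly interpolate expected VRAM (MiB) between the two
    calculator data points quoted in this thread."""
    per_token = (full_mib - base_mib) / (full_ctx - base_ctx)  # ~0.176 MiB per token of context
    return base_mib + per_token * (context_len - base_ctx)

for ctx in (512, 32768, 131072):
    print(f"{ctx:>6} ctx -> ~{estimate_vram_mib(ctx):,.0f} MiB")
```

So a mid-range context like 32k lands around 161 GB — still more than two 80 GB A100s once you add overhead.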