How much VRAM is needed for the full context length?
#31
by
Aly87
- opened
with no quants.
I wonder how you all run this — there are 2.63M downloads. Where do you get the GPUs?
Runpod or other cloud providers?
Runpod A100 80GB is 2USD/hr
So using Chinese providers directly is cheaper.
What do you mean by using Chinese providers directly? Through their chat interface, or an API, or what? I want to fine-tune the model and “own” the weights, not just chat with it.
I mean using AliPay to pay them directly for API access.
@Aly87
According to https://huggingface.co/spaces/oobabooga/accurate-gguf-vram-calculator
512 context:
Expected VRAM usage: 155802 MiB
Safe estimate: 156379 MiB - 95% chance the VRAM is at most this.
full 131072 context:
Expected VRAM usage: 178796 MiB
Safe estimate: 179373 MiB - 95% chance the VRAM is at most this.
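For rough planning between those two context lengths, you can linearly interpolate the KV-cache growth from the calculator's own numbers. A minimal sketch (assumption: VRAM usage scales roughly linearly with context length, which holds for KV-cache-dominated growth; the function name and data points are just taken from the figures quoted above):

```python
def estimate_vram_mib(context_len, base_ctx=512, base_mib=155802,
                      full_ctx=131072, full_mib=178796):
    """Linearly interpolate expected VRAM (MiB) between the two
    calculator data points quoted in this thread."""
    per_token = (full_mib - base_mib) / (full_ctx - base_ctx)  # ~0.176 MiB per token of context
    return base_mib + per_token * (context_len - base_ctx)

for ctx in (512, 32768, 131072):
    print(f"{ctx:>6} ctx -> ~{estimate_vram_mib(ctx):,.0f} MiB")
```

So a mid-range context like 32k lands around 161 GB — still more than two 80 GB A100s once you add overhead.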