| Quantization | VRAM Required | Min GPU |
|---|---|---|
| FP16 (Half Precision) | 140.0GB | A100 / H100 |
| INT8 (8-bit Integer) | 70.0GB | A100 / H100 |
| Q4_K_M (GGUF 4-bit) | 35.0GB | A6000 / 2x 4090 |
| Q3_K_M (GGUF 3-bit) | 28.0GB | A6000 / 2x 4090 |
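
The figures in the table are consistent with a simple weights-only estimate: parameter count times bits per weight, divided by 8. The sketch below reproduces them under two assumptions that the table itself does not state: a model of roughly 70 billion parameters (inferred from 140.0GB at FP16), and effective bit widths of 4.0 and 3.2 for the two GGUF rows (back-derived from the table, not official GGUF figures). The function name `weights_vram_gb` and the `BITS_PER_WEIGHT` mapping are hypothetical, introduced only for illustration.

```python
# Effective bits per weight implied by the table above.
# ASSUMPTION: the Q4_K_M and Q3_K_M values are back-derived from the table,
# not taken from the GGUF format definition.
BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "INT8": 8.0,
    "Q4_K_M": 4.0,
    "Q3_K_M": 3.2,
}


def weights_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Estimate weights-only memory in GB (1 GB = 1e9 bytes).

    Ignores KV cache, activations, and runtime overhead, so real
    usage will be higher than this number.
    """
    return params_billion * bits_per_weight / 8.0


if __name__ == "__main__":
    # ASSUMPTION: 70 billion parameters, inferred from the FP16 row.
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weights_vram_gb(70, bits):.1f} GB")
```

Because the estimate covers weights only, it is best read as a lower bound when choosing a GPU; context length and batch size add to it.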