Can I run Mixtral 8x7B (INT8 (8-bit Integer)) on NVIDIA RTX 4090?

Result: Fail (OOM)
This GPU doesn't have enough VRAM.
GPU VRAM: 24.0 GB
Required: 46.7 GB
Headroom: -22.7 GB

VRAM Usage: 100% (24.0 GB of 24.0 GB)

Technical Analysis

NVIDIA RTX 4090 cannot run Mixtral 8x7B (46.70B) in this configuration. The model requires 46.7GB but only 24.0GB is available, leaving you 22.7GB short.

Recommendation

Consider a more aggressive quantization (e.g. Q4_K_M or Q3_K_M) to reduce VRAM requirements, or upgrade to a GPU with more VRAM. Cloud GPU services such as RunPod or Vast.ai offer affordable options.
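The VRAM figures above follow from a simple rule of thumb: weight memory ≈ parameter count × bits per weight ÷ 8. A minimal sketch of that estimate — the bits-per-weight values assumed for the llama.cpp K-quants are approximations, and KV cache / activation overhead is ignored:

```python
def est_weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight-only VRAM estimate in GB: params * bpw / 8.

    Ignores KV cache and activation memory, which add more on top
    depending on context length and batch size.
    """
    return params_billion * bits_per_weight / 8

# Approximate bits-per-weight (assumed values for common quant types).
for quant, bpw in [("INT8", 8.0), ("Q4_K_M", 4.85), ("Q3_K_M", 3.91)]:
    print(f"{quant}: ~{est_weight_vram_gb(46.7, bpw):.1f} GB")
```

This reproduces the 46.7 GB INT8 figure for a 46.70B-parameter model and suggests why lower-bit quants are recommended: under these assumptions Q4_K_M still lands around 28 GB, while Q3_K_M comes in near 23 GB, close to the 24 GB limit even before runtime overhead.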

Recommended Settings

Batch Size: N/A (model does not fit)
Context Length: N/A (model does not fit)
Inference Framework: llama.cpp or vLLM

Frequently Asked Questions

Can I run Mixtral 8x7B (46.70B) on NVIDIA RTX 4090?
NVIDIA RTX 4090 (24.0GB VRAM) cannot run Mixtral 8x7B (46.70B) which requires 46.7GB. You are 22.7GB short. Consider using a more aggressive quantization (like Q4_K_M or Q3_K_M) or upgrading to a GPU with more VRAM.
How much VRAM does Mixtral 8x7B (46.70B) need?
Mixtral 8x7B (46.70B) requires approximately 46.7GB of VRAM.
What performance can I expect?
No throughput estimate is available: the model does not fit in the GPU's VRAM in this configuration.