The primary limiting factor in running large language models (LLMs) such as LLaVA 1.6 34B is memory. In FP16 precision, the model's weights alone occupy roughly 68GB (34 billion parameters at 2 bytes each), before accounting for the KV cache, activations, or the vision encoder. The NVIDIA Jetson Orin Nano 8GB provides only 8GB of LPDDR5 memory, shared between the CPU and GPU, leaving a shortfall of roughly 60GB: the model cannot be loaded onto the device, and attempting to do so results in out-of-memory errors. The memory bandwidth of about 68 GB/s (0.07 TB/s), while adequate for smaller models, would also become a severe bottleneck if weights had to be swapped or streamed in during inference, crippling throughput. The Ampere-class GPU, with its CUDA and Tensor cores, is capable but cannot compensate for the fundamental shortage of memory.
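A weights-only estimate makes the gap concrete. The sketch below is a rough calculation, not a measurement: it assumes 1 GB = 10^9 bytes and ignores the KV cache, activations, and the vision tower, so real memory requirements are somewhat higher.

```python
# Back-of-the-envelope check of why the weights alone overflow the device.
# Assumption: weights-only estimate; KV cache, activations, and the vision
# encoder all add further memory on top of these figures.

def weights_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Approximate size of model weights in GB (1 GB = 1e9 bytes)."""
    return n_params_billion * bytes_per_param

PARAMS_B = 34          # LLaVA 1.6 34B
DEVICE_MEM_GB = 8      # Jetson Orin Nano 8GB (memory shared by CPU and GPU)

for label, bytes_per_param in [("FP16", 2.0), ("INT8", 1.0), ("Q4 (~4.5 bpw)", 0.5625)]:
    need = weights_gb(PARAMS_B, bytes_per_param)
    print(f"{label:>14}: ~{need:.0f} GB needed, headroom {DEVICE_MEM_GB - need:+.0f} GB")

# FP16: ~68 GB needed, headroom -60 GB  -> the shortfall described above
# INT8: ~34 GB, Q4: ~19 GB              -> still far above the 8 GB available
```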
Given this gap, running LLaVA 1.6 34B directly on the Jetson Orin Nano 8GB is not feasible, and aggressive quantization alone does not close it: at Q4_K_M (roughly 4.5 bits per weight) the 34B weights still occupy around 19GB, more than twice the device's total memory. The practical path is a smaller model, for example a 7B-class vision-language model quantized to 4-bit, which fits within the 8GB limit with room left for the KV cache and vision encoder (see the sketch below). Offloading layers to the CPU is possible but drastically reduces performance, making it unsuitable for real-time or interactive applications. If the 34B model is a hard requirement, use a GPU with sufficient memory or a cloud-based inference service.
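To illustrate the model-selection trade-off, here is a minimal feasibility sketch. The bytes-per-weight figure for Q4_K_M-style quantization and the 1.5GB reserved for the OS, CUDA context, KV cache, and image embeddings are assumptions chosen for illustration, not measured values; tune them for your workload.

```python
# Rough feasibility check for quantized vision-language models on an 8 GB budget.
# Assumptions: ~0.5625 bytes/weight (~4.5 bits) for Q4_K_M-style quantization,
# ~1.5 GB reserved for OS, CUDA context, KV cache, and image embeddings.

DEVICE_MEM_GB = 8.0
RESERVED_GB = 1.5              # assumed overhead; adjust for your workload
Q4_BYTES_PER_PARAM = 0.5625    # ~4.5 bits per weight

candidates_b = [34, 13, 7, 3]  # parameter counts in billions (e.g. LLaVA variants)

budget = DEVICE_MEM_GB - RESERVED_GB
for n in candidates_b:
    need = n * Q4_BYTES_PER_PARAM
    verdict = "fits" if need <= budget else "does not fit"
    print(f"{n:>3}B @ Q4: ~{need:.1f} GB -> {verdict} in {budget:.1f} GB budget")

# 34B (~19 GB) and 13B (~7.3 GB) overflow the budget; a ~7B model at Q4 (~3.9 GB)
# is the practical ceiling on this device under these assumptions.
```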