The VRAM Formula
The Golden Rule
To know whether a model fits on your GPU, look at its parameter count (e.g., 8B, 70B) and its quantization (e.g., Q4, Q8). A simplified formula that gives a rough estimate for Q4 (4-bit) models:
(Parameters in Billions) × 0.75 = VRAM needed in GB
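As a minimal sketch, the rule above can be written as a one-line function. The name `q4_vram_gb` is made up for illustration, and the result is only the rule-of-thumb estimate, not a measurement.

```python
def q4_vram_gb(params_billions: float) -> float:
    """Rule-of-thumb VRAM in GB for a Q4 model: billions of parameters x 0.75."""
    return params_billions * 0.75


# Apply the rule to the two model sizes mentioned in the text.
for size in (8, 70):
    print(f"{size}B at Q4: ~{q4_vram_gb(size):.1f} GB")
```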
Common Sizes & Requirements
| Model Size | Quantization | Est. VRAM Needed | Example GPU |
|---|---|---|---|
| 8 Billion (8B) | Q4_K_M | ~6 GB | RTX 3060 / 4060 |
| 8 Billion (8B) | FP16 | ~16 GB | RTX 3090 / 4090 |
| 70 Billion (70B) | Q4_K_M | ~40 GB | 2x RTX 3090 / Mac Studio |
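To see roughly where these numbers come from, here is a sketch that computes the size of the weights alone from the number of bits stored per parameter. The function name `weight_gb` is hypothetical, and the calculation assumes 1 GB = 1e9 bytes and ignores any runtime overhead.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Size of the raw weights in GB: parameters x bits per weight / 8 bits per byte."""
    return params_billions * bits_per_weight / 8


print(weight_gb(8, 16))   # 8B at FP16 (16 bits per weight)
print(weight_gb(8, 4))    # 8B at exactly 4 bits per weight
print(weight_gb(70, 4))   # 70B at exactly 4 bits per weight
```

FP16 lands on the table's ~16 GB. The exact 4-bit figures (4 GB and 35 GB) sit below the table's Q4_K_M estimates, which is expected: Q4_K_M stores somewhat more than 4 bits per weight on average, and the runtime needs some memory of its own. The × 0.75 rule is more conservative still; for 70B it gives about 52 GB, so treat the table's ~40 GB as closer to the weights alone.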
Don't Forget "Context"
The math above only covers loading the model weights. You need extra VRAM to actually talk to the model: everything in the context window is held in memory in a structure called the KV cache, which grows with the amount of text. If you want the model to read a 50-page PDF, that text takes up extra VRAM in the KV cache. Always leave 1-2 GB of "headroom" on your GPU. If you have an 8 GB card, don't try to load a model that takes 7.9 GB.
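The headroom advice can be sketched as a quick check. The KV cache estimate below uses the common approximation of one key and one value vector per token, per layer, per KV head. The function names and the model dimensions (32 layers, 8 KV heads, head size 128, FP16 cache) are illustrative assumptions for a hypothetical 8B-class model, so read the real values from your model's configuration.

```python
def kv_cache_gb(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache size in GB (1 GB = 1e9 bytes).

    The leading 2 accounts for storing both a key and a value vector
    per token, per layer, per KV head.
    """
    total_bytes = 2 * layers * kv_heads * head_dim * bytes_per_value * tokens
    return total_bytes / 1e9


def fits(model_gb, gpu_gb, context_gb=0.0, headroom_gb=2.0):
    """True if weights + context + headroom fit in the GPU's VRAM.

    headroom_gb defaults to the upper end of the 1-2 GB advice above.
    """
    return model_gb + context_gb + headroom_gb <= gpu_gb


# Hypothetical 8B-class model: 32 layers, 8 KV heads, head size 128, FP16 cache.
cache = kv_cache_gb(tokens=8192, layers=32, kv_heads=8, head_dim=128)
print(f"KV cache for 8192 tokens: ~{cache:.2f} GB")

# The 7.9 GB model on an 8 GB card from the text: no room for headroom.
print(fits(7.9, 8))

# A ~6 GB model plus the cache above on a 12 GB card.
print(fits(6.0, 12, context_gb=cache))
```

Under these assumptions an 8192-token context costs about 1 GB on top of the weights, which is why a model that barely loads leaves no room to use it.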