Parameter Count

The number of parameters in a large language model (LLM) directly determines model capacity, training complexity, and resource requirements. For example, NVIDIA’s Llama 3.1 Nemotron 70B (70.6 billion parameters) requires ~150GB of storage in its distributed 16-bit format (full 32-bit precision would need roughly twice that), spread across 30+ files of ~5GB each.
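
A back-of-the-envelope check of those numbers (a minimal sketch; it counts raw weight bytes only and ignores file-format and tokenizer overhead):

```python
def checkpoint_size_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate on-disk size of a dense checkpoint in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

params = 70.6e9  # Llama 3.1 Nemotron 70B

for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {checkpoint_size_gb(params, bits):6.1f} GB")

# 32-bit: ~282 GB   16-bit: ~141 GB (the ~150GB distributed checkpoint)
#  8-bit:  ~71 GB    4-bit:  ~35 GB
```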

  • adam-lucek - quantisation of LLM: Video explaining model-efficiency techniques such as quantization (storing weights at lower precision); dropping from 16-bit to 4-bit cuts storage needs by ~75% (e.g., a 70B model from ~150GB to ~37.5GB) while largely preserving model performance (see the sketch after this list).
  • model-efficiency: Umbrella term for techniques, such as quantization, that reduce parameter precision without significant accuracy loss; critical for deploying large models on constrained hardware.
  • large-language-model: Model class where parameter count correlates strongly with capabilities but also with computational cost.
  • nemotron: NVIDIA’s Nemotron-3 family (released April 2026) includes Nano (30B total, 3B active via Mixture-of-Experts), Super (100B total, 10B active), and Ultra (500B total, 50B active), offering scalable open-source LLMs with reduced active parameter counts.
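
The sketch referenced above: a toy symmetric per-tensor int8 quantization in NumPy (a hypothetical illustration, not the specific scheme from the video; production quantizers typically work per-channel or per-group and calibrate against activation statistics):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization: w ≈ scale * q."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(4096, 4096)).astype(np.float32)  # one toy weight matrix

q, scale = quantize_int8(w)
err = np.abs(w - dequantize(q, scale)).mean()

print(f"storage: {w.nbytes / 1e6:.1f} MB (fp32) -> {q.nbytes / 1e6:.1f} MB (int8)")
print(f"mean absolute round-trip error: {err:.6f}")
```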

2026 04 14 Adam Lucek quantisation of LLM; 2026 04 14 Gary Explains channel Nemotron 3