LLM training

Training large language models (LLMs) demands substantial compute and capital, and recent estimates highlight how extreme the expense has become. Key developments include:

  • Cost: Stanford's AI Index estimated GPT-4's training compute cost at ~$78M (Sam Altman has claimed over $100M) and Gemini Ultra (2023) at ~$191M; 2025 estimates continue to reflect prohibitive costs.
  • 4-bit training: a shift towards 4-bit floating-point (FP4) training to reduce memory and compute demands, as detailed in How does 4bit quantisation work (see the sketch after this list).
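
A minimal NumPy sketch of what FP4 quantisation does to a weight tensor, assuming the E2M1 format (1 sign bit, 2 exponent bits, 1 mantissa bit) with a single per-tensor scale. This is simulated ("fake") quantisation for illustration only; production FP4 training uses finer-grained per-block scales and dedicated hardware kernels.

```python
import numpy as np

# Representable magnitudes of the E2M1 (FP4) format.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fake_quantize_fp4(w: np.ndarray) -> np.ndarray:
    """Round-trip a tensor through simulated FP4 (E2M1) precision.

    A single per-tensor scale maps the largest magnitude onto the
    largest representable value (6.0); each element is then snapped
    to the nearest point on the E2M1 grid and rescaled.
    """
    scale = np.abs(w).max() / E2M1_GRID[-1]
    if scale == 0.0:
        return w.copy()
    scaled = w / scale
    # Index of the nearest grid point for each element's magnitude.
    idx = np.abs(np.abs(scaled)[..., None] - E2M1_GRID).argmin(axis=-1)
    return np.sign(scaled) * E2M1_GRID[idx] * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
w_q = fake_quantize_fp4(w)
print("max abs rounding error:", np.abs(w - w_q).max())
```

With only 8 distinct magnitudes per scale, the rounding error is visibly larger than in FP16/FP8, which is why FP4 training schemes lean on block-wise scaling to keep that error tolerable.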

Related concepts:

2026 04 14 How does 4bit quantisation work

Source Notes