LLM training
Training large language models (LLMs) demands substantial compute, and recent cost estimates remain extreme. Key developments include:
- Cost: Stanford's AI Index (2024) put GPT-4's estimated training compute cost at ~$78M (Altman claimed over $100M) and Gemini Ultra's (2023) at ~$191M; 2025 estimates continue to reflect prohibitive costs.
- 4-bit training: a shift towards 4-bit floating-point (FP4) training to reduce memory and compute demands, as detailed in [[2026 04 14 How does 4bit quantisation work|How does 4bit quantisation work]]; a toy sketch follows this list.
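As a rough intuition pump, here is a minimal NumPy sketch of block-wise FP4 (E2M1) quantise/dequantise. The E2M1 value grid and per-block absmax scaling are standard; the block size, function name, and the quantise-then-dequantise framing are illustrative assumptions, not the specific recipe from the linked note.

```python
import numpy as np

# Positive values representable in FP4 (E2M1): 1 sign, 2 exponent, 1 mantissa bit.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fp4_quantize_dequantize(x: np.ndarray, block_size: int = 64) -> np.ndarray:
    """Round each block of x to the nearest FP4 value after absmax scaling.

    Simulates FP4 storage error only; a real trainer would keep the 4-bit
    codes plus one scale per block rather than materialising the result.
    (Hardware typically uses round-to-nearest-even; nearest-grid-point is
    a simplification here.)
    """
    flat = x.ravel()
    pad = (-flat.size) % block_size
    flat = np.pad(flat, (0, pad))
    blocks = flat.reshape(-1, block_size)

    # One absmax scale per block maps the largest magnitude onto 6.0,
    # the top of the FP4 grid.
    scales = np.abs(blocks).max(axis=1, keepdims=True) / FP4_GRID[-1]
    scales[scales == 0] = 1.0  # avoid dividing an all-zero block

    scaled = blocks / scales
    # Snap magnitudes to the nearest grid point, keeping the sign.
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    deq = np.sign(scaled) * FP4_GRID[idx] * scales

    return deq.ravel()[: x.size].reshape(x.shape)

w = np.random.randn(4, 8).astype(np.float32)
w_q = fp4_quantize_dequantize(w, block_size=8)
print("max abs error:", np.abs(w - w_q).max())
```

The per-block scale is why FP4 can survive training-scale dynamic ranges: the 8-value grid only ever has to cover one block's magnitudes at a time.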
Related concepts:
- large-language-model
- Quantisation
- Model training
Source Notes
- 2026-04-14: [[2026 04 14 How does 4bit quantisation work|How does 4bit quantisation work]]
- 2026-04-26: [[lab-notes/2026-04-26-Karpathys-AutoResearch-An-AI-Agent-for-Independent-LLM-Program-Improvement|Karpathy’s AutoResearch: An AI Agent for Independent LLM Program Improvement]]