Julia Turc

AI researcher and content creator specializing in efficient machine learning techniques, particularly around model compression and training optimization.

  • Video: How does 4-bit quantisation work (2026-04-14) - Discusses the evolution of training large language models at reduced precision, focusing on the challenges of 4-bit floating-point (FP4) training. Highlights the cost of training LLMs (Gemini Ultra: ~$78M in 2023) and the shift toward quantisation techniques to reduce computational demands.
  • Key insight: Reducing precision from 16-bit to 4-bit significantly lowers training costs while maintaining model performance through advanced quantisation methods (see the sketch after this list).
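To make the key insight concrete, here is a minimal NumPy sketch of symmetric absmax quantisation to 4-bit integers: map each value to one of 16 signed levels via a single scale factor, then dequantise to measure the rounding error. This is an illustration of the general low-precision idea only; the video itself concerns FP4 (4-bit floating-point formats with their own exponent/mantissa layout and scaling), and all function names below are mine, not the video's.

```python
import numpy as np

def quantize_4bit(x: np.ndarray):
    """Symmetric absmax quantisation to signed 4-bit integers in [-8, 7]."""
    scale = np.abs(x).max() / 7.0          # largest magnitude maps to the int4 limit
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)  # int4 values, stored in int8
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover a low-precision approximation of the original values."""
    return q.astype(np.float32) * scale

# Example: quantise a small weight tensor and check the reconstruction error
w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale)
print("max abs error:", np.abs(w - w_hat).max())
```

With only 16 representable levels per scale group, the error per weight is relatively large, which is why practical 4-bit schemes rely on fine-grained (per-block or per-channel) scales rather than the single tensor-wide scale used here.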

2026-04-14: How does 4-bit quantisation work

Source Notes