4-bit quantisation

A technique that reduces the numerical precision of a machine learning model's parameters to 4 bits each. This significantly lowers the memory footprint and can lower computational cost, usually while keeping model quality close to that of the original.
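
As a rough illustration of the definition above, the following minimal Python sketch quantises a short list of weights to signed 4-bit codes and packs two codes into each byte. It assumes a simple symmetric "absmax" scheme with a single shared scale, which is only one of several possible approaches; practical schemes typically use one scale per small group of weights. All function names here are hypothetical and chosen for illustration.

```python
def quantise_4bit(weights):
    """Map floats to signed 4-bit codes in [-7, 7] using one shared scale.

    Assumption: symmetric absmax scaling; not the only possible scheme.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def pack_nibbles(codes):
    """Pack two 4-bit codes into each byte, low nibble first."""
    if len(codes) % 2:
        codes = codes + [0]  # pad to an even count
    packed = bytearray()
    for low, high in zip(codes[0::2], codes[1::2]):
        # "& 0xF" keeps the two's-complement low 4 bits of each code
        packed.append((low & 0xF) | ((high & 0xF) << 4))
    return bytes(packed)


def unpack_nibbles(packed, count):
    """Recover the first `count` signed 4-bit codes from packed bytes."""
    codes = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # nibbles 8..15 represent the negative values -8..-1
            codes.append(nibble - 16 if nibble >= 8 else nibble)
    return codes[:count]


def dequantise_4bit(codes, scale):
    """Approximate the original floats from the codes and the scale."""
    return [c * scale for c in codes]
```

With this sketch, four weights occupy two bytes plus one stored scale, and each reconstructed weight differs from the original by at most about half the scale.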

2026-04-14 How does 4-bit quantisation work