Kimi Team
Kimi Team is the research division of Moonshot AI that works on large language model (LLM) architecture and computational efficiency. Its research targets fundamental design problems in deep neural networks, particularly those that emerge as models grow deeper and more complex.
Attention Residuals
The team is known for developing Attention Residuals (AttnRes), an architectural change aimed at the pre-norm dilution problem in deep language models. Pre-norm dilution refers to the degradation of attention mechanisms in the deeper layers of a network that uses pre-normalization, that is, one that applies normalization before each sublayer rather than after it. This degradation can limit model performance and training stability. AttnRes is intended to keep attention effective as network depth increases.
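As a rough illustration of how depth can dilute a sublayer's influence in a pre-norm stack, the sketch below simulates a generic pre-norm residual stream, x ← x + f(norm(x)). It is not AttnRes and is not drawn from the team's work. The random linear maps standing in for attention or feed-forward sublayers, the function names `rms_norm` and `relative_update_sizes`, and all sizes are assumptions made for illustration. The sketch measures how large each sublayer's update is relative to the residual stream it is added to.

```python
import math
import random


def rms_norm(x, eps=1e-6):
    """Scale x so its root-mean-square value is approximately 1."""
    scale = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / scale for v in x]


def l2(x):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(v * v for v in x))


def relative_update_sizes(depth=32, width=32, seed=0):
    """Return ||update|| / ||residual stream|| for each layer of a toy pre-norm stack.

    Assumption: each sublayer is a fresh random linear map, used here only
    as a stand-in for an attention or feed-forward block.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    std = 1.0 / math.sqrt(width)  # keeps the update norm close to the input norm
    ratios = []
    for _ in range(depth):
        h = rms_norm(x)  # pre-norm: the sublayer sees a normalized input
        update = [sum(rng.gauss(0.0, std) * hj for hj in h) for _ in range(width)]
        ratios.append(l2(update) / l2(x))
        x = [a + b for a, b in zip(x, update)]  # unnormalized residual add
    return ratios


if __name__ == "__main__":
    for layer, ratio in enumerate(relative_update_sizes(), start=1):
        print(f"layer {layer:2d}: relative update size = {ratio:.3f}")
```

In this toy setting the normalized sublayer input keeps each update at roughly constant magnitude while the residual stream keeps growing, so the relative size of each update shrinks with depth. Whether this matches the exact mechanism AttnRes targets is an assumption of the sketch, not a claim from the source.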