Elo score

A rating system used to calculate the relative skill levels of participants in zero-sum games or competitive ranking environments.

Mechanics

  • Probability-based: Each participant's expected score is predicted from the rating difference between the two sides; after the match, ratings are adjusted in proportion to the gap between that prediction and the actual result.
  • Zero-sum: In its fundamental application, points gained by one participant are lost by another.
  • Applications: Widely used in chess, esports, and machine-learning leaderboards.
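
The mechanics above can be sketched in a few lines of Python. This is a minimal illustration of the standard two-player Elo update, not a reference implementation: the function names `expected_score` and `update_ratings` are hypothetical, and the scale of 400 and K-factor of 32 are common conventional choices assumed here, not values taken from this note.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Predicted score for A against B (1 = win, 0.5 = draw, 0 = loss).

    Uses the logistic curve of the standard Elo model; `scale` = 400 is
    the conventional value and is an assumption of this sketch.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_ratings(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> tuple[float, float]:
    """Return the new (A, B) ratings after one match.

    `score_a` is A's actual result (1, 0.5, or 0). The adjustment is
    proportional to the gap between actual and predicted score, and B
    loses exactly what A gains, which is the zero-sum property.
    `k` = 32 is an assumed K-factor; real systems vary it.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated players: the winner gains k / 2 points.
    print(update_ratings(1500, 1500, score_a=1.0))
    # An upset: the lower-rated player wins and gains more than k / 2.
    print(update_ratings(1400, 1600, score_a=1.0))
```

Because the same `delta` is added to one rating and subtracted from the other, the sum of the two ratings is unchanged by a match. An upset win moves ratings further than an expected win because the gap between actual and predicted score is larger.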

Applications in AI Evaluation

Source Notes

  • 2026-04-22: [[lab-notes/2026-04-22-OpenAI-GPT-Image-2.0-Evaluating-Next-Gen-AI-Image-Generation-Capabilities|OpenAI GPT Image 2.0: Evaluating Next-Gen AI Image Generation Capabilities]]