Coding benchmarks

Metrics and frameworks used to evaluate the proficiency of large language models in software engineering tasks, including code generation, debugging, and repository-level problem-solving.

Key Benchmarks

Sources

Source Notes