evals
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Stars
17,924
Forks
2,901
Open issues
176
Watchers
17,924
Size
6.5 MB
Python, Other
Created: Jan 23, 2023
Updated: Feb 27, 2026
Last push: Nov 3, 2025