evals
openai/evals
Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
Stars: 18,559
Forks: 2,971
Open issues: 206
Watchers: 18,559
Size: 6.5 MB
Languages: Python, Other
Created: Jan 23, 2023
Updated: May 29, 2026
Last push: Apr 14, 2026