axbench
stanfordnlp/axbench
Stanford NLP Python library for benchmarking the utility of LLM interpretability methods
Stars: 180
Forks: 32
Open issues: 13
Watchers: 180
Size: 631.6 MB
Language: Python
License: Apache License 2.0
Topics: interpretability, intervention, large-language-models, llm-steering, mechanistic-interpretability
Created: Aug 7, 2024
Updated: Apr 10, 2026
Last push: Mar 12, 2026