datatrove
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
2,910 stars
Forks: 246
Open issues: 83
Watchers: 2,910
Size: 33.7 MB
Language: Python
License: Apache License 2.0
Created: Jun 14, 2023
Updated: Feb 28, 2026
Last push: Feb 25, 2026