datatrove
huggingface/datatrove
Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
2,987 stars
Forks
252
Open issues
84
Watchers
2,987
Size
35.3 MB
Python
Apache License 2.0
Created: Jun 14, 2023
Updated: Apr 14, 2026
Last push: Apr 10, 2026