Megatron-DeepSpeed
deepspeedai/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
2,250 stars
Forks
365
Open issues
161
Watchers
2,250
Size
8.1 MB
Python, Other
Created: Jun 21, 2021
Updated: May 24, 2026
Last push: Aug 14, 2025
Fork