Megatron-DeepSpeed
deepspeedai/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
2,242 stars
Forks: 370
Open issues: 161
Watchers: 2,242
Size: 8.1 MB
Python, Other
Created: Jun 21, 2021
Updated: Apr 13, 2026
Last push: Aug 14, 2025
Fork