Data-parallel size × number of epochs = actual number of epochs? #2961
Unanswered
bobo0810
asked this question in
Community | Q&A
Replies: 1 comment 1 reply
-
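Hi @bobo0810, if you pass a regular PyTorch dataloader to Colossal, it is automatically converted to use a DistributedSampler. Each GPU processes its own share of the data, and together the GPUs complete one epoch over the whole dataset. So epoch=3 means 3 passes over the dataset.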
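To make the sharding concrete, here is a minimal pure-Python sketch of how a strided split hands each rank a disjoint slice of the indices. It imitates the round-robin assignment that PyTorch's `DistributedSampler` uses when `shuffle=False`; the helper name `shard_indices` and the toy dataset size of 8 are assumptions for illustration, not part of Colossal or PyTorch.

```python
import math


def shard_indices(dataset_len, world_size, rank):
    """Hypothetical helper: strided split with wrap-around padding.

    Assumes world_size <= dataset_len, so one round of padding is enough.
    """
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size
    indices = list(range(dataset_len))
    # Pad with the first few indices so every rank gets the same count.
    indices += indices[: total - dataset_len]
    # Rank r takes every world_size-th index, starting at r.
    return indices[rank:total:world_size]


# Toy setup: 8 samples, 2 data-parallel ranks.
shards = [shard_indices(8, 2, rank) for rank in range(2)]
for rank, shard in enumerate(shards):
    print(f"rank {rank}: {shard}")
```

With 2 ranks, each rank sees half of the indices per epoch, and the two halves together cover the dataset exactly once.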
1 reply
-
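The dataloader on each GPU holds the complete dataset, with no splitting. That is, with epoch=3 and gpu=2 under data parallelism only, the model actually goes through the dataset 6 times.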
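The two positions in this thread differ only in how many samples each rank draws per epoch, so the arithmetic can be checked directly. The sketch below is an assumption-labeled illustration: `effective_passes` is a hypothetical helper, and the dataset size of 1000 is made up. In a real run, the per-rank sample count would have to be measured on each rank, for example from the length of that rank's sampler.

```python
def effective_passes(epochs, world_size, samples_per_rank_per_epoch, dataset_len):
    """Hypothetical helper: how many times the model sees the full dataset."""
    total_samples_seen = epochs * world_size * samples_per_rank_per_epoch
    return total_samples_seen / dataset_len


dataset_len = 1000  # made-up size, for illustration only

# Sharded case: each of the 2 ranks draws half of the dataset per epoch.
sharded = effective_passes(3, 2, dataset_len // 2, dataset_len)

# Unsharded case: each of the 2 ranks iterates over the full dataset per epoch.
unsharded = effective_passes(3, 2, dataset_len, dataset_len)

print(sharded, unsharded)
```

If the per-rank count equals the full dataset length, the data is not being split and epoch=3 with gpu=2 does amount to 6 passes. If it is about half the dataset length, epoch=3 is 3 passes.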