load_gbs in build_dataloader, how it works? #453
Unanswered · shensimeteor asked this question in Q&A · Replies: 0 comments
Hi everyone,

Does anyone know how the `load_gbs` parameter (https://github.com/NVIDIA/NeMo-Aligner/blob/main/nemo_aligner/data/nlp/builders.py#L471) works? I noticed that if I set `load_gbs = True`, the dataloader returns data for the whole global_batch_size (divided by data_parallel_size) on each iteration, while if it's `False`, it returns data for a single micro_batch_size.

My issue is that with `load_gbs=True` the data exceeds memory and the dataloader crashes. So I'm wondering if I can switch it to `False`, or whether changing it to `False` will have other side effects. Thanks!
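To make the question concrete, here is a minimal sketch of the per-iteration batch size the two settings produce, based purely on the behavior described above (the function name and the exact splitting logic are illustrative assumptions, not NeMo-Aligner's actual implementation):

```python
def per_iteration_batch_size(load_gbs: bool,
                             global_batch_size: int,
                             micro_batch_size: int,
                             data_parallel_size: int) -> int:
    """Hypothetical sketch: how many samples one dataloader iteration yields.

    Assumption based on the observed behavior in the question, not on the
    actual NeMo-Aligner source.
    """
    if load_gbs:
        # Each iteration yields the full global batch, split across
        # data-parallel ranks.
        return global_batch_size // data_parallel_size
    # Otherwise each iteration yields a single micro batch.
    return micro_batch_size


# Example: global_batch_size=256, micro_batch_size=4, data_parallel_size=8
print(per_iteration_batch_size(True, 256, 4, 8))   # 32 samples per rank
print(per_iteration_batch_size(False, 256, 4, 8))  # 4 samples
```

Under this reading, `load_gbs=True` loads 8x more samples per iteration in the example above, which would explain the memory blow-up; the trade-off of `False` would presumably be that gradient accumulation over micro batches must then be handled by the training loop.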