Decoupling generation and loss batch sizes by sidnarayanan · Pull Request #1 · Future-House/trl

sidnarayanan · 2025-02-01T00:30:57Z

This introduces a per_device_loss_batch_size to define microbatches to be used when computing the loss. Ideally, I would have liked to compute the loss in chunks of per_device_loss_batch_size and accumulate gradients. However, to compute the advantage, we need all per_device_train_batch_size * num_generations samples.
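To illustrate the constraint described above, here is a minimal sketch that assumes a group-normalized advantage of the kind used in GRPO-style trainers: each reward is centered and scaled by the statistics of the completions it is grouped with. The function name `group_advantages`, the `eps` value, and the use of the sample standard deviation are assumptions for illustration, not code from this PR.

```python
from statistics import mean, stdev


def group_advantages(rewards, num_generations, eps=1e-4):
    """Normalize each reward against the other rewards in its group.

    Hypothetical helper: `rewards` is a flat list laid out as consecutive
    groups of `num_generations` completions, one group per prompt.
    """
    if num_generations <= 0 or len(rewards) % num_generations != 0:
        raise ValueError("rewards must split evenly into groups")

    advantages = []
    for start in range(0, len(rewards), num_generations):
        group = rewards[start:start + num_generations]
        mu = mean(group)
        # stdev() needs at least two points; fall back to 0.0 otherwise.
        sigma = stdev(group) if len(group) > 1 else 0.0
        # eps (an assumed value) avoids division by zero for constant groups.
        advantages.extend((r - mu) / (sigma + eps) for r in group)
    return advantages


# Two prompts, two completions each.
advantages = group_advantages([1.0, 3.0, 2.0, 2.0], num_generations=2)
```

Because every advantage depends on the mean and spread of the rewards it is normalized against, a microbatch that holds only part of that set cannot produce the final value on its own.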

So instead, we compute the three tensors needed for the loss (reward, logp, KL) in chunks of per_device_loss_batch_size, concatenate the chunks, and compute the full loss all at once. I think this should result in a similar memory reduction, but it remains to be tested.
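The chunk-then-concatenate approach described above can be sketched in a framework-agnostic way. This is an illustration with plain Python lists rather than tensors; `compute_in_chunks` and `toy_logp` are hypothetical names, and `toy_logp` stands in for an expensive per-sample computation such as a model forward pass.

```python
def compute_in_chunks(fn, samples, chunk_size):
    """Apply `fn` to consecutive slices of `samples` and concatenate the results.

    `fn` must return one output per input sample, so the concatenated result
    has the same length and order as `samples`.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    outputs = []
    for start in range(0, len(samples), chunk_size):
        outputs.extend(fn(samples[start:start + chunk_size]))
    return outputs


# Record the batch sizes the expensive function actually sees.
seen_batch_sizes = []


def toy_logp(batch):
    """Hypothetical stand-in for a per-sample log-probability computation."""
    seen_batch_sizes.append(len(batch))
    return [-0.5 * len(completion) for completion in batch]


completions = ["a", "bb", "ccc", "dddd", "eeeee"]
chunked = compute_in_chunks(toy_logp, completions, chunk_size=2)
```

The expensive function never sees more than `chunk_size` samples at once, yet the concatenated output covers the whole batch, so a quantity that needs every sample can still be computed afterwards in a single step.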

I also think this code is pretty compilation-unfriendly, since I'm slicing tensors dynamically. Oh well.


jamesbraza

LGTM, bonus points for a simple unit test

sidnarayanan added 5 commits January 29, 2025 14:34

make sure model is in eval mode before generating

9b5a09e

bit more logging

bcd74ec

switch to transformers logging

8e8b988

Merge branch 'main' of https://github.com/huggingface/trl into fix-gr…

82d81a5

…po-generation

supporting microbatches for computing loss terms

b6a92fe

sidnarayanan requested review from jamesbraza and mskarlin February 1, 2025 00:30

jamesbraza approved these changes Feb 1, 2025


sidnarayanan added 4 commits February 1, 2025 15:54

fix issue with reward func kwargs

fd751f3

rename arg to micro bsz

ed038fe

fix microbatching

c763f07

rename to completion_ids in vllm block

38c4d63

sidnarayanan force-pushed the main branch 3 times, most recently from 3bf982c to cad4e65 Compare February 6, 2025 23:31

jamesbraza force-pushed the main branch from cad4e65 to 69ad852 Compare February 25, 2025 23:08

sidnarayanan wants to merge 9 commits into main from decouple-batch-sizes
