fix(hooks): Add padding support to context parallel hooks #12595
Conversation
Thanks for the PR! However, we do not want any of this logic to go into the Qwen transformer.

Hi @yiyixuxu, yes, I would be interested in supporting this change.

Hi @yiyixuxu, just wanted to follow up. After looking at the hook implementation as you suggested, I've updated the PR with a new approach that is fully generic and keeps all logic within the hooks, with no changes to the transformer. The solution now adds padding in the ContextParallelSplitHook and then trims it in the ContextParallelGatherHook, using the module instance to temporarily store the original sequence length. I've also added a new unit test for this logic in test_hooks.py. Thanks, and let me know if you need more changes. I've updated the PR description with the full details. CC @sayakpaul @DN6
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Hello @yiyixuxu, I wanted to follow up on this in case you were busy. Thanks.

Hmm, I think we need to wait a bit until #12702 is merged, because it is tackling padding too.

Ok, got it @sayakpaul, thanks for letting me know.
What does this PR do?
This PR now modifies the ContextParallelSplitHook and ContextParallelGatherHook to gracefully handle sequence lengths that are not divisible by the world size.
This PR changes:
- ContextParallelSplitHook: pads the input along the split dimension so the sequence length becomes divisible by the world size, and temporarily stores the original sequence length on the module instance.
- ContextParallelGatherHook: after gathering, trims the output back to the stored original sequence length.
This ensures that the padding is completely transparent to the model and the end-user, preventing crashes without altering the output shape. The fix is now contained entirely within the hooks and requires no changes to the Qwen transformer or any other model.
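For illustration, here is a minimal sketch of the pad-and-trim idea. The helper names `pad_to_divisible` and `trim_to_original` are hypothetical, not the PR's actual API; the real logic lives inside the hook classes:

```python
import torch
import torch.nn.functional as F

def pad_to_divisible(x: torch.Tensor, world_size: int, dim: int = 1):
    """Pad `x` along `dim` so its length is divisible by `world_size`.

    Returns the padded tensor plus the original length, which the
    gather side needs in order to strip the padding back off.
    """
    original_len = x.shape[dim]
    remainder = original_len % world_size
    if remainder == 0:
        return x, original_len
    pad_len = world_size - remainder
    # F.pad takes (left, right) pairs starting from the LAST dim,
    # so build a pad spec that only touches `dim`.
    pad = [0, 0] * (x.ndim - dim - 1) + [0, pad_len]
    return F.pad(x, pad), original_len

def trim_to_original(x: torch.Tensor, original_len: int, dim: int = 1):
    """Slice the gathered tensor back to its pre-padding length."""
    return x.narrow(dim, 0, original_len)
```

In the PR itself, the split hook stashes the original length on the module instance, so the gather hook can retrieve it later without any extra plumbing through the model.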
I have also added a new unit test in tests/hooks/test_hooks.py that directly tests this new padding and trimming logic.
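A rough idea of what such a test can check, reusing the hypothetical helpers sketched above (this is not the PR's actual test code): pad a tensor whose sequence length is not divisible by the world size, then verify the round trip restores the original tensor exactly.

```python
def test_pad_and_trim_round_trip():
    world_size = 4
    x = torch.randn(2, 7, 16)  # sequence length 7 is not divisible by 4
    padded, original_len = pad_to_divisible(x, world_size, dim=1)
    # Padded length must be divisible so the split across ranks is even.
    assert padded.shape[1] % world_size == 0
    trimmed = trim_to_original(padded, original_len, dim=1)
    # The round trip must be lossless: same shape, same values.
    assert trimmed.shape == x.shape
    assert torch.equal(trimmed, x)
```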
Fixes #12568
Before submitting
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@sayakpaul @yiyixuxu @DN6