
[Blocksparse] bug fixing half + sequence length #25

Merged: 3 commits into main from blocksparse_crash on Oct 22, 2021
Conversation

@blefaudeux (Contributor) commented on Oct 22, 2021:

What does this PR do?

Fixes #24.

  • blocksparse now works on fp16
  • the sequence length needs to be a power of two for now
  • do not expose Blocksparse if the current GPU does not have tensor cores

It would be nice to follow up with a PR to Triton to lift the second restriction. cc @ptillet. A rough sketch of the tensor-core guard from the third point is included below.
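For reference, a minimal sketch of what such a guard could look like; the function name and threshold interpretation are assumptions for illustration, not the actual xformers API:

```python
import torch


def blocksparse_available() -> bool:
    # Hypothetical guard: the Triton blocksparse kernels rely on tensor cores,
    # so only expose BlockSparse on CUDA devices with compute capability >= 7.0
    # (V100 / Volta and newer).
    if not torch.cuda.is_available():
        return False
    major, _minor = torch.cuda.get_device_capability()
    return major >= 7
```

A guard like this would be evaluated when the attention registry is built, so that BlockSparse simply does not show up on unsupported GPUs.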

Before submitting

  • Did you have fun?
    • Make sure you had fun coding 🙃
  • Did you read the contributor guideline?
  • Was this discussed/approved via a Github issue? (no need for typos, doc improvements)
    • N/A
  • Did you make sure to update the docs?
    • Doing this right now, updating the PR
  • Did you write any new necessary tests?
    • Sort of: we assert in the attention layer to catch this broken case and explain it a little better; see the sketch after this checklist
  • Did you update the changelog? (if needed)
    • N/A
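As an illustration of the assertion mentioned in the checklist, here is a rough sketch; the function name and error message are made up for this example:

```python
def _check_sequence_length(seq_len: int) -> None:
    # Hypothetical check mirroring this PR: blocksparse attention currently
    # requires the sequence length to be a power of two.
    is_power_of_two = seq_len > 0 and (seq_len & (seq_len - 1)) == 0
    assert is_power_of_two, (
        f"Blocksparse attention requires a power-of-two sequence length for now, got {seq_len}. "
        "Consider padding the sequence or using another attention mechanism."
    )
```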

PR review

Anyone in the community is free to review the PR once the tests have passed.
If your PR was not discussed in GitHub issues first, there is a high chance it will not be merged.

@facebook-github-bot added the CLA Signed label on Oct 22, 2021 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
@@ -325,12 +325,12 @@ def plot(args, results: List[Dict[str, Any]]):
     "-emb", "--embedding_dim", nargs="+", default=[64, 128, 256], type=int
 )
 parser.add_argument(
-    "-sl", "--sequence_length", nargs="+", default=[512, 768, 1024], type=int
+    "-sl", "--sequence_length", nargs="+", default=[512, 1024], type=int
@blefaudeux (Contributor, Author) commented on this change:

Would have been nice to test for longer sequences, but 2048 OOMs with the vanilla attention on a V100.

@blefaudeux (Contributor, Author):

Approved via internal chat :)

@blefaudeux blefaudeux merged commit 2e5906a into main Oct 22, 2021
@blefaudeux blefaudeux deleted the blocksparse_crash branch October 22, 2021 22:41
tenpercent added a commit to tenpercent/xformers that referenced this pull request on Oct 8, 2024:

[Refactor] change ck decoder invocation way from old CK to CK-Tile
Development

Successfully merging this pull request may close these issues.

Blocksparse crashes in the encoder benchmark
2 participants