
[Minor] Open sourcing update #9

Merged · 3 commits · Oct 19, 2021

4 changes: 4 additions & 0 deletions HOWTO.md
@@ -364,6 +364,10 @@ Transformer(

We don't have the exact same interfaces, but we have something fairly close with the [model_factory](xformers/factory/model_factory.py).

It’s worth noting that xFormers’ blocks expect tensors to be batch-first, while PyTorch’s transformers use a sequence-first convention. Don’t forget to permute if you use xFormers’ blocks as drop-in replacements.
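
For instance, a minimal sketch of the required permutation (shapes here are illustrative):

```python
import torch

# PyTorch's nn.Transformer expects (seq_len, batch, embed_dim),
# while xFormers blocks expect (batch, seq_len, embed_dim).
seq_first = torch.randn(128, 4, 512)     # (S, B, E), PyTorch convention
batch_first = seq_first.transpose(0, 1)  # (B, S, E), xFormers convention
```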

Similarly, the attention mask conventions differ: in PyTorch, the mask is *True* when an element should *not* be attended to, whereas in xFormers it’s the opposite. Don’t forget to negate your attention masks to use xFormers’ blocks as drop-in replacements.
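
Again as a sketch, assuming a boolean mask (names and shapes are illustrative):

```python
import torch

# PyTorch convention: True marks positions that must NOT be attended to.
# xFormers uses the opposite convention, so flip the boolean mask.
pytorch_mask = torch.rand(4, 128) > 0.5  # True = masked out (PyTorch)
xformers_mask = ~pytorch_mask            # True = attended to (xFormers)
```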

The equivalent with xFormers would look like the following. You can think of it as a declaration of the sequence of blocks that you would like instantiated.

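As a hedged sketch of what such a declaration can look like, assuming the `xFormerConfig`/`xFormer.from_config` entry points of the factory (the exact keys and values below are illustrative, not exhaustive — check [model_factory](xformers/factory/model_factory.py) for the full schema):

```python
import torch
from xformers.factory.model_factory import xFormer, xFormerConfig

# One stack of identical encoder blocks, declared as plain data.
my_config = [
    {
        "block_type": "encoder",
        "num_layers": 2,
        "dim_model": 384,
        "multi_head_config": {
            "num_heads": 6,
            "residual_dropout": 0.1,
            "attention": {
                "name": "scaled_dot_product",
                "dropout": 0.1,
                "causal": False,
            },
        },
        "feedforward_config": {
            "name": "MLP",
            "dropout": 0.1,
            "activation": "gelu",
            "hidden_layer_multiplier": 4,
        },
    }
]

model = xFormer.from_config(xFormerConfig(my_config))

x = torch.randn(4, 128, 384)  # batch-first, as noted above
y = model(x)                  # (4, 128, 384)
```
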
10 changes: 6 additions & 4 deletions README.md
@@ -1,9 +1,11 @@
<img src="./docs/assets/logo.png" width=800>

![PyPI](https://img.shields.io/pypi/v/xformers)
[![Documentation Status](https://readthedocs.org/projects/xformers/badge/?version=latest)](https://xformers.readthedocs.io/en/latest/?badge=latest)
<!-- FIXME @lefaudeux - PyPI package -->
<!-- ![PyPI](https://img.shields.io/pypi/v/xformers)
![PyPI - License](https://img.shields.io/pypi/l/xformers) -->

[![Documentation Status](https://github.com/facebookresearch/xformers/actions/workflows/gh-pages.yml/badge.svg)](https://github.com/facebookresearch/xformers/actions/workflows/gh-pages.yml/badge.svg)
[![CircleCI](https://circleci.com/gh/facebookresearch/xformers.svg?style=shield)](https://app.circleci.com/pipelines/github/facebookresearch/xformers/)
![PyPI - License](https://img.shields.io/pypi/l/xformers)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
--------------------------------------------------------------------------------

@@ -13,7 +15,7 @@ xFormers is a modular and field agnostic library to flexibly generate transforme

## Getting started

The full [documentation](https://xformers.readthedocs.io/) contains instructions for getting started, deep dives and tutorials about the various APIs.
The full [documentation](https://facebookresearch.github.io/xformers/) contains instructions for getting started, deep dives and tutorials about the various APIs.
If in doubt, please check out the [HOWTO](HOWTO.md). Only some general considerations are laid out in the README.

### Installation
1 change: 0 additions & 1 deletion xformers/triton/softmax.py
@@ -330,7 +330,6 @@ def _softmax_dispatch(x: torch.Tensor, log: bool, mask: Optional[torch.Tensor],
and x.is_cuda
and not _triton_registered_overflow
):
# pyre-ignore[16]: Pyre is unable to find the `apply` method.
return _softmax_triton.apply(x, mask, log, causal)
except triton.code_gen.OutOfResources:
# Catch cases where the current GPU does not have enough registers to hold a full tensor line