This repo is the source code behind the Gradio Space demo of the BetterTransformer integration with 🤗 Transformers, using the 🤗 Optimum library.
Built on either TorchServe or Hugging Face's Inference Endpoints, this repo helps you understand, in a visual and interactive way, where BetterTransformer can be useful in production.
By default, this demo uses distilbert-base-uncased-finetuned-sst-2-english
with a maximum batch size of 8.
The example is run on an AWS EC2 g4dn instance, which has an NVIDIA T4 GPU. Starting from a basic Ubuntu instance, run
sudo apt update && sudo apt upgrade
Then, we recommend installing Docker following the NVIDIA Container Toolkit guide, and installing the NVIDIA drivers through the package manager (for example nvidia-driver-520). You may need to add yourself to the docker group (usermod -aG docker ubuntu) and to reboot after the install.
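A minimal sketch of these steps, assuming the default ubuntu user and the driver version mentioned above as an example:
# install the NVIDIA driver through the package manager
sudo apt install -y nvidia-driver-520
# allow the ubuntu user to run docker without sudo, then reboot
sudo usermod -aG docker ubuntu
sudo reboot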
Finally, download the Dockerfile needed to run the TorchServe API:
wget https://github.com/fxmarty/bettertransformer_demo/raw/main/Dockerfile
Build the Docker image (for BetterTransformer, use the ts_config_bettertransformer.properties and distilbert_sst2_bettertransformer build arguments, and add --build-arg USE_BETTERTRANSFORMER=yes; see the full command after the vanilla one below):
docker build -f Dockerfile \
--build-arg PROP_PATH=./ts_config_vanilla.properties \
--build-arg MAR_NAME=distilbert_sst2_vanilla \
-t bettertransformer-demo .
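For the BetterTransformer build, the same command with the arguments mentioned above would be (you may want a different -t tag if you keep both images):
docker build -f Dockerfile \
--build-arg PROP_PATH=./ts_config_bettertransformer.properties \
--build-arg MAR_NAME=distilbert_sst2_bettertransformer \
--build-arg USE_BETTERTRANSFORMER=yes \
-t bettertransformer-demo .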
Run the TorchServe server:
docker run --rm -it --gpus all -p 8080:8080 -p 8081:8081 bettertransformer-demo:latest
To run it in the background, use nohup and drop the -it argument, as sketched below.
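A sketch, where the torchserve.log file name is an arbitrary choice:
nohup docker run --rm --gpus all -p 8080:8080 -p 8081:8081 bettertransformer-demo:latest > torchserve.log 2>&1 &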
Use curl http://127.0.0.1:8080/ping
to check that the TorchServe server is running correctly.
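If the server is up, the health check should return a healthy status, something like the following (the exact formatting may vary with the TorchServe version):
{
  "status": "Healthy"
}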
You can also try the following Python snippet:
import requests

# Send a plain-text request to the TorchServe inference endpoint.
# The model name in the path must match the model registered by the .mar
# file built in the Docker image; adjust "my_tc" and the IP if needed.
headers = {"Content-Type": "text/plain"}
address = "http://127.0.0.1:8080/predictions/my_tc"
data = "this is positive lol"

response = requests.post(address, headers=headers, data=data)
print(response.status_code)
print(response.text)
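Equivalently, the same request can be sent from the command line (again, the model name in the path must match the registered model):
curl -X POST http://127.0.0.1:8080/predictions/my_tc \
-H "Content-Type: text/plain" \
-d "this is positive lol"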
A demo of BetterTransformer is available at https://huggingface.co/spaces/fxmarty/bettertransformer-demo. I'll host the demo for a week with two AWS EC2 instances, but since hosting the Space costs ~$1/hour, please host one yourself in the future if you want to reproduce the results.
In any case, example outputs are available in the Space to get an idea of the latency/throughput gains from BetterTransformer!