Convert the OpenFace model from Lua torch to Pytorch #462

Merged · 6 commits · Oct 4, 2024

Conversation

@T0ny8576 (Contributor) commented Sep 20, 2024

What does this PR do?

This PR rewrites the OpenFace model in PyTorch and provides scripts to convert the trained model weights nn4.small2.v1.t7 to a PyTorch state_dict. It also provides examples of comparing images and training classifiers with the new model.

Summary of Changes:

  • openface/openfacenet.py: Add the PyTorch model definition (a minimal loading sketch follows this list)
  • openface/align_dlib.py: Support dlib's CNN face detector; make the upsampling count a tunable parameter
  • models/get-models.sh: Download dlib's mmod_human_face_detector model and the converted PyTorch OpenFace model weights
  • batch-represent/batch_represent.py: Generate representations for an image dataset and store the data and labels in .csv files
  • demos/compare_new.py, demos/classifier_new.py: Add new examples of using the PyTorch model
  • conversion/test_luatorch.lua, conversion/convert_to_pytorch.py: Add scripts to convert the trained Lua Torch weights to a PyTorch state_dict (not needed to use the PyTorch model)
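
For orientation, here is a minimal, hedged sketch of how the converted weights would be loaded and used for inference. The class name OpenFaceNet and the weight file path are illustrative assumptions; the actual names are defined in openface/openfacenet.py and models/get-models.sh.

import torch
from openface.openfacenet import OpenFaceNet  # assumed class name, for illustration only

model = OpenFaceNet()
state_dict = torch.load("models/openface/nn4.small2.v1.pt", map_location="cpu")  # assumed path
model.load_state_dict(state_dict)
model.eval()

# The original OpenFace pipeline feeds 96x96 aligned RGB faces and
# produces 128-dimensional embeddings.
face = torch.randn(1, 3, 96, 96)
with torch.no_grad():
    embedding = model(face)  # expected shape: (1, 128)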

Where should the reviewer start?

  • Build a Docker image from the Dockerfile in the project root directory
sudo docker build . -t newface
  • Run a Docker container with GPU enabled
sudo docker run --rm -it --gpus all newface

How should this PR be tested?

  • Run the comparison demo
python3 demos/compare_new.py images/examples/{lennon*,clapton*}
  • Generate representations in batches for an image dataset, e.g. a raw image directory data/mydataset/raw/
python3 batch-represent/batch_represent.py -i data/mydataset/raw/ -o data/mydataset/feats/ --align_out data/mydataset/aligned/
  • Train a new SVM classifier on the generated representations (a rough sketch of this step follows this list)
python3 demos/classifier_new.py train data/mydataset/feats/
  • Run the new classifier
python3 demos/classifier_new.py infer data/mydataset/feats/classifier.pkl data/mydataset/test/*
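
Conceptually, the classifier step consumes the representations written by batch_represent.py. The following is a hedged sketch of that idea using scikit-learn; the file names reps.csv and labels.csv and the exact CSV layout are assumptions and may differ from what classifier_new.py actually reads.

import pickle
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# 128-D embeddings, one row per aligned image (assumed file name and layout).
reps = np.loadtxt("data/mydataset/feats/reps.csv", delimiter=",")
# One class label per row, aligned with reps.csv (assumed file name and layout).
labels = np.loadtxt("data/mydataset/feats/labels.csv", delimiter=",", dtype=str, usecols=0)

encoder = LabelEncoder().fit(labels)
clf = SVC(C=1.0, kernel="linear", probability=True)
clf.fit(reps, encoder.transform(labels))

with open("data/mydataset/feats/classifier.pkl", "wb") as f:
    pickle.dump((encoder, clf), f)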

Note that only the new PyTorch model and the new examples are expected to work; backward compatibility with the previous version has not been tested carefully yet. Model training and the web demo are not supported in this update.

Questions:

  • Do the docs need to be updated?

Yes

  • Does this PR add new (Python) dependencies?

Yes

@bamos (Collaborator) commented Sep 20, 2024

Amazing! It looks really good so far

@T0ny8576 (Contributor, Author) commented Sep 20, 2024 via email

@brmarkus

Please don't make an NVIDIA GPU a hard requirement (except perhaps for (re-)training)...

@jaharkes (Member)

The code changes still seem to support the CPU: if not args.cpu is checked in various places to skip moving data to the GPU. Not sure whether that path was actually tested, though.
The only other CUDA-related change seems to be switching the base image in the Dockerfile from a very old Ubuntu 14.04 image to a more recent nvidia/cuda container. I assume that container will still run on a machine without a GPU; you may have to force the --cpu flag.
An alternative approach could be to use torch.cuda.is_available() in more places so that the --cpu argument isn't really necessary.
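
A minimal sketch of that alternative (the argument name and the dummy model are only illustrative):

import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--cpu", action="store_true", help="force CPU even when CUDA is available")
args = parser.parse_args()

# Pick the device automatically; --cpu only overrides an available GPU,
# and machines without CUDA fall back to the CPU without any flag.
device = torch.device("cuda" if torch.cuda.is_available() and not args.cpu else "cpu")

# Any model and input batch would then be moved with .to(device).
model = torch.nn.Linear(128, 10).to(device)
batch = torch.randn(4, 128, device=device)
print(device, model(batch).shape)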

@brmarkus

Sounds great!
Using such a --cpu flag might need to be documented in prominent places, ideally as part of this change request.

@T0ny8576 marked this pull request as ready for review on September 26, 2024
@teiszler merged commit 0514752 into cmusatyalab:master on Oct 4, 2024
@crsimx commented Oct 19, 2024

Hello all, @T0ny8576
I noticed something strange with the latest master, or maybe I'm misunderstanding it.

From Docker, everything works perfectly with

python3 batch-represent/batch_represent.py -i data/mydataset/raw/ -o data/mydataset/feats/ --align_out data/mydataset/aligned/

and a python process of type "C" (compute) shows up in nvidia-smi on Ubuntu while it runs.

But if I install it manually and run it locally, it never uses the GPU, no GPU process is created, and it is very slow. Does that mean it is running on the CPU only, even though I never pass the --cpu flag?

How can I enable GPU/CUDA without Docker?
Thanks in advance!

@brmarkus

Can you first describe exactly what you are using and doing?

The Dockerfile is now based on a fairly powerful nvidia/cuda base image; see "https://github.com/cmusatyalab/openface/blob/master/Dockerfile":

FROM nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

If you only follow the installation steps from inside the Dockerfile, you miss the central setup provided by the base container (the CUDA toolkit and cuDNN).

I'm not sure I traced it correctly to "https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md#cuda-1180", but the link "https://gitlab.com/nvidia/container-images/cuda/blob/master/dist/11.8.0/ubuntu22.04/devel/cudnn8/Dockerfile" seems to be outdated for that GitLab repo; maybe it was already migrated a couple of years ago.
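
As a quick check outside Docker (a hedged sketch, assuming a local PyTorch install): if the first line prints False, the local PyTorch build is CPU-only or cannot see the NVIDIA driver, and the scripts will fall back to the CPU.

import torch

print(torch.cuda.is_available())   # False -> CPU-only build or driver not visible
print(torch.version.cuda)          # CUDA version PyTorch was built against (None for CPU-only builds)
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the local GPU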
