Install git in Docker image? #238

Closed
m90 opened this issue Jun 22, 2020 · 13 comments

Comments

@m90

m90 commented Jun 22, 2020

Many CI providers allow the use of arbitrary Docker images for running certain tasks, but all of them require git to be installed in the container so that they can do SCM-related tasks like checking out tags et al.

Would it be an option to include git in the final fsfe/reuse image so that it can be used in CI directly?


NB: I'd be happy to add a PR for this, but wanted to ask beforehand if you even want it.

@mxmehl
Member

mxmehl commented Jun 22, 2020

Thanks for the suggestion. May I ask for a use case?

The reason why I am a bit hesitant is that the REUSE Docker image should be as slim as possible. It's being used in many CI pipelines, and to reduce build time and/or disk/download space, a large image should be avoided.

@m90
Author

m90 commented Jun 22, 2020

So my exact use case is using this image in a CircleCI pipeline, where it fails when you build a Git tag. That gives me:

Either git or ssh (required by git to clone through SSH) is not installed in the image. Falling back to CircleCI's native git client but the behavior may be different from official git. If this is an issue, please use an image that has official git and ssh installed.
Enumerating objects: 1014, done.
Counting objects: 100% (1014/1014), done.
Compressing objects: 100% (427/427), done.
Total 12542 (delta 599), reused 936 (delta 550), pack-reused 11528

object not found

If I use an image that has git installed and use pip to install reuse, it works fine. I have not yet tried apk add git in the fsfe/reuse container, though. Maybe that would be the safest way for me.

@carmenbianca
Member

https://github.com/fsfe/reuse-tool/blob/master/Dockerfile

Git is installed. It's probably SSH that's missing.

So my exact use case is using this image in a CircleCI pipeline where it fails when you build a Git Tag

Can you link to the circleci configuration file for this?

@m90
Author

m90 commented Jun 22, 2020

Isn't git just installed in the builder stage and will be excluded from the final image as it's not copied over?

This is the circle config that failed on the tag (branches would work): https://github.com/offen/offen/blob/14244a5eda0fb2cb6fbc5129abc4912be18306ab/.circleci/config.yml

This is our current fix that works against branches and tags: offen/offen@f3cad81#diff-1d37e48f9ceff6d8030570cd36286a61

@carmenbianca
Member

carmenbianca commented Jun 22, 2020

Found it. Normally in a CI system, you take the following steps:

  1. Clone (and prepare) the repo.
  2. Activate the Docker image.
  3. Do the thing that is specific to the Docker image.

You appear to have swapped step 1 and 2. You first activate the Docker image, then check out the repo, and then run reuse lint.

Because the Docker image doesn't have SSH, it can't do Git checkouts.
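
The usual order described above can be sketched roughly as follows. This is only an illustration: the function name, the repository URL, and the checkout path are hypothetical, and it assumes the image's entrypoint is `reuse` and that it lints whatever is mounted at `/data`.

```shell
# Sketch of the usual order: the CI host clones, the container only lints.
# lint_checkout, the URL and the path are placeholders, not taken from this thread.
lint_checkout() {
    repo_url=$1
    workdir=$2   # must be an absolute path so docker treats it as a bind mount

    # Step 1: clone on the host (or CI runner), where git and ssh exist.
    git clone "$repo_url" "$workdir" || return 1

    # Steps 2 and 3: start the image and run the tool against the checkout.
    # Assumes the image's entrypoint is `reuse` and that it lints /data.
    docker run --rm --volume "$workdir":/data fsfe/reuse lint
}

# Example call (not executed here):
#   lint_checkout https://example.com/some/repo.git "$PWD/repo"
```

With this split, the image never needs git or SSH for fetching; it only sees files that are already on disk.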

I am not sure if this can be circumvented within the CircleCI configuration.

@mxmehl Would you argue that being able to clone repositories is important functionality in the Docker image? Git and Mercurial are already installed as runtime dependencies.

@mxmehl
Member

mxmehl commented Jun 22, 2020

I see the same issue as @carmenbianca. Checking out any VCS is not really what an image like REUSE is supposed to do. You normally would want the CI to do that for you in the very first step, and then ask specialised images to apply their magic.

I am not familiar enough with CircleCI to see how it could work, but there are already topics in their forum about tags.

@m90
Author

m90 commented Jun 22, 2020

So the Circle way to do this would be using what they call a "machine" executor, which runs things in a VM instead of inside Docker. Then I could use the fsfe/reuse image in that VM to run the lint step, but I would assume this is much heavier than just using a Python based image, install from pip and then lint like we do right now.
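
For reference, the pip-based route mentioned above boils down to something like the sketch below. It assumes the job already runs in a Python image that ships git and ssh, that the checkout has already happened, and that the current directory is the repository root; the function name is made up for illustration.

```shell
# Sketch of the pip-based workaround: install reuse from PyPI, then lint.
# Assumes python3 and pip are available and the repo is already checked out.
lint_with_pip() {
    # Install the tool into the running container.
    python3 -m pip install reuse || return 1

    # Lint the repository in the current working directory.
    reuse lint
}

# Example call from the repository root (not executed here):
#   lint_with_pip
```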

Thanks for your input!

@mxmehl
Member

mxmehl commented Jun 22, 2020

Woah, that's really not user-friendly by CircleCI...

@carmenbianca I currently would see no big problem with installing ssh into the Docker image except that we would install something that's not fundamentally necessary for runtime. According to my test, openssh-client would add 4M to the uncompressed image. What do you think?

@carmenbianca
Member

I don't have a strong opinion on this. SSH appears to be a nice-to-have for cases—like above—where it's hard to separate out the preparatory stages from actually running the tool. I have personally experienced this a few times with other [Docker] images, where I really wished that the image would just contain Git or GNU core utils, so that I could simply interact with the repository without shuffling about.

The difference here is that it isn't so much trying to interact with the repository, but downloading it. That's something that really really ought to be handled by the CI system. I'm honestly surprised CircleCI doesn't handle that well. Or maybe it does, but none of us know about it?

Adding a workaround for CircleCI seems like a sufficient reason to add SSH, though, especially given that this tool is meant to run in CI environments.

@m90
Author

m90 commented Jun 23, 2020

I'm honestly surprised CircleCI doesn't handle that well.

The docs mention some requirements here https://circleci.com/docs/2.0/custom-images/#required-tools-for-primary-containers which would probably mean installing even more here (tar, gzip, ca-certs?)
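
A quick way to see which of those tools an image lacks could look like the sketch below. The helper name is hypothetical, and the tool list in the example is just the one from this comment; ca-certs is a package rather than a command, so this kind of check does not cover it.

```shell
# Report which of the given command-line tools are missing from PATH.
# Prints one missing tool per line; returns 1 if anything is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example call inside a candidate image (not executed here):
#   missing_tools git ssh tar gzip
```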

Using the fsfe/reuse image as a job executor is indeed a bit of a hack, but I gotta admit it'd be a nice one if it'd work.

I guess the best solution would be if someone (not necessarily you) would create a dedicated reuse-circleci image so that you don't have to bloat the image to suit 3rd parties but people would still get a convenient way of running reuse in such a scenario. Alternatively you could also publish fsfe/reuse:0.1.2-circleci in parallel to the default versions or similar.

@mxmehl
Member

mxmehl commented Jun 23, 2020

I have created the image fsfe/reuse:latest-extra which contains openssh-client. I decided against naming it after the CI because there might be cases in the future where extra applications are needed. It's built from Dockerfile-extra. Could you please test whether it works for you?

The current image was built manually by me; with the next update of the latest tag it should be built automatically. It will not be built for the other tags (dev, versions), as Docker autobuilds do not allow a more flexible setup with the simple Dockerfile-extra.
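
For anyone who wants to try something similar locally, a variant image could be built roughly like this. The Dockerfile content here is an assumption and not a copy of Dockerfile-extra: it assumes the base image is Alpine-based (apk came up earlier in this thread) and that its default user may install packages. The function name and the local tag `reuse-extra:local` are hypothetical.

```shell
# Build a local image that layers openssh-client on top of fsfe/reuse.
# The Dockerfile content below is an assumption, not the project's Dockerfile-extra.
build_reuse_extra() {
    tag=${1:-reuse-extra:local}   # hypothetical local tag

    # "-" makes docker read the Dockerfile from stdin, with no build context.
    docker build --tag "$tag" - <<'EOF'
FROM fsfe/reuse:latest
RUN apk add --no-cache openssh-client
EOF
}

# Example calls (not executed here):
#   build_reuse_extra
#   docker image inspect --format '{{.Size}}' reuse-extra:local
```

Comparing the reported size with that of the base image gives a rough idea of how much the extra package adds.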

@m90
Author

m90 commented Jun 23, 2020

Thank you for this! We'll start using this over at our repo again, but I probably won't be able to tell you if it actually works before we do the next release (i.e. push the next tag). That should be relatively soon though and I'll keep you updated in here.

@mxmehl
Member

mxmehl commented Jun 23, 2020

Alright, I will close this issue then. Please report back and reopen if it fails.
