Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cache after adding the repository, not before installing the package #71

Merged

Conversation

veger
Copy link
Contributor

@veger veger commented Nov 22, 2016

When the repo is added, a update cache is required to be able install Marathon.

When installing/updating Marathon a cache update is not required, as the OS will keep the cache up-to-date.

This prevents changes when nothing actually changed (not idempotent).

(similar PR as AnsibleShipyard/ansible-marathon#39)

@ernestas-poskus ernestas-poskus merged commit 4506777 into AnsibleShipyard:master Nov 22, 2016
@lhoss
Copy link
Contributor

lhoss commented Nov 22, 2016

nice 👍
and the best, I just used the fix in a prod. deployment, as on one server the apt update_cache indeed wasn't done (before the fix), thus resulting in error:

TASK [teralytics.mesos : Add mesosphere repo] **********************************
ok: [node01] => {"changed": false, "repo": "deb http://repos.mesosphere.com/ubuntu trusty main", "state": "present"}

TASK [teralytics.mesos : Install Debian OS packages] ***************************
failed: [node01] (item=[u'wget', u'curl', u'unzip', u'python-setuptools', u'python-dev', u'mesos=1.0.1-2.0.93.ubuntu1404']) => {"failed": true, "item": ["wget", "curl", "unzip", "python-setuptools", "python-dev", "mesos=1.0.1-2.0.93.ubuntu1404"], "msg": "Could not fetch updated apt files"}

with the fix, re-ran it, and it worked 👍

@ernestas-poskus
Copy link
Member

🆒 🍸

@lhoss
Copy link
Contributor

lhoss commented Nov 24, 2016

I'm afraid to say, but removing the update_cache from the apt broke my setup on another cluster, where a proxy is in use (could be also that the apt_repositorydoesn't ensure the update_cache in case it's present!?)
I finally propose to fix all imaginable cases, by re-adding it there, plus avoid a dumb 2x 'apt-get' update, by adding attribute cache_valid_time:
->

  apt: pkg={{ item }} state=present update_cache=yes cache_valid_time=3600

I can provide the PR for ansible-mesos!
Happy if @veger can apply same fix in ansible-marathon (any other roles!?) Thx!

@lhoss
Copy link
Contributor

lhoss commented Nov 24, 2016

the promised (and tested) fix: #72

@veger
Copy link
Contributor Author

veger commented Nov 24, 2016

Weird that the change broke your system... Maybe apt is not keeping your cache up-to-date due to some configuration issue? You PR indeed fixes the issue, but it is not idempotent: with it there is a change every hour

@lhoss
Copy link
Contributor

lhoss commented Nov 25, 2016

actually @veger I just found your comment AnsibleShipyard/ansible-marathon#39 (comment)
and I had actually the same issue in the apttask, exact last line:

"stderr": "E: There are problems and -y was used without --force-yes\n", "stdout":
...
"WARNING: The following packages cannot be authenticated!", "  libevent-core-2.0-5 libevent-extra-2.0-5 libevent-openssl-2.0-5", "  libevent-pthreads-2.0-5 libevent-dev mesos"]

This made the apt task fail (ps: using force would also work, but IMO a worse fix!)

So I propose either my fix, or actually maybe easier to undo both our PR changes (and go back to just have the update_cache in apt)

ps: Would be good to know what's actually the best practice for this super common case!? We could check how it's done in other well knows roles!
For ex following roles (Im also using), only use the update_cache (plus the valid time) in the apt:

@ernestas-poskus
Copy link
Member

Please either follow easier solution or best practices

@veger
Copy link
Contributor Author

veger commented Nov 25, 2016

Ah... authentication issue.

We had the same for our custom/internal repo, which requires HTTPS and a valid SSL key(chain).
We ended up with (converted from several roles into to pseudo(-ish)-code):

- name: Ensure certificate is available
  copy:
    src=/path/to/cert.crt
   dest: /usr/local/share/ca-certificates/

# defined as a handler and force with meta: flush_handlers
- name: update certificates
  shell: update-ca-certificates

  # TODO Use apt_key module (when client certificates or https://<username@password:our.repo.tld/repository.gpg.key is supported)
- name: Check if repository key is available
  shell: apt-key export <pgp key>
  register: our_repo_key
- name: Ensure repository key is available
  shell: curl <params to fetch key from server> | apt-key add -
  when: our_repo_key.stdout.find('-----BEGIN PGP PUBLIC KEY BLOCK-----') == -1

# File our_repo.conf contains: Acquire::https::our.repo.tld::SslCert  "/usr/local/share/ca-certificates/our_repo.crt";
- name: Ensure apt uses our_repo_key client certificate
  copy:
    src: our_repo.conf
    dest: /etc/apt/apt.conf.d/80our_repo

- name: Ensure HMMS repository is available
  apt_repository:
    repo: deb https://our.repo.tld/ubuntu trusty main
    state: present
    update_cache: yes

But this is a bit too much for a generic role. Maybe you can add this before you use the role?

@lhoss
Copy link
Contributor

lhoss commented Nov 25, 2016

@veger thx for above snippet (useful indeed when using an internal repos), but in our case no internal-repos is involved, just a proxy.

Anyway I actually just double checked what happens if the apt_repository task includes the update_cache in all cases, but it does not (not really idempotent behaviour btw)!
See http://docs.ansible.com/ansible/apt_repository_module.html#options
Furthermore update_cache=yes is even the default !

So to ensure that we actually see the new (mesos version) pkg in apt, on a node that already had an older mesos version installed (and didn't have 'apt-get update' for some time), we def. need an explicit update_cache=yes on the apt task!

About the extra cache_valid_time it doesn't matter for me:

  • with the cache_valid_time (like added in my PR, and used often by community to avoid a potential sequential update)
  • without it (situation like before this PR-71)

I hope we can fix this without loosing more time. Thank you both 👍

@veger
Copy link
Contributor Author

veger commented Nov 25, 2016

and didn't have 'apt-get update' for some time

Normally the OS handles this? (At least Ubuntu/Debian do so) Or did you disable this functionality on your machines?

cache_valid_time is useful when multiple tasks execute apt-get update (does it break when a repo is added after an initial apt-get update? Well... for another time/story).

In our situation we execute our Ansible playbook on a daily basis (to stay up-to-date). We would like to see no changes when everything was ok (so we can add events to notify our ops team) For this cache_valid_time does not help, I suppose?

About a solution: I do not know. Are proxies supported by this role? Or does the user have a own responsibility for getting the updates available in special stituations? I guess the owners/maintainers of this role need to decide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants