secure-docker-daemon download problem #1

Closed
rolele opened this issue Oct 11, 2016 · 13 comments

Comments

@rolele

rolele commented Oct 11, 2016

Hi, great project! Thanks a lot.
I have a problem when trying to provision the machines:

$ vagrant provision worker3
==> worker3: Running provisioner: shell (host_shell)...
[stdout] - downloading role 'secure-docker-daemon', owned by ansible

[stderr]  [WARNING]: - ansible.secure-docker-daemon was NOT installed successfully: -
sorry, ansible.secure-docker-daemon was not found on
https://galaxy.ansible.com.
ERROR! - you can use --ignore-errors to skip failed roles and finish processing the list.

==> worker3: Running provisioner: swarm (ansible)...
==> worker3: Vagrant has detected a host range pattern in the `groups` option.
==> worker3: Vagrant doesn't fully check the validity of these parameters!
==> worker3:
==> worker3: Please check https://docs.ansible.com/ansible/intro_inventory.html#hosts-and-groups
==> worker3: for more information.
    worker3: Running ansible-playbook...
PYTHONUNBUFFERED=1 ANSIBLE_FORCE_COLOR=true ANSIBLE_HOST_KEY_CHECKING=false ANSIBLE_SSH_ARGS='-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s' ansible-playbook --connection=ssh --timeout=30 --limit="all" --inventory-file=/Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/.vagrant/provisioners/ansible/inventory -vv ansible/swarm.yml
No config file found; using defaults
ERROR! the role 'ansible.secure-docker-daemon' was not found in /Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible/roles:/Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible:/etc/ansible/roles

The error appears to have been in '/Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible/swarm.yml': line 17, column 7, but may
be elsewhere in the file depending on the exact syntax problem.

The offending line appears to be:

  roles:
    - role: ansible.secure-docker-daemon
      ^ here

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

It seems that secure-docker-daemon is not on Galaxy anymore.
Could you push the secure-docker-daemon role that you have downloaded locally to GitHub?
Otherwise, is there an easy way to make it work in insecure mode?

rolele changed the title from "provided hosts list is empty, only localhost is available" to "secure-docker-daemon download problem" on Oct 11, 2016
@jamesdmorgan
Owner

jamesdmorgan commented Oct 12, 2016

Hi,

Thanks for the interest. The role is available on GitHub. I'm unsure why it has dropped off Galaxy. It's not actually being actively used in the project. I had wanted to speed up the build process and copy images from one Docker daemon to another...

I'll have a look into it. In the meantime, you could clone the role from GitHub into the roles directory. I have changed the project so that it skips that role by default.
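
As an alternative to cloning by hand, the role could be pulled straight from GitHub through the Galaxy requirements file. This is only a sketch: the repository URL is the one quoted further down in this thread, the `master` branch is an assumption, and `name` is set to `ansible.secure-docker-daemon` so that it matches the role name the playbook looks for in the error output above.

```yaml
# ansible/requirements.yml (sketch, not the project's actual file)
---
- src: https://github.com/ansible/role-secure-docker-daemon.git
  scm: git
  version: master                      # assumption: default branch of that repository
  name: ansible.secure-docker-daemon   # must match the role name used in swarm.yml
```

Running `ansible-galaxy install -r ansible/requirements.yml -p ansible/roles` should then place the role in the directory the playbook searches.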

I have removed the role from my local roles dir and trashed the VMs. There is an issue with my guest additions that I'll look into, but the boxes were created. Running provision now ignores the failed role:

$ vagrant provision
==> worker3: Running provisioner: shell (host_shell)...
[stdout] - downloading role 'secure-docker-daemon', owned by ansible

[stderr]  [WARNING]: - ansible.secure-docker-daemon was NOT installed successfully: -
sorry, ansible.secure-docker-daemon was not found on
https://galaxy.ansible.com.

==> worker3: Running provisioner: swarm (ansible)...
TASK [debug] *******************************************************************
task path: /Users/jamesdmorgan/Documents/Projects/vagrant-ansible-docker-swarm/ansible/swarm.yml:101
ok: [manager1] => {
    "docker_swarm_info.Swarm": {
        "Cluster": {
            "CreatedAt": "2016-10-12T07:17:48.055902593Z",
            "ID": "5keimcdlv3tseov2cm23p6495",
            "Spec": {
                "CAConfig": {
                    "NodeCertExpiry": 7776000000000000
                },
                "Dispatcher": {
                    "HeartbeatPeriod": 5000000000
                },
                "Name": "default",
                "Orchestration": {
                    "TaskHistoryRetentionLimit": 5
                },
                "Raft": {
                    "ElectionTick": 3,
                    "HeartbeatTick": 1,
                    "LogEntriesForSlowFollowers": 500,
                    "SnapshotInterval": 10000
                },
                "TaskDefaults": {}
            },
            "UpdatedAt": "2016-10-12T07:17:48.086711966Z",
            "Version": {
                "Index": 11
            }
        },
        "ControlAvailable": true,
        "Error": "",
        "LocalNodeState": "active",
        "Managers": 3,
        "NodeAddr": "192.168.77.21",
        "NodeID": "4m9esoo3rp766a5mwwgbdglf9",
        "Nodes": 7,
        "RemoteManagers": [
            {
                "Addr": "192.168.77.23:2377",
                "NodeID": "9ygxeonuyzexvz77i9ai6qilt"
            },
            {
                "Addr": "192.168.77.22:2377",
                "NodeID": "0suvcuim568ag1c8zhdgt93ab"
            },
            {
                "Addr": "192.168.77.21:2377",
                "NodeID": "4m9esoo3rp766a5mwwgbdglf9"
            }
        ]
    }
}

J

@rolele
Author

rolele commented Oct 13, 2016

Thanks, it helped a lot.
I could start the cluster and I could see that all the secure-docker-daemon tasks were skipped.

This project uses the latest features of Docker 1.12, so I have learned a lot while playing with it.

I could start the stack without monitoring:

[root@manager1 vagrant]# docker service ls
ID            NAME       REPLICAS  IMAGE               COMMAND
32tspkwqemq3  mongo      1/1       mongo:3.2
6vfw9cpeld4g  redis      1/1       redis:3.0.7-alpine
7tvem43a79z0  www        3/3       lucj/demo-www:1.0
8mh8w49ar4gn  docker-ui  global    uifd/ui-for-docker
ayrjcoyqk2j4  api        3/3       lucj/demo-api:1.0

but as soon as I run

$ vagrant provision --provision-with monitoring
==> worker3: Running provisioner: monitoring (ansible)...
==> worker3: Vagrant has detected a host range pattern in the `groups` option.
==> worker3: Vagrant doesn't fully check the validity of these parameters!
==> worker3:
==> worker3: Please check https://docs.ansible.com/ansible/intro_inventory.html#hosts-and-groups
==> worker3: for more information.
    worker3: Running ansible-playbook...
PYTHONUNBUFFERED=1 ANSIBLE_FORCE_COLOR=true ANSIBLE_HOST_KEY_CHECKING=false ANSIBLE_SSH_ARGS='-o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o ControlMaster=auto -o ControlPersist=60s' ansible-playbook --connection=ssh --timeout=30 --limit="all" --inventory-file=/Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/.vagrant/provisioners/ansible/inventory --sudo -vv ansible/monitoring.yml
No config file found; using defaults

PLAYBOOK: monitoring.yml *******************************************************
1 plays in ansible/monitoring.yml

PLAY [managers[0]] *************************************************************

TASK [setup] *******************************************************************
ok: [manager1]

TASK [influxdb : Get existing services] ****************************************
task path: /Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible/roles/influxdb/tasks/main.yml:2
changed: [manager1] => {"changed": true, "cmd": "docker service ls --filter name=influxdb | tail -n +2", "delta": "0:00:00.019809", "end": "2016-10-13 04:55:10.901074", "rc": 0, "start": "2016-10-13 04:55:10.881265", "stderr": "", "stdout": "", "stdout_lines": [], "warnings": []}

TASK [influxdb : Stopped existing influxdb service] ****************************
task path: /Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible/roles/influxdb/tasks/main.yml:7
skipping: [manager1] => {"changed": false, "skip_reason": "Conditional check failed", "skipped": true}

TASK [influxdb : Running Influxdb] *********************************************
task path: /Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible/roles/influxdb/tasks/main.yml:14
changed: [manager1] => {"changed": true, "cmd": "docker service create --name influxdb --network appnet -p 8083:8083/tcp -p 8086:8086/tcp --env \"PRE_CREATE_DB=riemann-local\" --env \"INFLUXDB_INIT_PWD=password\" --env \"CONSUL_SERVICE_PORT=8086\" --log-driver syslog --log-opt tag='{{.ImageName}}/{{.Name}}/{{.ID}}' --constraint 'node.labels.influxdb == true' tutum/influxdb:0.12", "delta": "0:00:00.022164", "end": "2016-10-13 04:55:11.212663", "rc": 0, "start": "2016-10-13 04:55:11.190499", "stderr": "", "stdout": "5468dkxbalkhnhshna7wsw4or", "stdout_lines": ["5468dkxbalkhnhshna7wsw4or"], "warnings": []}

TASK [riemann : Build riemann container] ***************************************
task path: /Users/others/working/aa_working/ansible/vagrant-ansible-docker-swarm/ansible/roles/riemann/tasks/main.yml:2
fatal: [manager1]: FAILED! => {"changed": false, "failed": true, "msg": "Error: docker-py version is 1.10.3. Minimum version required is 1.7.0."}

NO MORE HOSTS LEFT *************************************************************
        to retry, use: --limit @ansible/monitoring.retry

PLAY RECAP *********************************************************************
manager1                   : ok=3    changed=2    unreachable=0    failed=1

Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.

The monitoring stack did not start because of a Python problem.
The InfluxDB service is registered, but 0 containers have started:

[root@manager1 vagrant]# docker service ls
ID            NAME       REPLICAS  IMAGE                COMMAND
32tspkwqemq3  mongo      1/1       mongo:3.2
5468dkxbalkh  influxdb   0/1       tutum/influxdb:0.12
6vfw9cpeld4g  redis      1/1       redis:3.0.7-alpine
7tvem43a79z0  www        3/3       lucj/demo-www:1.0
8mh8w49ar4gn  docker-ui  global    uifd/ui-for-docker
ayrjcoyqk2j4  api        3/3       lucj/demo-api:1.0

@jamesdmorgan
Owner

jamesdmorgan commented Oct 13, 2016

Hi,

Glad you made some progress. It looks like the versions have moved on since I last tested monitoring. There is an issue with Ansible and the latest version of docker-py (ansible/ansible#17495), though using v1.9.0 works. I have updated the project and it gets further, though I'm still seeing issues with running services. Will investigate.
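
A minimal sketch of pinning docker-py with Ansible's `pip` module, assuming it runs on the hosts that execute the Docker modules; which role or playbook it belongs in is not stated in this thread. The odd wording of the error (1.10.3 reported as older than 1.7.0) would be consistent with the version strings being compared as text rather than numerically, though that is an inference rather than something confirmed here.

```yaml
# Hypothetical task; place it before any task that uses the Ansible Docker modules.
- name: Pin docker-py to the version reported to work above
  become: yes
  pip:
    name: docker-py
    version: "1.9.0"
    state: present
```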

James

@jamesdmorgan
Owner

jamesdmorgan commented Oct 13, 2016

This looks similar to moby/moby#27208. I believe I originally used Docker 1.12.0.

"Error response from daemon: rpc error: code = 3 desc = EndpointSpec: duplicate published ports provided"

TASK [riemann : Running riemann service] ***************************************
task path: /Users/jamesdmorgan/Documents/Projects/vagrant-ansible-docker-swarm/ansible/roles/riemann/tasks/main.yml:38
fatal: [manager1]: FAILED! => {"changed": true, "cmd": "docker service create --name riemann --network appnet --mount type=bind,src=/vagrant/riemann/src/,dst=/opt/riemann/etc/ -p 5555:5555/tcp -p 5555:5555/udp -p 5556:5556/tcp --env \"RIEMANN_INFLUXDB_DBHOST: influxdb\" --env \"RIEMANN_INFLUXDB_DBNAME: riemann-local\" --env \"RIEMANN_INFLUXDB_USER: root\" --env \"RIEMANN_INFLUXDB_PASSWORD: password\" riemann:latest", "delta": "0:00:00.032069", "end": "2016-10-13 08:07:45.824968", "failed": true, "rc": 1, "start": "2016-10-13 08:07:45.792899", "stderr": "Error response from daemon: rpc error: code = 3 desc = EndpointSpec: duplicate published ports provided", "stdout": "", "stdout_lines": [], "warnings": []}

@rolele
Author

rolele commented Oct 13, 2016

In my case I do not want to use riemann or collectd. I have tested telegraf (which embeds statsd); it collects all the stats of your Docker containers and system, and is developed by the same company as InfluxDB. So I will adapt the project to see if I can have a different monitoring stack.

One question: I edited requirements.yml:

---
- name: secure-docker-daemon
  src: https://github.com/ansible/role-secure-docker-daemon.git
  version: origin/master

How can I enable the secure-docker-daemon role again, please?

@jamesdmorgan
Owner

jamesdmorgan commented Oct 13, 2016

Hi, OK, sounds good. The issue I'm seeing seems to relate to moby/swarmkit#1495, so it's possible it'll come up again when registering the same port for both UDP and TCP.

To turn the secure daemon role on, you should add

docker_secure: true

in the group vars file. It has not been properly tested, though the role didn't break when I last ran it.
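
For illustration, the toggle might be wired up as below. The `docker_secure` variable comes from the comment above; the `group_vars/all.yml` path is borrowed from the telegraf suggestion later in this thread, and the `when:` condition on the role and the `hosts: all` target are assumptions about how the playbook skips the role by default, not a copy of the project's code.

```yaml
# group_vars/all.yml (assumed location of the group vars file)
docker_secure: true
---
# ansible/swarm.yml (sketch; the real play may differ)
- hosts: all
  roles:
    # The role is skipped unless docker_secure is set to true.
    - { role: ansible.secure-docker-daemon, when: docker_secure | default(false) }
```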

@jamesdmorgan
Owner

Re telegraf, you could do:

roles:
    - { role: common, tags: [common] }
    - { role: influxdb, tags: [influxdb] }
    - { role: riemann, tags: [riemann], when: riemann_enabled | default(true) }
    - { role: collectd, tags: [collectd], when: collectd_enabled | default(true) }
    - { role: telegraf, tags: [telegraf], when: telegraf_enabled | default(false) }
    - { role: grafana, tags: [grafana] }

group_vars/all.yml

riemann_enabled: false
collectd_enabled: false
telegraf_enabled: true

or something like that

@rolele
Author

rolele commented Oct 13, 2016

Thanks @jamesdmorgan.
I ran into some trouble trying to get a config file template mounted by my service.
I created a "templates" folder and added my custom config template.

---
name: influxdb
conf_file: influxdb.conf
conf_dir: /etc/influxdb
conf_path: "{{conf_dir}}/{{conf_file}}"
service_definition: >
  --name {{ name }}
  --network appnet
  -p 8083:8083/tcp
  -p 8086:8086/tcp
  --constraint 'node.labels.influxdb == true'
  --mount type=bind,src={{conf_path}},dst={{conf_path}}
  influxdb -config {{conf_path}}

I created a task to make sure the config file is copied to the host before the service mounts it:

- name: Ensures {{ conf_dir }} dir exists
  file:
    path: "{{ conf_dir }}"
    state: directory

- name: Config file is present
  template:
    src: "{{ conf_file }}" 
    dest: "{{ conf_dir }}/{{ conf_file }}"
  register: config_result

- name: Running Influxdb
  become: yes
  become_user: root
  shell: >
    docker service create {{ service_definition }}

The problem is that the service will run on a random node, and that particular node will not have the config file.
Either I copy the config file to each node (which is a very bad solution), or I use a volume that can move between nodes.
I will try Flocker https://docs.clusterhq.com/en/latest/docker-integration/manual-install.html
which looks like the right solution.
Flocker provides a way to create a named volume that follows the container to whichever node of the swarm cluster it is deployed on.

@jamesdmorgan
Owner

Hi,

The constraint

--constraint 'node.labels.influxdb == true'

should restrict it to nodes that have that label. Currently, that should be only 1 host. Persistent volumes are a big issue, and one that could be addressed with something like Flocker. I haven't played with it yet, but I looked into it a while back. I'd be interested to know how you get on.

To double-check the labels, you can inspect the nodes from manager1. Influx was restricted for that reason.
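
One way to do that check from Ansible, sketched under a few assumptions: `managers[0]` is the host pattern that appears in the monitoring play output earlier in this thread, `worker3` stands in for whichever node is being checked, and the tasks only read state.

```yaml
- hosts: managers[0]
  become: yes
  tasks:
    - name: List the nodes in the swarm
      command: docker node ls
      register: node_list
      changed_when: false

    - name: Inspect one node (labels are listed under Spec.Labels in the JSON)
      command: docker node inspect worker3   # worker3 is a placeholder node name
      register: node_info
      changed_when: false

    - name: Show the inspect output
      debug:
        var: node_info.stdout_lines
```

If the `influxdb` label turns out to be missing, it could be added with `docker node update --label-add influxdb=true <node>`; how the project itself sets the label is not shown in this thread.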

@rolele
Author

rolele commented Oct 15, 2016

Hi,
Because you were interested to see what I did with Flocker, I created a pull request so you can have a look. This is not something you will want to merge at all. It is very experimental.
Cheers

@jamesdmorgan
Owner

Hey

Thanks, I look forward to it. Will take a look tomorrow.

James

@pilasguru
Contributor

pilasguru commented Jul 10, 2017

The ansible.secure-docker-daemon role does not exist on Galaxy, but there is an alexinthesky.secure-docker-daemon role.

I changed ansible/requirements.yml and the role, and everything runs OK!
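
The change described above might look roughly like this; the exact contents of the pull request are not shown in this thread, so treat both fragments as assumptions.

```yaml
# ansible/requirements.yml (sketch)
---
- src: alexinthesky.secure-docker-daemon

# ansible/swarm.yml would then reference the role under its new name, e.g.:
# roles:
#   - role: alexinthesky.secure-docker-daemon
```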

I have opened a Pull Request about this and other fixes.

I love your work on this environment, @jamesdmorgan, and now I am starting to learn Docker Swarm. THANK YOU!

Rodolfo

@jamesdmorgan
Owner

Thanks, Rodolfo. I've merged the PR. Cheers
