Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Service address is not updated if the service is already registered #264

Closed
MarkusMattinen opened this issue Jan 2, 2017 · 3 comments
Closed

Comments

@MarkusMattinen
Copy link

MarkusMattinen commented Jan 2, 2017

Related to #12 and #215.

If a container with ContainerPilot is stopped non-gracefully and restarted, the service definition is still registered on the discovery backend, but the actual IP address of the container may have changed. ContainerPilot only sees that a service is already registered with the same ID and starts refreshing the TTL, but the address is not updated to the discovery backend.

This is easy to reproduce because Docker seems to allocate smallest IP addresses first, for example with the following steps:

  • Start any container A.
  • Start container B with ContainerPilot.
  • Stop container A.
  • Kill and restart container B. It should now have A's previous IP address.

Tested on Docker 1.12.5 and ContainerPilot 2.6.0.

A similar situation can also occur if ContainerPilot fails to deregister the service during a graceful shutdown (e.g. because of a temporary network problem).

@tgross tgross added the bug label Jan 3, 2017
@tgross
Copy link
Contributor

tgross commented Jan 3, 2017

Ok, this is arising from the workflow we have for registration. We never register a service until it's been marked healthy, and so we end up doing the following in SendHeartbeat:

  • <checkID> is service name + hostname (container name)
  • POST /v1/agent/check/pass/<checkID> to mark the check as passed. On error (we haven't seen the service before):
    • POST /v1/agent/service/register
    • Retry POST /v1/agent/check/pass/<checkID> to mark the check as passed.

The advantage of this design is that we don't have to carry around state about whether we've registered previously and we get automatic retries of registration if the initial registration fails for any reason.

But what you're seeing is definitely going to be a problem in unsupervised Docker environments. (On Triton your instance would get the same IP on that restart.) It will also be a problem if we change the binding address for the service via the interfaces configuration.

It seems like the best way to fix this would be to always attempt to register the service before the first health check is fired. I've verified that registration will update in Consul, so we can retry registration on a loop until it succeeds or we get a message that the service is already registered (see below).

There will be a behavior change in that the service will be registered and "unhealthy" until the first health check succeeds, but that merely reverts the behavior we improved in #225 so I wouldn't really consider that a backwards compatibility issue.

FWIW, it looks like this will be the case w/ the etcd discovery backend as well.


Demo to verify service registration API supports silent update:

$ curl -so /dev/null -w "%{http_code}\n" -d '{"Name": "test1", "Address": "192.168.1.1"}' http://localhost:8500/v1/agent/service/register
200

$ curl -s http://localhost:8500/v1/agent/services | jq
{
  "consul": {
    "ID": "consul",
    "Service": "consul",
    "Tags": [],
    "Address": "",
    "Port": 8300,
    "EnableTagOverride": false,
    "CreateIndex": 0,
    "ModifyIndex": 0
  },
  "test1": {
    "ID": "test1",
    "Service": "test1",
    "Tags": null,
    "Address": "192.168.1.1",
    "Port": 0,
    "EnableTagOverride": false,
    "CreateIndex": 0,
    "ModifyIndex": 0
  }
}

$ curl -so /dev/null -w "%{http_code}\n" -d '{"Name": "test1", "Address": "192.168.1.2"}' http://localhost:8500/v1/agent/service/register
200

$ curl -s http://localhost:8500/v1/agent/services | jq
{
  "consul": {
    "ID": "consul",
    "Service": "consul",
    "Tags": [],
    "Address": "",
    "Port": 8300,
    "EnableTagOverride": false,
    "CreateIndex": 0,
    "ModifyIndex": 0
  },
  "test1": {
    "ID": "test1",
    "Service": "test1",
    "Tags": null,
    "Address": "192.168.1.2",
    "Port": 0,
    "EnableTagOverride": false,
    "CreateIndex": 0,
    "ModifyIndex": 0
  }
}

@tgross
Copy link
Contributor

tgross commented Jan 27, 2017

Opened #275

@tgross
Copy link
Contributor

tgross commented Jan 27, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants