This repository has been archived by the owner on Oct 16, 2020. It is now read-only.

cloud-init by default sets the fully qualified hostname on AWS #1272

Closed
mvanholsteijn opened this issue May 9, 2016 · 8 comments

Comments

@mvanholsteijn

I have created a VPC with a private hosted zone.

When a CoreOS machine boots, its hostname is set to the value returned by the AWS hostname metadata, e.g. http://169.254.169.254/latest/meta-data/hostname.

This contains a fully qualified domain name (ip_x_x_x_x.). Unfortunately, this name does not resolve in DNS.

When I read the cloud-init for CoreOS, it explicitly states that the hostname should be set to the non-qualified part of the name, e.g. ip_x_x_x_x.

Can you please change the cloud-init to use the unqualified hostname?

@mvanholsteijn
Author

A temporary workaround for this issue is to add the following unit to your cloud-config:

    - name: sethostname.service
      command: start
      content: |
        [Unit]
        Description=Set Hostname Workaround https://github.com/coreos/bugs/issues/1272

        [Service]
        Type=oneshot
        ExecStart=/bin/sh -c "/usr/bin/hostnamectl set-hostname $(curl -s http://169.254.169.254/latest/meta-data/hostname | cut -d' ' -f1 | cut -d. -f1)"

        [Install]
        WantedBy=local.target
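
The pipeline inside ExecStart can be tried on its own, without systemd or the metadata service. Below is a minimal sketch, assuming the metadata value may hold several space-separated names. `short_hostname` is a hypothetical helper name and the sample value is made up for illustration; neither is part of the unit above.

```shell
#!/bin/sh
# short_hostname is a hypothetical helper (not part of the unit above).
# It keeps the first space-separated name, then the first dot-separated
# label, the same way the two cut calls in ExecStart do.
short_hostname() {
    printf '%s\n' "$1" | cut -d' ' -f1 | cut -d. -f1
}

# Sample value for illustration only. On an instance the input would come from
# curl -s http://169.254.169.254/latest/meta-data/hostname
short_hostname "ip-10-0-1-11.eu-west-1.compute.internal"
```

With the sample value this prints `ip-10-0-1-11`, which is the name the unit would pass to `hostnamectl set-hostname`.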

@crawford
Contributor

I tested this out on my machine with the hostname example.crawford.com. and it seems to work correctly:

$ hostnamectl set-hostname example.crawford.com.
$ hostnamectl 
   Static hostname: example.crawford.com
   ...
$ hostname --short
example
$ hostname --domain
crawford.com
$ hostname --fqdn  
example.crawford.com

So I don't see the problem. Is the issue that the hostname doesn't resolve in DNS?

@mvanholsteijn
Author

Yes, hostnamectl works correctly.

But in AWS, with a private DNS zone attached to the VPC (for instance .test), we cannot resolve ephemeral instance hostnames such as ip-10-0-1-11.test, only ip-10-0-1-11.eu-west-1.compute.internal. We do have both domains in the search configuration, which allows the unqualified name ip-10-0-1-11 to resolve.

AFAIK, AWS does not have an elegant way to resolve the ephemeral instance hostnames in the private zone, aside from adding all IP address hostnames to the private zone.
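
To make the search-list behaviour concrete: a resolver given an unqualified name tries it with each search domain appended in turn. Below is a minimal sketch that only prints those candidate names. `candidates` is a hypothetical helper, and the two domains are the examples used in this comment, not values read from a real resolv.conf.

```shell
#!/bin/sh
# candidates SHORT DOMAIN...: print SHORT qualified with each search domain,
# in the order the domains are given. Hypothetical helper for illustration.
candidates() {
    short=$1
    shift
    for domain in "$@"; do
        printf '%s.%s\n' "$short" "$domain"
    done
}

# Example domains taken from the comment above.
candidates ip-10-0-1-11 test eu-west-1.compute.internal
```

In the setup described here only the second candidate exists in DNS, which is why the short name resolves while the name under the private zone does not.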

Cheers, Mark

On Wed, 18 May 2016 at 23:05, Alex Crawford wrote:
> So I don't see the problem. Is the issue that the hostname doesn't resolve in DNS?

@crawford
Contributor

crawford commented Jul 6, 2016

This sounds like a purely AWS issue. I don't think CoreOS should make any changes to fix this quirk. You'll want to follow up with your AWS support rep to see if they have any suggestions. I'm going to raise this issue with them as well.

crawford closed this as completed Jul 6, 2016

@rbellamy

@crawford my team and I have been struggling with this exact issue for months now. From my research, while this may be primarily an AWS (or OpenStack) issue, when you combine AWS + cloud-init + CoreOS + Kubernetes, it becomes a show-stopper.

When the AWS EC2 instance launches, all the machinery of cloud-init focuses on getting the hostname via the meta-data service, and when combined with a custom DHCP option set for a VPC, the hostname is configured to something that's not resolvable. And once that hostname is set to an unresolvable value, etcd cannot find its workers and controllers, and vice versa.

In other words, while at first blush this is clearly an AWS quirk, the fact that CoreOS doesn't address this quirk is a failing story for CoreOS.
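
One quick way to confirm this failure mode on a running instance is to check whether the machine's own hostname resolves. Below is a minimal sketch, assuming `getent` and `hostname` are available on the host. `hostname_resolves` and `report` are hypothetical helper names, not part of any CoreOS tooling.

```shell
#!/bin/sh
# hostname_resolves NAME: succeed if the system resolver can look up NAME.
# getent consults the sources configured in nsswitch.conf (files, DNS, ...).
hostname_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# report NAME: print a one-line verdict for NAME.
report() {
    if hostname_resolves "$1"; then
        echo "ok: $1 resolves"
    else
        echo "warning: $1 does not resolve"
    fi
}

# Check the hostname this machine currently has.
report "$(hostname)"
```

A warning here for the name that cloud-init set would match the symptom described above.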

I've been hoping that kube-aws would help me resolve the impedance mismatch between these various IaaS layers, but haven't found the love.

You can see my colleague's attempts to find a resolution in kubernetes-retired/kube-aws#62 (comment) and coreos/coreos-kubernetes#675 (comment). Note that each of his attempts at troubleshooting failed cluster deployments comes back to this root cause.

In closing, I'm eagerly awaiting the results of your conversation with an AWS rep.

@crawford
Contributor

@rbellamy We are no longer working on coreos-cloudinit due to fundamental design issues. We've instead invested in Ignition and coreos-metadata. Since coreos-metadata doesn't explicitly set the hostname (we rely on DHCP), this might just start working if you migrate to Ignition. I haven't personally tested this, but I am curious to know if it works.

@rbellamy

rbellamy commented Dec 30, 2016

Wait, @crawford - are you saying that the entire coreos-cloudinit tooling is being deprecated in favor of something completely new?

From what I can see kube-aws is still very much relying on coreos-cloudinit. When I do a search in the kube-aws repo for Ignition or coreos-metadata, I see that kube-aws has the beginnings of support for Ignition, but currently only enough to recognize it and bark at me if I try to use it. Nothing mentioned about coreos-metadata.

@mumoshu, were you aware that coreos-cloudinit is considered deprecated in favor of Ignition and coreos-metadata?

@crawford
Contributor

Yes, coreos-cloudinit is going to be deprecated in favor of Ignition. It's not going anywhere anytime soon, but we won't be advising people to use it any more.
