maas-devel team mailing list archive

juju, maas, ceph-mon, cloud-init: /etc/hosts overrides DNS, ceph doesn't listen on 127.0.1.1

 

Hi,

Sorry for the cross-post, but I think this is a legitimate case for one. This
bug could be coming from a lot of different places.

TL;DR: cloud-init added "127.0.1.1 host.domain" to /etc/hosts, ceph-mon
doesn't listen on 127.0.1.1, so local connections to host.domain:6789 fail.

Bug: https://bugs.launchpad.net/charms/+source/ceph/+bug/1365671

juju, using maas as a provider, deployed ceph and ceph-osd. 4 units in
total (3 ceph, 1 ceph-osd).

ceph is working fine, thanks.

On a ceph unit, /etc/hosts has this entry:

127.0.1.1 clipper.scapestack clipper


That fqdn, however, resolves correctly via DNS:

ubuntu@clipper:~$ dig +short clipper.scapestack
10.96.5.247


If I try to connect to ceph-mon, which listens on port 6789, using the
hostname, or even the fqdn, that fails:

ubuntu@clipper:~$ telnet clipper.scapestack 6789
Trying 127.0.1.1...
telnet: Unable to connect to remote host: Connection refused
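
The 127.0.1.1 in that output is what you would expect if the resolver consults /etc/hosts before DNS. I'm assuming the usual "hosts: files dns" order in /etc/nsswitch.conf here; I haven't pasted that file. A minimal Python sketch of a files-first lookup follows. lookup_in_hosts is a made-up helper for illustration, not anything from libc or cloud-init:

```python
def lookup_in_hosts(hosts_text, name):
    """Return the first address in hosts_text whose line lists name, else None.

    Hypothetical helper: a rough imitation of a "files" lookup, which is
    assumed to run before DNS on this machine.
    """
    wanted = name.lower()
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        if wanted in (n.lower() for n in names):
            return address
    return None


# The 127.0.1.1 entry is the one from the ceph unit; the localhost line is
# assumed to be there as on any stock install.
HOSTS = """\
127.0.0.1 localhost
127.0.1.1 clipper.scapestack clipper
"""

print(lookup_in_hosts(HOSTS, "clipper.scapestack"))  # 127.0.1.1
```

Since the fqdn matches in the file, the DNS answer (10.96.5.247) is never asked for by telnet, while dig goes straight to DNS and ignores /etc/hosts.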


It fails because ceph-mon doesn't listen on 127.0.1.1:

ubuntu@clipper:~$ sudo netstat -anp|grep ceph-mon
tcp 0 0 10.96.5.247:6789 0.0.0.0:* LISTEN 62208/ceph-mon
tcp 0 0 10.96.5.247:6789 10.96.5.245:54965 ESTABLISHED 62208/ceph-mon
tcp 0 0 10.96.5.247:6789 10.96.5.244:44555 ESTABLISHED 62208/ceph-mon
tcp 0 0 10.96.5.247:6789 10.96.5.244:44565 ESTABLISHED 62208/ceph-mon
unix 2 [ ACC ] STREAM LISTENING 64110 62208/ceph-mon /var/run/ceph/ceph-mon.clipper.asok


So we end up in a situation where I can't connect to the service that is
running on the same machine using the hostname or the fqdn, because
cloud-init was told to munge /etc/hosts. It was told so via
"manage_etc_hosts: 'localhost'" (see
https://bugs.launchpad.net/charms/+source/ceph/+bug/1365671/comments/2). I
don't know who did that: juju or maas (or something else).

In our use case we have a subordinate charm that relates to ceph and tries
to gather storage usage information. It uses the fqdn, and is failing. We
can probably work around it, but I thought this should be brought to a wider
audience first.

I filed a bug against the charm for now, thinking that possibly the easiest
solution would be to have ceph-mon also listen on localhost (
https://bugs.launchpad.net/charms/+source/ceph/+bug/1365671).

Some thoughts:
- if you need to connect to ceph-mon, use the real ip (tricky? Try all the
interfaces on the machine?)
- have ceph-mon listen on all addresses
- don't add the fqdn to /etc/hosts, just the hostname, when adding the
127.0.1.1 entry. What would this break? Maybe hostname -f on containers?
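
To make the first thought a bit more concrete, here is a rough Python sketch of "try the addresses until one answers". first_listening_address is a hypothetical helper, and how the candidate list gets built (enumerating interfaces, asking juju, etc.) is left open; this is not what the charm does today:

```python
import socket


def first_listening_address(candidates, port, timeout=1.0):
    """Return the first address in candidates that accepts a TCP connection
    on port, or None if none of them do.

    Hypothetical helper: the caller decides where candidates come from.
    """
    for address in candidates:
        try:
            # create_connection raises OSError on refusal or timeout.
            with socket.create_connection((address, port), timeout=timeout):
                return address
        except OSError:
            continue
    return None


# On the unit above, a call like this would skip 127.0.1.1 (connection
# refused) and return the real address:
# first_listening_address(["127.0.1.1", "10.96.5.247"], 6789)
```

It works, but it means every client has to carry this kind of probing, which is part of why the other two options look more attractive to me.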
