group.of.nepali.translators team mailing list archive
Message #15954
[Bug 1602192] Re: when starting many LXD containers, they start failing to boot with "Too many open files"
This bug was fixed in the package lxd - 2.0.10-0ubuntu1~16.04.2
---------------
lxd (2.0.10-0ubuntu1~16.04.2) xenial; urgency=medium
* Fix regression in image update logic (LP: #1712455):
- 0005-Fix-regression-in-image-auto-update-logic.patch
- 0006-lxd-images-Carry-old-cached-value-on-refresh.patch
- 0007-Attempt-to-restore-the-auto_update-property.patch
* Ship a sysctl.d file that bumps inotify watches count. (LP: #1602192)
* Update debian/watch to look only at LTS releases.
-- Stéphane Graber <stgraber@xxxxxxxxxx>  Tue, 22 Aug 2017 20:39:36 -0400
** Changed in: lxd (Ubuntu Xenial)
Status: Fix Committed => Fix Released
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह (the Nepali language translators' team), which is
subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1602192
Title:
when starting many LXD containers, they start failing to boot with
"Too many open files"
Status in lxd package in Ubuntu:
Fix Released
Status in lxd source package in Xenial:
Fix Released
Bug description:
== SRU
=== Rationale
LXD containers using systemd use a very large number of inotify watches. As a result, a system will typically exhaust the global limit with as few as 15 Ubuntu 16.04 containers.
An easy fix for the issue is to bump the per-user inotify instance
limit (fs.inotify.max_user_instances) to 1024, making it possible to
run around 100 containers before hitting the limit again.
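To see where a given host currently stands, the limit is readable from procfs. A quick check, assuming a Linux host:

```shell
# Print the current per-user inotify instance limit; on an unpatched
# host this is typically the kernel default of 128.
cat /proc/sys/fs/inotify/max_user_instances
```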
To do so, LXD is now shipping a sysctl.d file which bumps that
particular limit on systems that have LXD installed.
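In effect, the shipped drop-in is a one-line sysctl fragment along these lines (the path and filename here are illustrative assumptions, not taken from the changelog):

```
# /etc/sysctl.d/10-lxd-inotify.conf  (illustrative name)
# Raise the per-user inotify instance limit so that many
# systemd-based containers can each create their own instances.
fs.inotify.max_user_instances = 1024
```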
=== Testcase
1) Upgrade LXD
2) Spawn about 50 Ubuntu 16.04 containers ("lxc launch ubuntu:16.04")
3) Check that they all get an IP address ("lxc list"); that's a pretty good sign that they booted properly
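The testcase steps above can be scripted roughly as follows. The container name prefix "test-" is an invention for this sketch, and with DRY_RUN=1 (the default here) the commands are only printed, so the script is safe to read through on a machine without LXD:

```shell
#!/bin/sh
# Sketch of the SRU testcase: launch 50 Ubuntu 16.04 containers,
# then list them to confirm each got an IP address.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

i=1
while [ "$i" -le 50 ]; do
    run lxc launch ubuntu:16.04 "test-$i"
    i=$((i + 1))
done

# Every container should show an IPv4 address once booted.
run lxc list
```

Run with DRY_RUN=0 on a host with LXD installed to actually perform the launches.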
=== Regression potential
Not expecting anything here. Juju has shipped a similar configuration for a while now, and so have the LXD feature releases.
We pretty much just forgot to include this particular change in our
LTS packaging branch.
== Original bug report
Reported by Uros Jovanovic here: https://bugs.launchpad.net/juju-core/+bug/1593828/comments/18
"...
However, if you bootstrap LXD and do:
juju bootstrap localxd lxd --upload-tools
for i in {1..30}; do juju deploy ubuntu ubuntu$i; sleep 90; done
Somewhere between the 10th and 20th deploy it fails with the machine
in pending state (nothing useful in the logs), and none of the new
deploys after that first pending one succeeds. Might be a different
bug, but it's easy to verify by running that for loop.
So, this particular error was not in my logs, but the controller still
ends up unable to provision at least 30 machines ..."
I can reproduce this. Looking on the failed machine I can see that
jujud isn't running, which is why juju considers the machine not up,
and in fact nothing of juju seems to be installed. There's nothing
about juju in /var/log.
Comparing cloud-init-output.log between a stuck-pending machine and
one which has started up fine, they both start with some
key-generation messages, but the successful machine then has the line:
Cloud-init v. 0.7.7 running 'init' at Tue, 12 Jul 2016 08:32:00 +0000.
Up 4.0 seconds.
...and then a whole lot of juju-installation gubbins, while the failed
machine log just stops.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/lxd/+bug/1602192/+subscriptions