touch-packages team mailing list archive

[Bug 1347859] [NEW] Introduction of Predictable Network Interface Names (aka biosdevname) breaks working systems

 

Public bug reported:

Relatively recent Linux distribution upgrades have been causing
computers' ethernet devices to be unexpectedly renamed. While I
understand that consistent device naming solves problems on some systems
(mostly multi-NIC servers and a few specialty embedded devices),
unilaterally forcing these changes on everyone is causing a lot of
frustration. Here are some of the problems I've encountered:


Interface names that were easily recognized as abbreviations for their device type have been replaced by cryptic names that have no obvious meaning whatsoever. It's easy to guess that eth0 is short for ethernet #0. What the heck is p4p1 supposed to mean? How is a human supposed to guess that the first p stands for "PCI slot", that the second p stands for "port number", and that the whole mysterious string represents an ethernet interface? This new naming convention is inferior to the old one in at least one significant respect: it makes things more difficult to understand.
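
For reference, the scheme can at least be decoded mechanically. The sketch below is illustrative only: it assumes the biosdevname convention as I understand it (em<N> for an onboard port, p<slot>p<port> for an add-in PCI card), the function name describe_biosdevname is made up for this example, and suffixes for virtual functions or partitions are ignored.

```python
import re

# Patterns assumed from the biosdevname naming convention;
# they are not taken from the biosdevname source code.
_ONBOARD = re.compile(r"^em(\d+)$")
_PCI = re.compile(r"^p(\d+)p(\d+)$")


def describe_biosdevname(name):
    """Return a human-readable guess at what a biosdevname-style name means."""
    m = _ONBOARD.match(name)
    if m:
        return "onboard (embedded) port {}".format(int(m.group(1)))
    m = _PCI.match(name)
    if m:
        return "PCI slot {}, port {}".format(int(m.group(1)), int(m.group(2)))
    return "not a biosdevname-style name"


for example in ("p4p1", "em1", "eth0"):
    print(example, "->", describe_biosdevname(example))
```

That a lookup like this is needed at all is the point: nothing in "p4p1" says "ethernet" the way "eth0" does.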

One of the more useful examples of consistency that unix-like systems
have enjoyed for decades has been thrown out: the extremely well-known
ethernet device names. This creates yet another hurdle for users and
admins when switching between different operating systems or trying to
apply general-purpose unix knowledge.

A lot of documentation has been broken. I have no idea how many manuals,
forum posts, bug reports, printed instructions, email messages, personal
notes, books, and other forms of documentation in the world refer to a
unix ethernet device as eth0, but I'll bet the number is huge. All that
valuable guidance has just been rendered misleading or even useless to
anyone who doesn't keep up with the latest distribution-specific device
naming experiments; in other words: the people who need it most.

Well-established workflows have been broken. The change trips up users
and admins who have for years been getting tasks done quickly with
commands that they could recall and execute without a second thought.
They are suddenly finding that their workflows no longer work. This
interrupts tasks that should have been quick and easy, forcing people
to figure out why known-good procedures are broken, think about how to
modify their memorized commands to work on the affected systems, and
train their fingers to type those new commands as quickly as they did
the old ones. Beyond being irritating, it can eat up a bunch of time
that some of us don't have to spare.

Working systems have been broken. Tools and automation scripts,
especially those developed for site-specific use, often make the
difference between a computer that does real work and a useless generic
OS installation. Sometimes they even make the difference between a
malfunctioning headless box that can be fixed over the network and an
expensive brick. It is quite common for such software to make some
minimal assumptions about its runtime environment, like assuming that
the name of the only network device that has ever been or will ever be
present will not suddenly change after being stable for months or years.
There are also applications (e.g. Matlab) and configuration files (e.g.
smb.conf, dhclient.conf, isc-dhcp-server) that might depend on
references to eth0. Renaming a critical and ubiquitous device like this
is so very likely to cause problems that it should never, ever be done
in an upgrade without the admin's explicit consent.
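
As an illustration of what such site-specific tools now have to do instead of simply saying "eth0", here is a minimal sketch that looks the interface up at run time. It assumes a Linux sysfs layout in which hardware-backed interfaces have a "device" entry under /sys/class/net/<name>; the function names physical_interfaces and only_interface are hypothetical, chosen for this example.

```python
import os


def physical_interfaces(sysfs_net="/sys/class/net"):
    """List interfaces backed by a hardware device, sorted by name."""
    if not os.path.isdir(sysfs_net):
        # Not a Linux system (or sysfs is not mounted): nothing to report.
        return []
    names = []
    for name in sorted(os.listdir(sysfs_net)):
        # Hardware NICs expose a "device" entry; lo and bridges do not.
        if os.path.exists(os.path.join(sysfs_net, name, "device")):
            names.append(name)
    return names


def only_interface(sysfs_net="/sys/class/net"):
    """Return the single hardware interface, or fail loudly if ambiguous."""
    names = physical_interfaces(sysfs_net)
    if len(names) != 1:
        raise RuntimeError("expected exactly one NIC, found: %r" % (names,))
    return names[0]


if __name__ == "__main__":
    print(physical_interfaces())
```

Every script, and every configuration file that cannot run code, needs its own equivalent of this, which is exactly the kind of rework an upgrade should not impose silently.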

Sufficient warning of the change was not given. On one of my machines,
eth0 was renamed to p4p1 when I upgraded to Ubuntu 14.04 (trusty), yet I
don't see any mention of it in the Trusty release notes, nor in any of
the notes for releases of the previous several years. Is it buried in
fine print someplace that I missed? Having to figure out for myself what
changed, why, and how to revert it (in multiple ways on each machine)
was a significant waste of my time. Multiply that by all the other
people who were affected similarly, and I'll bet we'd get an
embarrassing number of needlessly wasted person-months that could have
been saved with a simple announcement and link to documentation.


In short, the way this feature was forced on the world was an irresponsible blunder. It doesn't matter that the change was meant to address some other problem. Breaking working systems is far worse than allowing that problem to remain until someone opts in to a fix. This concept is so important that anyone who doesn't get it really has no business committing code to an operating system used by so many other people.

I am filing this bug report against multiple projects because more than
one is now overriding the kernel's device names, because each project
must be reconfigured in a different way in order to disable this
behavior, and because I believe the overall failure here lies not only
in careless feature implementation but also in careless deployment. If
the people involved in developing, integrating, testing, distributing,
and using this new behavior had all been talking with each other,
perhaps they would have coordinated better when tinkering with something
upon which others have relied for longer than many coders have been
alive.

Here are some suggestions that might have mitigated the mess caused by
eth0 renaming:


It should be disabled by default for upgrades. (Don't break working systems.)

It could be disabled by default for systems that have only one network
device. (Fixing workstations when their eth0 was renamed to p4p1 was
more than enough hassle. I do not look forward to them changing to p1p1
when their motherboards are upgraded.)

It should not override custom udev rules. (People who added or edited
/etc/udev/rules.d/70-persistent-net.rules did so for a reason.)

It should be implemented in only one place. (On some distributions, it
seems systemd and biosdevname are both doing the job, and both must be
reconfigured in order to disable it.)

It could generate names that are as understandable as the ones being
replaced. (I can easily guess that "eth" is ethernet. What the heck is
"p"?)

It should have come with sufficient warning and documentation on how to
disable it. (Where are the release notes?)
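
To show what "implemented in more than one place" means in practice, here is a rough sketch that reports whether each layer has been switched off at boot. It assumes, to the best of my knowledge, that the kernel command-line opt-outs are biosdevname=0 for biosdevname and net.ifnames=0 for the systemd/udev naming; the function names are made up for this example, and it does not look at any other way of disabling the behavior (such as overriding udev rules).

```python
def naming_opt_outs(cmdline):
    """Report which renaming opt-outs appear on a kernel command line."""
    params = cmdline.split()
    return {
        # Assumed opt-out parameter for biosdevname.
        "biosdevname": "biosdevname=0" in params,
        # Assumed opt-out parameter for systemd/udev predictable names.
        "systemd/udev": "net.ifnames=0" in params,
    }


def read_cmdline(path="/proc/cmdline"):
    """Return the running kernel's command line, or "" if unavailable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    for layer, disabled in sorted(naming_opt_outs(read_cmdline()).items()):
        state = "renaming disabled" if disabled else "no opt-out found"
        print("{}: {}".format(layer, state))
```

Having to check two separate switches to restore one old behavior is the duplication complained about above.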


I hope that those who read this report will genuinely try to understand the significance of the trouble that has been (and continues to be) caused here, rather than responding with the same stubborn arrogance that sometimes shows up when a developer's changes are criticized by his users. I think this can be done right, but it hasn't been so far. Solving one problem by creating another set of problems is not exactly a win.  I'm sure that the open source community can do better than this.  Thank you for your attention.

** Affects: biosdevname (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: systemd (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: udev (Ubuntu)
     Importance: Undecided
         Status: New

** Also affects: biosdevname (Ubuntu)
   Importance: Undecided
       Status: New

** Also affects: udev (Ubuntu)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1347859

Title:
  Introduction of Predictable Network Interface Names (aka biosdevname)
  breaks working systems

Status in “biosdevname” package in Ubuntu:
  New
Status in “systemd” package in Ubuntu:
  New
Status in “udev” package in Ubuntu:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/biosdevname/+bug/1347859/+subscriptions

