sts-sponsors team mailing list archive

[Bug 1847924] Re: Introduce broken state parsing to mdadm

 

** Description changed:

  [Impact]
  
  * Currently, mounted raid0/md-linear arrays give no indication or
  warning when one or more members are removed or hit a non-recoverable
  error condition. The mdadm tool shows the "clean" state even if a
  member was removed.
  
  * The patch proposed in this SRU addresses this issue by introducing a
  new state "broken", which is analogous to "clean" but indicates that
  the array is not in a good/correct state. The commit, available
  upstream as 43ebc910 ("mdadm: Introduce new array state 'broken' for
  raid0/linear") [0], was discussed extensively and reviewed by both the
  current mdadm maintainer and a former maintainer.
  
  * One important note here is that this patch requires a kernel
  counterpart to be fully functional, which was SRUed in LP: #1847773.
  Without that kernel counterpart, the patched mdadm still works fine and
  behaves transparently.
  
  [Test case]
  
  * To test this patch, create a raid0 or linear md array on Linux using
  mdadm, for example: "mdadm --create md0 --level=0 --raid-devices=2
  /dev/nvme0n1 /dev/nvme1n1";
  
  * Format the array using a filesystem of your choice (for example
  ext4) and mount the array;
  
  * Remove one member of the array, for example using the sysfs
  interface (for nvme: echo 1 > /sys/block/nvme0n1/device/device/remove,
  for scsi: echo 1 > /sys/block/sdX/device/delete);
  
  * Without this patch, the array state shown by "mdadm --detail" is
  "clean", even though a member is missing/failed. With the patch (and
  the kernel counterpart from LP: #1847773), the state is shown as
  "broken".
  
  [Regression potential]
  
  * There's not much potential for regression here; we only report an
  array's state as "broken" if it has one or more missing/failed
  members. We believe the most common "issue" that could be reported
  from this patch is a userspace tool that relies on the array status
  always being "clean", even for broken devices; such a tool may behave
  differently with this patch.
  
  * Note that we *proactively* skipped the Xenial SRU here in order to
  prevent potential regressions. The Xenial mdadm tool lacks the code
  infrastructure used by this patch, so for safety/stability we are
  SRUing only the Bionic / Disco / Eoan mdadm versions.
  
  [0]
  https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=43ebc910
+ 
+ [Other info]
+ 
+ As mdadm for Focal hasn't been merged yet, this patch will need to be
+ added there during or after the merge.

-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1847924

Title:
  Introduce broken state parsing to mdadm

Status in mdadm package in Ubuntu:
  In Progress
Status in mdadm source package in Bionic:
  In Progress
Status in mdadm source package in Disco:
  In Progress
Status in mdadm source package in Eoan:
  In Progress
Status in mdadm source package in Focal:
  In Progress
Status in mdadm package in Debian:
  Unknown

Bug description:
  [Impact]

  * Currently, mounted raid0/md-linear arrays give no indication or
  warning when one or more members are removed or hit a non-recoverable
  error condition. The mdadm tool shows the "clean" state even if a
  member was removed.

  * The patch proposed in this SRU addresses this issue by introducing a
  new state "broken", which is analogous to "clean" but indicates that
  the array is not in a good/correct state. The commit, available
  upstream as 43ebc910 ("mdadm: Introduce new array state 'broken' for
  raid0/linear") [0], was discussed extensively and reviewed by both the
  current mdadm maintainer and a former maintainer.

  * One important note here is that this patch requires a kernel
  counterpart to be fully functional, which was SRUed in LP: #1847773.
  Without that kernel counterpart, the patched mdadm still works fine and
  behaves transparently.

  [Test case]

  * To test this patch, create a raid0 or linear md array on Linux using
  mdadm, for example: "mdadm --create md0 --level=0 --raid-devices=2
  /dev/nvme0n1 /dev/nvme1n1";

  * Format the array using a filesystem of your choice (for example
  ext4) and mount the array;

  * Remove one member of the array, for example using the sysfs
  interface (for nvme: echo 1 > /sys/block/nvme0n1/device/device/remove,
  for scsi: echo 1 > /sys/block/sdX/device/delete);

  * Without this patch, the array state shown by "mdadm --detail" is
  "clean", even though a member is missing/failed. With the patch (and
  the kernel counterpart from LP: #1847773), the state is shown as
  "broken".
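
The final check in the test case can be scripted. The following is only a minimal sketch: it assumes the "State :" line layout that "mdadm --detail" prints, the helper name md_state_from_detail is hypothetical (it is not part of mdadm), and /dev/md0 stands in for whatever device node the array actually gets.

```shell
# Print the value of the "State :" line found in `mdadm --detail` output
# supplied on stdin. md_state_from_detail is a hypothetical helper name.
md_state_from_detail() {
    awk -F' : ' '
        /^[ \t]*State :/ {
            sub(/[ \t]+$/, "", $2)   # drop trailing whitespace after the value
            print $2
            exit                     # only the first State line is relevant
        }'
}

# Usage (needs root and an existing array; /dev/md0 is an assumed name):
#   mdadm --detail /dev/md0 | md_state_from_detail
```

Reading from stdin keeps the helper independent of the device name, so the same function can be run before and after removing a member to compare the reported states.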

  [Regression potential]

  * There's not much potential for regression here; we only report an
  array's state as "broken" if it has one or more missing/failed
  members. We believe the most common "issue" that could be reported
  from this patch is a userspace tool that relies on the array status
  always being "clean", even for broken devices; such a tool may behave
  differently with this patch.

  * Note that we *proactively* skipped the Xenial SRU here in order to
  prevent potential regressions. The Xenial mdadm tool lacks the code
  infrastructure used by this patch, so for safety/stability we are
  SRUing only the Bionic / Disco / Eoan mdadm versions.
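
A userspace tool of the kind mentioned above could handle the new state explicitly instead of assuming "clean". This is a hedged sketch, not taken from the patch: the helper name classify_array_state is hypothetical, md0 is an assumed array name, it assumes the kernel counterpart exposes "broken" through the md array_state sysfs attribute, and treating only "clean" and "active" as healthy is a simplification chosen for illustration.

```shell
# Map an md array state string to an exit status:
#   0 = healthy (clean/active), 1 = broken, 2 = any other state.
# classify_array_state is a hypothetical helper name.
classify_array_state() {
    case "$1" in
        clean|active) return 0 ;;
        broken)       return 1 ;;
        *)            return 2 ;;   # e.g. inactive or readonly: left to the caller
    esac
}

# Usage (md0 is an assumed array name):
#   classify_array_state "$(cat /sys/block/md0/md/array_state)"
```

Returning a distinct status for "broken" lets a monitoring script react to a missing member without changing how it treats states it already knows.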

  [0]
  https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=43ebc910

  [Other info]

  As mdadm for Focal hasn't been merged yet, this patch will need to be
  added there during or after the merge.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/1847924/+subscriptions