← Back to team overview

curtin-dev team mailing list archive

[Merge] ~dbungert/curtin:lp-2061073 into curtin:master

 

Dan Bungert has proposed merging ~dbungert/curtin:lp-2061073 into curtin:master.

Commit message:
do not squash

Requested reviews:
  curtin developers (curtin-dev)
Related bugs:
  Bug #2061073 in curtin (Ubuntu): "When setting up RAID installation, curtin failed on/after wiping RAID block device backed by IMSM"
  https://bugs.launchpad.net/ubuntu/+source/curtin/+bug/2061073

For more details, see:
https://code.launchpad.net/~dbungert/curtin/+git/curtin/+merge/469319

In the crash logs of LP: #2061073, we can see that after we wipefs the 'disk' (raid), some data we were trying to read about the partitions is missing.  Two things appear incorrect:

* the sysfs reader knows that file entries might be missing, but attempts to proceed anyhow, so most or all the time we have those missing files we're going to crash like in this bug.  Handle that more smoothly by logging the scenario
* prior to that, quick_zero() attempts to do the recursive wipe by wiping the disk, then the partitions, but that means we're racing the kernel recognizing that these partitions have vanished so attempting to read info about the about-to-disappear partitions is likely unreliable
* new integration test added attempting to produce this case
-- 
Your team curtin developers is requested to review the proposed merge of ~dbungert/curtin:lp-2061073 into curtin:master.
diff --git a/curtin/block/__init__.py b/curtin/block/__init__.py
index 2e19467..39cf1cc 100644
--- a/curtin/block/__init__.py
+++ b/curtin/block/__init__.py
@@ -1000,21 +1000,28 @@ def sysfs_partition_data(blockdev=None, sysfs_path=None):
         sysfs_prefix = sys_block_path(parent)
         partnum = int(partnum)
 
+    keys = {'partition', 'start', 'size'}
     ptdata = []
     for part_sysfs in get_sysfs_partitions(sysfs_prefix):
         data = {}
-        for sfile in ('partition', 'start', 'size'):
+        for sfile in keys:
             dfile = os.path.join(part_sysfs, sfile)
             if not os.path.isfile(dfile):
                 continue
             data[sfile] = int(util.load_file(dfile))
         if partnum is None or data['partition'] == partnum:
-            ptdata.append((
-                path_to_kname(part_sysfs),
-                data['partition'],
-                data['start'] * SECTOR_SIZE_BYTES,
-                data['size'] * SECTOR_SIZE_BYTES,
-                ))
+            if data.keys() == keys:
+                ptdata.append((
+                    path_to_kname(part_sysfs),
+                    data['partition'],
+                    data['start'] * SECTOR_SIZE_BYTES,
+                    data['size'] * SECTOR_SIZE_BYTES,
+                    ))
+            else:
+                LOG.debug(
+                    "sysfs_partition_data: "
+                    "skipping {part_sysfs} - incomplete sysfs read"
+                )
 
     return ptdata
 
@@ -1231,7 +1238,6 @@ def quick_zero(path, partitions=True, exclusive=True):
     if this is a block device and partitions is true, then
     zero 1M at front and end of each partition.
     """
-    util.subp(['wipefs', '--all', '--force', path])
     buflen = 1024
     count = 1024
     zero_size = buflen * count
@@ -1257,6 +1263,8 @@ def quick_zero(path, partitions=True, exclusive=True):
             zero_file_at_offsets,
             path, offsets, buflen=buflen, count=count, exclusive=exclusive)
 
+    util.subp(['wipefs', '--all', '--force', path])
+
 
 def zero_file_at_offsets(path, offsets, buflen=1024, count=1024, strict=False,
                          exclusive=True):
diff --git a/tests/integration/test_block_meta.py b/tests/integration/test_block_meta.py
index f1c86ee..c8113b3 100644
--- a/tests/integration/test_block_meta.py
+++ b/tests/integration/test_block_meta.py
@@ -1384,3 +1384,18 @@ table-length: 256'''.encode()
         self.assertPartitions(
             PartData(number=1, offset=1 << 20, size=1 << 20, boot=False,
                      partition_type='E3C9E316-0B5C-4DB8-817D-F92DF00215AE'))
+
+    def test_quick_zero_loop(self):
+        """ attempt to provoke ordering problems in partition wiping with
+            superblock-recursive """
+        self.img = self.tmp_path('image.img')
+        config = StorageConfigBuilder(version=2)
+        config.add_image(path=self.img, create=True, size='20M',
+                         ptable='gpt', wipe='superblock-recursive')
+        parts = []
+        for i in range(1, 19):
+            config.add_part(number=i, offset=i << 20, size=1 << 20)
+            parts.append(PartData(number=1, offset=i << 20, size=1 << 20))
+        for run in range(5):
+            self.run_bm(config.render())
+            self.assertPartitions(*parts)

Follow ups