kernel-packages team mailing list archive
Message #00491
[Bug 1202994] Re: EXT4 filesystem corruption with uninit_bg and error=continue
** Changed in: linux (Ubuntu)
Importance: Undecided => High
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1202994
Title:
EXT4 filesystem corruption with uninit_bg and error=continue
Status in “linux” package in Ubuntu:
Confirmed
Bug description:
There was a long and complicated sequence of activities involving
mdadm, lvm, and specifically pvmove leading up to the point where the
corruption was discovered, but I suspect most were irrelevant. AFAICT,
the bug was triggered by the following simple operations:
* the FS was unmounted & remounted -- thus, the journal was fresh and hadn't wrapped (which other reports appear to indicate would have prevented the bug showing up)
* the FS options include uninit_bg AND errors=continue
* a bunch of files were then copied onto the FS -- this was the last write operation on the FS.
Later, e2fsck indicated a bunch of problems, including corrupted group
descriptors. Specifically, it found that many blocks were now claimed
by two files; in each case, one was an old file and one was one of
those newly copied, and the contents matched the expected data for
the latter.
So I think this starts with an instance of the miscalculation of
checksums in uninit_bg blocks (fixed by Ted Ts'o last June), followed
by the (invalid or uninitialised) bitmap being used anyway (because
errors=continue) and the blocks it appeared to show as free then being
allocated to new files.
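To make the suspected sequence concrete, here is a toy model of it in Python. It is only a sketch of my hypothesis, not kernel code: the `allocate` function, its `on_error` parameter, and the block numbers are all made up for illustration. It shows how an allocator that carries on after a bitmap fails verification can hand out blocks that an old file already owns.

```python
# Toy model only: none of these names exist in the kernel (all hypothetical).

def allocate(bitmap_free, count, bitmap_valid, on_error="continue"):
    """Pick `count` block numbers that the given bitmap reports as free."""
    if not bitmap_valid and on_error != "continue":
        # Any stricter policy refuses to trust a bitmap that failed its check.
        raise RuntimeError("bitmap failed verification; refusing to allocate")
    # With on_error="continue" the suspect bitmap is used as if it were good.
    return sorted(bitmap_free)[:count]

group_blocks = set(range(8, 16))      # blocks in one (tiny) block group
old_file_blocks = {10, 11, 12}        # already owned by an old file
stale_free = set(group_blocks)        # invalid bitmap: everything looks free

new_file_blocks = set(allocate(stale_free, 5, bitmap_valid=False))
shared = old_file_blocks & new_file_blocks
print("blocks claimed by two files:", sorted(shared))
```

In this model the shared blocks end up holding the new file's data, which matches what e2fsck reported; with any other `on_error` value the allocation is refused instead.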
Jul 15 18:01:03 redshift kernel: [ 9332.021245] EXT4-fs error (device dm-1): ext4_mb_generate_buddy:739: group 2968, 8105 clusters in bitmap, 0 in gd
...
Jul 16 18:05:14 redshift kernel: [95982.560034] EXT4-fs (dm-1): error count: 1
Jul 16 18:05:14 redshift kernel: [95982.560044] EXT4-fs (dm-1): initial error at 1373907663: ext4_mb_generate_buddy:739
Jul 16 18:05:14 redshift kernel: [95982.560053] EXT4-fs (dm-1): last error at 1373907663: ext4_mb_generate_buddy:739
...
Jul 16 20:53:19 redshift kernel: [106068.077526] EXT4-fs (dm-1): ext4_check_descriptors: Checksum for group 0 failed (47831!=4825)
Jul 16 20:53:19 redshift kernel: [106068.077540] EXT4-fs (dm-1): ext4_check_descriptors: Checksum for group 1 failed (14670!=8882)
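For anyone wanting to cross-check these log lines, the following Python sketch pulls the numbers out of the ext4_mb_generate_buddy message and converts the "initial error at" value to a date. The regular expression and the field names `bitmap_clusters` and `gd_clusters` are my own, based only on the message text above; I read the "initial error at" number as seconds since the Unix epoch.

```python
import re
from datetime import datetime, timezone

# Pattern written against the message text quoted above (an assumption,
# not taken from the kernel source).
BUDDY_RE = re.compile(
    r"EXT4-fs error \(device (?P<dev>[^)]+)\): ext4_mb_generate_buddy:\d+: "
    r"group (?P<group>\d+), (?P<bitmap>\d+) clusters in bitmap, "
    r"(?P<gd>\d+) in gd"
)

def parse_buddy_error(line):
    """Return the device, group and both cluster counts, or None if no match."""
    m = BUDDY_RE.search(line)
    if m is None:
        return None
    return {
        "dev": m["dev"],
        "group": int(m["group"]),
        "bitmap_clusters": int(m["bitmap"]),
        "gd_clusters": int(m["gd"]),
    }

line = ("Jul 15 18:01:03 redshift kernel: [ 9332.021245] EXT4-fs error "
        "(device dm-1): ext4_mb_generate_buddy:739: group 2968, "
        "8105 clusters in bitmap, 0 in gd")
info = parse_buddy_error(line)
print(info)

# "initial error at 1373907663", interpreted as seconds since the epoch.
first_error_utc = datetime.fromtimestamp(1373907663, tz=timezone.utc)
print(first_error_utc.isoformat())
```

The converted time is 17:01:03 UTC on 15 July 2013, which lines up with the 18:01:03 syslog entry if the machine's clock was on UTC+1 (an assumption on my part, consistent with British Summer Time), so the "initial error" and "last error" both appear to refer to that single ext4_mb_generate_buddy event.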
I see that in an astonishing display of synchronicity, Darrick J Wong
filed a patch at 17 Jul 2013 04:02 -- the very next day, or maybe
even the same day, depending on timezone -- to prevent the knock-on
effects (see "[PATCH] ext4: Prevent massive fs corruption if verifying
the block bitmap fails" at
http://permalink.gmane.org/gmane.comp.file-systems.ext4/39535 ).
But what puzzles me is that the initial triggering bug is still in
this kernel (vmlinuz-3.2.0-49-generic), when according to this
conversation https://bugzilla.kernel.org/show_bug.cgi?id=42723#c8 the
fix was backported to 3.2.20. Is it possible that there is another way
of getting the "ext4_mb_generate_buddy:739" error?
I have kept an e2image dump of the corrupted FS in case it's of any
use to EXT4 developers, but it's not attached, as even in QCOW2 format
it's ~1 GB.
ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-49-generic 3.2.0-49.75
ProcVersionSignature: Ubuntu 3.2.0-49.75-generic 3.2.46
Uname: Linux 3.2.0-49-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 2.0.1-0ubuntu17.3
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC1: dsg 7005 F.... pulseaudio
/dev/snd/controlC0: dsg 7005 F.... pulseaudio
CRDA:
country AW:
(2402 - 2482 @ 40), (N/A, 20)
(5170 - 5250 @ 40), (N/A, 20)
(5250 - 5330 @ 40), (N/A, 20), DFS
(5490 - 5710 @ 40), (N/A, 27), DFS
Card0.Amixer.info:
Card hw:0 'SB'/'HDA ATI SB at 0xfe024000 irq 16'
Mixer name : 'Realtek ALC892'
Components : 'HDA:10ec0892,1458a102,00100302'
Controls : 46
Simple ctrls : 21
Card1.Amixer.info:
Card hw:1 'HDMI'/'HDA ATI HDMI at 0xfdefc000 irq 19'
Mixer name : 'ATI RS690/780 HDMI'
Components : 'HDA:1002791a,00791a00,00100000'
Controls : 4
Simple ctrls : 1
Card1.Amixer.values:
Simple mixer control 'IEC958',0
Capabilities: pswitch pswitch-joined penum
Playback channels: Mono
Mono: Playback [on]
Date: Thu Jul 18 19:04:57 2013
HibernationDevice: RESUME=UUID=2ab26064-3b90-475d-b3c2-51a70c2d990a
InstallationMedia: Kubuntu 12.04.1 LTS "Precise Pangolin" - Release amd64 (20120822.2)
MachineType: Gigabyte Technology Co., Ltd. GA-890GPA-UD3H
MarkForUpload: True
ProcEnviron:
LANGUAGE=en_GB
TERM=xterm
PATH=(custom, no user)
LANG=en_GB.UTF-8
SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-49-generic root=/dev/mapper/system-kubuntu ro quiet splash vt.handoff=7
RelatedPackageVersions:
linux-restricted-modules-3.2.0-49-generic N/A
linux-backports-modules-3.2.0-49-generic N/A
linux-firmware 1.79.4
RfKill:
0: phy0: Wireless LAN
Soft blocked: yes
Hard blocked: no
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/23/2010
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: FD
dmi.board.name: GA-890GPA-UD3H
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrFD:bd07/23/2010:svnGigabyteTechnologyCo.,Ltd.:pnGA-890GPA-UD3H:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnGA-890GPA-UD3H:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: GA-890GPA-UD3H
dmi.sys.vendor: Gigabyte Technology Co., Ltd.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1202994/+subscriptions