kernel-packages team mailing list archive

[Bug 1382801] Re: XFS: mount hangs for corrupted filesystem

To: kernel-packages@xxxxxxxxxxxxxxxxxxx
From: Rafael David Tinoco <rafael.tinoco@xxxxxxxxxxxxx>
Date: Wed, 22 Oct 2014 04:10:46 -0000
Reply-to: Bug 1382801 <1382801@xxxxxxxxxxxxxxxxxx>
Sender: bounces@xxxxxxxxxxxxx
For all interested,

The following stack trace appears to be related to this problem as well:

[<ffffffff817557fe>] dump_stack+0x46/0x58
[<ffffffffa028ffdf>] xfs_error_report+0x3f/0x50 [xfs]
[<ffffffffa02aae97>] ? xfs_free_extent+0xd7/0x120 [xfs]
[<ffffffffa02a8496>] xfs_free_ag_extent+0x4b6/0x720 [xfs]
[<ffffffffa02aae97>] xfs_free_extent+0xd7/0x120 [xfs]
[<ffffffffa02e0b00>] xlog_recover_process_efi+0x170/0x1b0 [xfs]
[<ffffffffa02e2156>] xlog_recover_process_efis.isra.11+0x76/0xd0 [xfs]
[<ffffffffa02e69ba>] xlog_recover_finish+0x2a/0xd0 [xfs]
[<ffffffffa02ebb34>] xfs_log_mount_finish+0x34/0x50 [xfs]
[<ffffffffa02a0221>] xfs_mountfs+0x481/0x710 [xfs]
[<ffffffffa02a131d>] ? xfs_mru_cache_create+0x15d/0x1a0 [xfs]
[<ffffffffa02a3707>] xfs_fs_fill_super+0x2c7/0x340 [xfs]
[<ffffffff811cd4a9>] mount_bdev+0x1b9/0x200
[<ffffffffa02a3440>] ? xfs_parseargs+0xb30/0xb30 [xfs]
[<ffffffffa02a16f5>] xfs_fs_mount+0x15/0x20 [xfs]
[<ffffffff811ce123>] mount_fs+0x43/0x1b0
[<ffffffff811e9bf6>] vfs_kern_mount+0x76/0x130
[<ffffffff811eb3a4>] do_new_mount+0xa4/0x1f0
[<ffffffff811ec706>] do_mount+0x216/0x260
[<ffffffff811ecad0>] SyS_mount+0x90/0xe0
[<ffffffff8176ae2d>] system_call_fastpath+0x1a/0x1f

The error observed with it is:

XFS: Internal error XFS_WANT_CORRUPTED_GOTO at line 1602 of file
.../xfs_alloc.c. Caller 0xffffffffa02aae97

Both can be linked to a recent upstream commit that fixes incorrect XFS
behavior when suspending/resuming under medium to heavy workload:

commit 8018ec083c72443cc74fd2d08eb7c5dddc13af53
Author: Brian Foster <bfoster@xxxxxxxxxx>
Date: Tue Sep 9 11:44:46 2014 +1000

xfs: mark all internal workqueues as freezable

Workqueues must be explicitly set as freezable to ensure they are frozen
in the associated part of the hibernation/suspend sequence. Freezing of
workqueues and kernel threads is important to ensure that modifications
are not made on-disk after the hibernation image has been created.
Otherwise, the in-memory state can become inconsistent with what is on
disk and eventually lead to filesystem corruption. We have reports of
free space btree corruptions that occur immediately after restore from
hibernate that suggest the xfs-eofblocks workqueue could be causing
such problems if it races with hibernation.

Mark all of the internal XFS workqueues as freezable to ensure nothing
changes on-disk once the freezer infrastructure freezes kernel threads
and creates the hibernation image.

Signed-off-by: Brian Foster <bfoster@xxxxxxxxxx>
Reported-by: Carlos E. R. <carlos.e.r@xxxxxxxxxxxx>
Reviewed-by: Dave Chinner <dchinner@xxxxxxxxxx>
Signed-off-by: Dave Chinner <david@xxxxxxxxxxxxx>

The commit also makes sure xfs_freeze works as expected (freezing all
internal workqueues).
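
The race described in the commit message can be illustrated with a toy
model. This is only a sketch in Python, not kernel code: the class, the
function names and the block number 42 are all hypothetical, and nothing
here uses a real XFS or workqueue interface. In the kernel itself, a
workqueue is made freezable by passing the WQ_FREEZABLE flag to
alloc_workqueue().

```python
# Toy model only: no real kernel or XFS interfaces are used here.
class ToyFilesystem:
    def __init__(self, free_blocks):
        self.disk_free = set(free_blocks)    # free-space record "on disk"
        self.memory_free = set(free_blocks)  # cached copy "in memory"

    def background_trim(self, block):
        """Hypothetical stand-in for background work that frees a block."""
        self.memory_free.add(block)
        self.disk_free.add(block)


def hibernate_and_resume(fs, freeze_workers):
    """Return True if memory and disk still agree after resume."""
    image = set(fs.memory_free)   # hibernation image of the in-memory state
    if not freeze_workers:
        fs.background_trim(42)    # worker still runs after the image is taken
    fs.memory_free = image        # resume restores the (possibly stale) image
    return fs.memory_free == fs.disk_free
```

With freeze_workers=True the two views stay identical. With
freeze_workers=False the disk has changed behind the restored image, which
is the kind of inconsistency the commit message warns about.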

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1382801

Title:
  XFS: mount hangs for corrupted filesystem

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  The following situation was brought to my attention:

  --------
  mount hangs at the following stack:
  crash> bt 2882
  PID: 2882 TASK: ffff88084e75c800 CPU: 7 COMMAND: "mount"
  #0 [ffff880036a73b38] schedule at ffffffff8175e320
  #1 [ffff880036a73bc0] xfs_ail_push_all_sync at ffffffffa02e5478 [xfs]
  #2 [ffff880036a73c30] xfs_log_quiesce at ffffffffa02e0b67 [xfs]
  #3 [ffff880036a73c50] xfs_log_unmount at ffffffffa02e0bb6 [xfs]
  #4 [ffff880036a73c70] xfs_mountfs at ffffffffa029332a [xfs]
  #5 [ffff880036a73ce0] xfs_fs_fill_super at ffffffffa0296707 [xfs]
  #6 [ffff880036a73d20] mount_bdev at ffffffff811cd4a9
  #7 [ffff880036a73db0] xfs_fs_mount at ffffffffa02946f5 [xfs]
  #8 [ffff880036a73dc0] mount_fs at ffffffff811ce123
  #9 [ffff880036a73e10] vfs_kern_mount at ffffffff811e9bf6
  #10 [ffff880036a73e60] do_new_mount at ffffffff811eb3a4
  #11 [ffff880036a73ec0] do_mount at ffffffff811ec706
  #12 [ffff880036a73f20] sys_mount at ffffffff811ecad0
  #13 [ffff880036a73f80] system_call_fastpath at ffffffff8176ae2d
  RIP: 00007f2340eb6c2a RSP: 00007fff25675368 RFLAGS: 00010206
  RAX: 00000000000000a5 RBX: ffffffff8176ae2d RCX: 0000000000000026
  RDX: 0000000000b04c20 RSI: 0000000000b04bf0 RDI: 0000000000b04bd0
  RBP: 00000000c0ed0400 R8: 0000000000b04c70 R9: 0000000000000001
  R10: ffffffffc0ed0400 R11: 0000000000000202 R12: 0000000000b04bf0
  R13: 0000000000b04b50 R14: 0000000000000400 R15: 0000000000000000
  ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b
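
A backtrace like the one above can be reduced to its call chain with a
short script. The following is a minimal sketch in Python that assumes the
frame layout shown here ("#N [stack address] function at address [module]");
the names FRAME_RE and parse_frames are made up for this example and are
not part of the crash utility.

```python
import re

# Matches frames such as:
#   #1 [ffff880036a73bc0] xfs_ail_push_all_sync at ffffffffa02e5478 [xfs]
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+\[(?P<stack>[0-9a-f]+)\]\s+(?P<func>\S+)\s+at\s+"
    r"(?P<addr>[0-9a-f]+)(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(text):
    """Return (frame number, function, module) for each backtrace frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Frames without a trailing [module] belong to the core kernel.
            module = match["module"] or "kernel"
            frames.append((int(match["index"]), match["func"], module))
    return frames
```

Applied to the backtrace above, frames 1 to 4 come out as
xfs_ail_push_all_sync, xfs_log_quiesce, xfs_log_unmount and xfs_mountfs,
all in the xfs module, so the task is sleeping in the log unmount path
called from xfs_mountfs. Register lines such as "RIP: ..." do not match and
are skipped.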

  The corresponding disk is /dev/sdd1; any I/O (xfs_check, etc.) also
  hangs, with the process left in "D" state.

  This is reproducible with both the 3.11 and 3.13 kernels.

  The storage node is out of service because of this problem.
  --------

  I'm still asking for more data (sosreport and kernel dump).
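
While waiting for that data, the tasks stuck in "D" (uninterruptible
sleep) state can be listed from /proc. The sketch below is a minimal Python
example, assuming a Linux /proc filesystem with the usual
"pid (comm) state ..." layout in /proc/<pid>/stat; the function names are
made up for this example.

```python
import os


def parse_stat(stat_line):
    """Return (pid, command, state) from one /proc/<pid>/stat line."""
    # The command name may itself contain spaces or parentheses, so use the
    # first "(" and the last ")" as its delimiters.
    left = stat_line.index("(")
    right = stat_line.rindex(")")
    pid = int(stat_line[:left])
    command = stat_line[left + 1:right]
    state = stat_line[right + 1:].split()[0]
    return pid, command, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (pid, command) for every process currently in "D" state."""
    tasks = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or its stat file is unreadable
        if state == "D":
            tasks.append((pid, command))
    return tasks
```

On the affected node, uninterruptible_tasks() would be expected to include
the hung mount and xfs_check processes.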

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1382801/+subscriptions