group.of.nepali.translators team mailing list archive
-
group.of.nepali.translators team
-
Mailing list archive
-
Message #28243
[Bug 1793901] Re: kernel oops in bcache module
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Also affects: linux (Ubuntu Trusty)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu Xenial)
Importance: Undecided => Medium
** Changed in: linux (Ubuntu Trusty)
Importance: Undecided => Medium
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1793901
Title:
kernel oops in bcache module
Status in linux package in Ubuntu:
Fix Committed
Status in linux source package in Trusty:
New
Status in linux source package in Xenial:
New
Bug description:
SRU Justification
=================
[Impact]
Some users see panics like the following when performing fstrim on a
bcached volume:
[ 529.803060] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 530.183928] #PF error: [normal kernel read fault]
[ 530.412392] PGD 8000001f42163067 P4D 8000001f42163067 PUD 1f42168067 PMD 0
[ 530.750887] Oops: 0000 [#1] SMP PTI
[ 530.920869] CPU: 10 PID: 4167 Comm: fstrim Kdump: loaded Not tainted 5.0.0-rc1+ #3
[ 531.290204] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
[ 531.693137] RIP: 0010:blk_queue_split+0x148/0x620
[ 531.922205] Code: 60 38 89 55 a0 45 31 db 45 31 f6 45 31 c9 31 ff 89 4d 98 85 db 0f 84 7f 04 00 00 44 8b 6d 98 4c 89 ee 48 c1 e6 04 49 03 70 78 <8b> 46 08 44 8b 56 0c 48
8b 16 44 29 e0 39 d8 48 89 55 a8 0f 47 c3
[ 532.838634] RSP: 0018:ffffb9b708df39b0 EFLAGS: 00010246
[ 533.093571] RAX: 00000000ffffffff RBX: 0000000000046000 RCX: 0000000000000000
[ 533.441865] RDX: 0000000000000200 RSI: 0000000000000000 RDI: 0000000000000000
[ 533.789922] RBP: ffffb9b708df3a48 R08: ffff940d3b3fdd20 R09: 0000000000000000
[ 534.137512] R10: ffffb9b708df3958 R11: 0000000000000000 R12: 0000000000000000
[ 534.485329] R13: 0000000000000000 R14: 0000000000000000 R15: ffff940d39212020
[ 534.833319] FS: 00007efec26e3840(0000) GS:ffff940d1f480000(0000) knlGS:0000000000000000
[ 535.224098] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 535.504318] CR2: 0000000000000008 CR3: 0000001f4e256004 CR4: 00000000001606e0
[ 535.851759] Call Trace:
[ 535.970308] ? mempool_alloc_slab+0x15/0x20
[ 536.174152] ? bch_data_insert+0x42/0xd0 [bcache]
[ 536.403399] blk_mq_make_request+0x97/0x4f0
[ 536.607036] generic_make_request+0x1e2/0x410
[ 536.819164] submit_bio+0x73/0x150
[ 536.980168] ? submit_bio+0x73/0x150
[ 537.149731] ? bio_associate_blkg_from_css+0x3b/0x60
[ 537.391595] ? _cond_resched+0x1a/0x50
[ 537.573774] submit_bio_wait+0x59/0x90
[ 537.756105] blkdev_issue_discard+0x80/0xd0
[ 537.959590] ext4_trim_fs+0x4a9/0x9e0
[ 538.137636] ? ext4_trim_fs+0x4a9/0x9e0
[ 538.324087] ext4_ioctl+0xea4/0x1530
[ 538.497712] ? _copy_to_user+0x2a/0x40
[ 538.679632] do_vfs_ioctl+0xa6/0x600
[ 538.853127] ? __do_sys_newfstat+0x44/0x70
[ 539.051951] ksys_ioctl+0x6d/0x80
[ 539.212785] __x64_sys_ioctl+0x1a/0x20
[ 539.394918] do_syscall_64+0x5a/0x110
[ 539.568674] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Fix]
Under certain conditions, the test for whether an operation should be
written back to the underlying device was incorrect. Specifically, in
should_writeback(), we were hitting a case where an optimisation for
partial stripe conditions was returning true and so should_writeback()
was returning true early. This caused the code to go down an incorrect
path and create bios that contained NULL pointers.
To fix this issue, make sure that should_writeback() on a discard op
never returns true.
[Test Case]
We have observed it on some systems where both:
1) LVM/devmapper is involved (bcache backing device is LVM volume) and
2) writeback cache is involved (bcache cache_mode is writeback)
Not every machine exhibits the bug. On one machine that does exhibit
the bug, we can reliably reproduce it with:
# echo writeback > /sys/block/bcache0/bcache/cache_mode
# mount /dev/bcache0 /test
# for i in {0..10}; do file="$(mktemp /test/zero.XXX)"; dd if=/dev/zero of="$file" bs=1M count=256; sync; rm $file; done; fstrim -v /test
[Regression Potential]
This could affect any device where bcache is used.
In mitigation, however: the patch is simple, is limited to considering
discard operations. The patch has been accepted upstream [1] and the
maintainer will be including it in SuSE kernels [2]. A Gentoo user
validated the upstream patch independently [3].
[1] https://www.spinics.net/lists/linux-bcache/msg06997.html
[2] https://www.spinics.net/lists/linux-bcache/msg06998.html
[3] https://bugzilla.kernel.org/show_bug.cgi?id=196103#c3
[Original Description]
This was on an 18.04.1 install running the 4.15-34 generic kernel image, running from a normal ext4 root device.
I had just a short while before created a new bcache device that was mounted but to which no data had been written yet. Then without any apparent particular reason, an apport error popped up to inform of a bcache kernel oops. Crash log was uploaded but no idea how to link it, so I attach it as well.
Mostly I would like to know how concerned I should be as after a previous, successful test I wanted to move the whole install to bcache. Ideally, if this is a bug or similar, it would be nice if it could get fixed.
ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: linux-image-4.15.0-34-generic 4.15.0-34.37
ProcVersionSignature: Ubuntu 4.15.0-34.37-generic 4.15.18
Uname: Linux 4.15.0-34-generic x86_64
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
ApportVersion: 2.20.9-0ubuntu7.3
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Sat Sep 22 18:20:22 2018
HibernationDevice: RESUME=UUID=6bcbe7fa-85b7-4baf-9b69-0558a668bcdd
InstallationDate: Installed on 2014-07-29 (1515 days ago)
InstallationMedia: It
IwConfig:
zthnhe3w6d no wireless extensions.
eth1 no wireless extensions.
lo no wireless extensions.
MachineType: System manufacturer System Product Name
ProcEnviron:
TERM=xterm-256color
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=de_DE.UTF-8
SHELL=/bin/bash
ProcFB: 0 EFI VGA
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-34-generic root=UUID=ebbab625-f14e-44ba-84d5-025ed92a5b2a ro quiet splash
RelatedPackageVersions:
linux-restricted-modules-4.15.0-34-generic N/A
linux-backports-modules-4.15.0-34-generic N/A
linux-firmware 1.173.1
RfKill:
0: hci0: Bluetooth
Soft blocked: yes
Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to bionic on 2018-09-07 (15 days ago)
dmi.bios.date: 10/22/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0604
dmi.board.asset.tag: Default string
dmi.board.name: H170I-PLUS D3
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0604:bd10/22/2015:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnH170I-PLUSD3:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: Default string
dmi.product.name: System Product Name
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1793901/+subscriptions