
group.of.nepali.translators team mailing list archive

[Bug 1793901] Re: kernel oops in bcache module

 

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Trusty)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: linux (Ubuntu Trusty)
   Importance: Undecided => Medium

-- 
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1793901

Title:
  kernel oops in bcache module

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Trusty:
  New
Status in linux source package in Xenial:
  New

Bug description:
  SRU Justification
  =================

  [Impact]

  Some users see panics like the following when performing fstrim on a
  bcached volume:

  [  529.803060] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  [  530.183928] #PF error: [normal kernel read fault]
  [  530.412392] PGD 8000001f42163067 P4D 8000001f42163067 PUD 1f42168067 PMD 0
  [  530.750887] Oops: 0000 [#1] SMP PTI
  [  530.920869] CPU: 10 PID: 4167 Comm: fstrim Kdump: loaded Not tainted 5.0.0-rc1+ #3
  [  531.290204] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015
  [  531.693137] RIP: 0010:blk_queue_split+0x148/0x620
  [  531.922205] Code: 60 38 89 55 a0 45 31 db 45 31 f6 45 31 c9 31 ff 89 4d 98 85 db 0f 84 7f 04 00 00 44 8b 6d 98 4c 89 ee 48 c1 e6 04 49 03 70 78 <8b> 46 08 44 8b 56 0c 48 8b 16 44 29 e0 39 d8 48 89 55 a8 0f 47 c3
  [  532.838634] RSP: 0018:ffffb9b708df39b0 EFLAGS: 00010246
  [  533.093571] RAX: 00000000ffffffff RBX: 0000000000046000 RCX: 0000000000000000
  [  533.441865] RDX: 0000000000000200 RSI: 0000000000000000 RDI: 0000000000000000
  [  533.789922] RBP: ffffb9b708df3a48 R08: ffff940d3b3fdd20 R09: 0000000000000000
  [  534.137512] R10: ffffb9b708df3958 R11: 0000000000000000 R12: 0000000000000000
  [  534.485329] R13: 0000000000000000 R14: 0000000000000000 R15: ffff940d39212020
  [  534.833319] FS:  00007efec26e3840(0000) GS:ffff940d1f480000(0000) knlGS:0000000000000000
  [  535.224098] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  [  535.504318] CR2: 0000000000000008 CR3: 0000001f4e256004 CR4: 00000000001606e0
  [  535.851759] Call Trace:
  [  535.970308]  ? mempool_alloc_slab+0x15/0x20
  [  536.174152]  ? bch_data_insert+0x42/0xd0 [bcache]
  [  536.403399]  blk_mq_make_request+0x97/0x4f0
  [  536.607036]  generic_make_request+0x1e2/0x410
  [  536.819164]  submit_bio+0x73/0x150
  [  536.980168]  ? submit_bio+0x73/0x150
  [  537.149731]  ? bio_associate_blkg_from_css+0x3b/0x60
  [  537.391595]  ? _cond_resched+0x1a/0x50
  [  537.573774]  submit_bio_wait+0x59/0x90
  [  537.756105]  blkdev_issue_discard+0x80/0xd0
  [  537.959590]  ext4_trim_fs+0x4a9/0x9e0
  [  538.137636]  ? ext4_trim_fs+0x4a9/0x9e0
  [  538.324087]  ext4_ioctl+0xea4/0x1530
  [  538.497712]  ? _copy_to_user+0x2a/0x40
  [  538.679632]  do_vfs_ioctl+0xa6/0x600
  [  538.853127]  ? __do_sys_newfstat+0x44/0x70
  [  539.051951]  ksys_ioctl+0x6d/0x80
  [  539.212785]  __x64_sys_ioctl+0x1a/0x20
  [  539.394918]  do_syscall_64+0x5a/0x110
  [  539.568674]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
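
  One possible reading of the trace, offered as a sketch rather than a
  confirmed diagnosis: the faulting bytes "<8b> 46 08" decode to a
  32-bit load from 8(%rsi), RSI is 0, and CR2 is 0x8. That pattern is
  what reading the length field of the first entry of a NULL bio_vec
  array would produce. The struct below is a user-space model; its
  layout is assumed from the mainline struct bio_vec, and the names
  model_bio_vec and bv_len_address are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space model of struct bio_vec; layout assumed from mainline. */
struct model_bio_vec {
    void *bv_page;          /* pointer to the backing page */
    unsigned int bv_len;    /* segment length in bytes */
    unsigned int bv_offset; /* offset of the segment within the page */
};

/* Address touched when reading vec[idx].bv_len for a given array base. */
static uint64_t bv_len_address(uint64_t vec_base, uint64_t idx)
{
    return vec_base + idx * sizeof(struct model_bio_vec)
           + offsetof(struct model_bio_vec, bv_len);
}
```

  With a NULL array base and index 0, bv_len_address(0, 0) equals the
  offset of bv_len, which is the size of one pointer. On x86_64 that is
  8, matching the faulting address in the oops.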

  [Fix]

  Under certain conditions, the test for whether an operation should be
  written back to the underlying device was incorrect. Specifically, in
  should_writeback(), we were hitting a case where an optimisation for
  partial stripe conditions was returning true and so should_writeback()
  was returning true early. This caused the code to go down an incorrect
  path and create bios that contained NULL pointers.

  To fix this issue, make sure that should_writeback() on a discard op
  never returns true.
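
  The decision order described above can be sketched as a small
  user-space model. This is not the kernel source: the types, field
  names and the with_fix switch are hypothetical, and the final checks
  of the real function are collapsed into a single "return true".

```c
#include <assert.h>
#include <stdbool.h>

enum model_op { MODEL_OP_WRITE, MODEL_OP_DISCARD };

/* Hypothetical stand-in for the state should_writeback() inspects. */
struct model_request {
    enum model_op op;
    bool writeback_mode;       /* cache_mode is writeback */
    bool partial_stripe_dirty; /* partial stripe optimisation applies */
    bool would_skip;           /* request would otherwise bypass the cache */
};

static bool model_should_writeback(const struct model_request *r,
                                   bool with_fix)
{
    if (!r->writeback_mode)
        return false;

    /* The fix: a discard must never be treated as a writeback. */
    if (with_fix && r->op == MODEL_OP_DISCARD)
        return false;

    /* Early return that the discard used to fall into. */
    if (r->partial_stripe_dirty)
        return true;

    if (r->would_skip)
        return false;

    return true; /* simplification of the remaining checks */
}
```

  In this model, a discard with partial_stripe_dirty set returns true
  without the fix and false with it, while ordinary writes behave the
  same in both cases.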

  
  [Test Case]

  We have observed it on some systems where both of the following apply:
  1) LVM/devmapper is involved (bcache backing device is LVM volume) and
  2) writeback cache is involved (bcache cache_mode is writeback)

  Not every machine exhibits the bug. On one machine that does exhibit
  the bug, we can reliably reproduce it with:

   # echo writeback > /sys/block/bcache0/bcache/cache_mode
   # mount /dev/bcache0 /test
   # for i in {0..10}; do file="$(mktemp /test/zero.XXX)"; dd if=/dev/zero of="$file" bs=1M count=256; sync; rm "$file"; done; fstrim -v /test
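
  Before running the reproducer it can help to confirm that the cache
  really is in writeback mode. A minimal sketch, assuming that reading
  the cache_mode attribute returns a line such as
  "writethrough [writeback] writearound none" with the active mode in
  square brackets; the helper name active_cache_mode is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Copy the bracketed (active) token of a cache_mode line into out.
 * Returns false if no bracketed token is found or out is too small.
 */
static bool active_cache_mode(const char *line, char *out, size_t outlen)
{
    const char *start = strchr(line, '[');
    const char *end = start ? strchr(start, ']') : NULL;
    size_t len;

    if (!start || !end)
        return false;

    len = (size_t)(end - start - 1);
    if (len + 1 > outlen)
        return false;

    memcpy(out, start + 1, len);
    out[len] = '\0';
    return true;
}
```

  A caller would read the attribute into a buffer, pass it to
  active_cache_mode(), and compare the result against "writeback".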

  
  [Regression Potential]

  This could affect any device where bcache is used.

  In mitigation, however, the patch is simple and only changes how
  discard operations are handled. The patch has been accepted upstream
  [1] and the maintainer will be including it in SuSE kernels [2]. A
  Gentoo user validated the upstream patch independently [3].

  
  [1] https://www.spinics.net/lists/linux-bcache/msg06997.html
  [2] https://www.spinics.net/lists/linux-bcache/msg06998.html
  [3] https://bugzilla.kernel.org/show_bug.cgi?id=196103#c3

  
  [Original Description]

  This was on an 18.04.1 install running the 4.15-34 generic kernel image, running from a normal ext4 root device.
  Shortly before, I had created a new bcache device that was mounted but to which no data had been written yet. Then, for no apparent reason, an apport error popped up reporting a bcache kernel oops. The crash log was uploaded, but I have no idea how to link it, so I am attaching it as well.
  Mostly I would like to know how concerned I should be, because after a previous, successful test I wanted to move the whole install to bcache. If this is a bug or similar, it would be nice if it could get fixed.

  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-34-generic 4.15.0-34.37
  ProcVersionSignature: Ubuntu 4.15.0-34.37-generic 4.15.18
  Uname: Linux 4.15.0-34-generic x86_64
  NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair nvidia_modeset nvidia
  ApportVersion: 2.20.9-0ubuntu7.3
  Architecture: amd64
  CurrentDesktop: ubuntu:GNOME
  Date: Sat Sep 22 18:20:22 2018
  HibernationDevice: RESUME=UUID=6bcbe7fa-85b7-4baf-9b69-0558a668bcdd
  InstallationDate: Installed on 2014-07-29 (1515 days ago)
  InstallationMedia: It
  IwConfig:
   zthnhe3w6d  no wireless extensions.

   eth1      no wireless extensions.

   lo        no wireless extensions.
  MachineType: System manufacturer System Product Name
  ProcEnviron:
   TERM=xterm-256color
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=de_DE.UTF-8
   SHELL=/bin/bash
  ProcFB: 0 EFI VGA
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-34-generic root=UUID=ebbab625-f14e-44ba-84d5-025ed92a5b2a ro quiet splash
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-34-generic N/A
   linux-backports-modules-4.15.0-34-generic  N/A
   linux-firmware                             1.173.1
  RfKill:
   0: hci0: Bluetooth
    Soft blocked: yes
    Hard blocked: no
  SourcePackage: linux
  UpgradeStatus: Upgraded to bionic on 2018-09-07 (15 days ago)
  dmi.bios.date: 10/22/2015
  dmi.bios.vendor: American Megatrends Inc.
  dmi.bios.version: 0604
  dmi.board.asset.tag: Default string
  dmi.board.name: H170I-PLUS D3
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: Rev X.0x
  dmi.chassis.asset.tag: Default string
  dmi.chassis.type: 3
  dmi.chassis.vendor: Default string
  dmi.chassis.version: Default string
  dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0604:bd10/22/2015:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnH170I-PLUSD3:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
  dmi.product.family: Default string
  dmi.product.name: System Product Name
  dmi.product.version: System Version
  dmi.sys.vendor: System manufacturer

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1793901/+subscriptions