kernel-packages team mailing list archive

[Bug 1526811] Re: SRU: walker list corruption while being intensively stressed

 

A better reproducer is to run stress-ng --procfs 0 on a multi-core
machine.  Without the fix, the kernel oopses in less than a second.
With the fix, the stressor runs to completion with no oops.

Tested on 4.2.0-24-generic #29-Ubuntu with a 600 second soak test on
an 8 proc Xeon box:

stress-ng: info:  [3044] successful run completed in 600.23s (10 mins, 0.23 secs)
stress-ng: info:  [3044] stressor      bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info:  [3044]                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [3044] procfs               8    600.00    151.83   4646.48         0.01         0.00
stress-ng: info:  [3044] procfs:


** Tags removed: verification-needed-wily
** Tags added: verification-done-wily

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1526811

Title:
  SRU: walker list corruption while being intensively stressed

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Wily:
  Fix Committed
Status in linux source package in Xenial:
  Fix Committed

Bug description:
  [SRU Justification][Wily]  + [Xenial]

  While stress testing with the stress-ng procfs stressor I hit a walker
  list corruption bug.  This was recently fixed upstream by Herbert Xu;
  the commit message explains:

  The commit ba7c95ea3870fe7b847466d39a049ab6f156aa2c ("rhashtable: Fix
  sleeping inside RCU critical section in walk_stop") introduced a new
  spinlock for the walker list.  However, it did not convert all
  existing users of the list over to the new spin lock.  Some continued
  to use the old mutex for this purpose.  This obviously led to
  corruption of the list.
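
The locking rule described in the commit message can be illustrated with a small user-space sketch. This is only an analogy, not the rhashtable code: the class and function names (WalkerList, register, unregister, churn, run_demo) are hypothetical, and a Python threading.Lock stands in for the kernel spinlock. The point it shows is that every path touching the shared list takes the same lock; a path that took a different lock would not exclude the others.

```python
import threading


class WalkerList:
    """Toy stand-in for a shared walker list (hypothetical, not kernel code)."""

    def __init__(self):
        # One lock guards the list; every reader and writer must take it.
        self._lock = threading.Lock()
        self._walkers = []

    def register(self, walker_id):
        with self._lock:
            self._walkers.append(walker_id)

    def unregister(self, walker_id):
        # The check-then-remove sequence is only safe because it holds the
        # same lock that register() uses.
        with self._lock:
            if walker_id in self._walkers:
                self._walkers.remove(walker_id)

    def count(self):
        with self._lock:
            return len(self._walkers)


def churn(walkers, worker_id, rounds):
    """Repeatedly add and remove one walker from a single thread."""
    for i in range(rounds):
        walker_id = (worker_id, i)
        walkers.register(walker_id)
        walkers.unregister(walker_id)


def run_demo(workers=8, rounds=1000):
    """Start several churning threads and return the final list length."""
    walkers = WalkerList()
    threads = [
        threading.Thread(target=churn, args=(walkers, n, rounds))
        for n in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return walkers.count()
```

Since each thread removes every entry it adds, run_demo() should return 0 when all accesses go through the one lock. The sketch does not attempt to reproduce the kernel corruption itself.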

  [Fix]
  Clean upstream cherry pick of commit c6ff5268293ef98e48a99597e765ffc417e39fa5.
  Will land in Xenial automatically (4.4).

  
  [Testcase]
  Run multiple instances of the attached code on a multi-core system.
  Alternatively, run stress-ng --procfs 0 on a multi-core system.

  With the fix applied, neither reproducer corrupts the list or crashes
  the kernel.
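
For anyone wanting to script the stress-ng variant of the testcase, a minimal sketch follows. It makes several assumptions that are not in the bug report: the --timeout and --metrics-brief options are added to bound the run and print a summary, the helper names (build_command, find_oops_lines, run_soak) are hypothetical, and the marker strings are a rough guess at what an oops or list corruption report looks like in the kernel log. On an unfixed kernel the machine may hang or panic before the script gets to read the log at all.

```python
import subprocess

# Substrings that commonly show up in the kernel log after an oops or a
# list debugging report. This set is an assumption and is not exhaustive.
OOPS_MARKERS = (
    "Oops",
    "BUG:",
    "general protection fault",
    "list_add corruption",
    "list_del corruption",
)


def build_command(instances=0, timeout_s=600):
    """Return the stress-ng argv for the procfs stressor.

    instances=0 asks stress-ng to start one stressor per online CPU.
    """
    return [
        "stress-ng",
        "--procfs", str(instances),
        "--timeout", f"{timeout_s}s",
        "--metrics-brief",
    ]


def find_oops_lines(kernel_log):
    """Return the kernel log lines that contain any of the markers."""
    return [
        line
        for line in kernel_log.splitlines()
        if any(marker in line for marker in OOPS_MARKERS)
    ]


def run_soak(instances=0, timeout_s=600):
    """Run the stressor, then scan dmesg for suspicious lines.

    Needs stress-ng installed and permission to read the kernel log.
    """
    result = subprocess.run(build_command(instances, timeout_s), check=False)
    log = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    ).stdout
    return result.returncode, find_oops_lines(log)
```

Calling run_soak() with the defaults corresponds to a 600 second soak test; a zero return code together with an empty list of matching log lines would suggest the run was clean.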

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1526811/+subscriptions