linaro-release team mailing list archive

[Bug 709245] Re: ARM SMP scheduler performance bug

 

I tried to gather more data on this and made several runs of "hdparm -Tt" for the good and bad case:
- For the good case (nosmp)
  - cached reads avg. 184.61MB/sec (MIN=181.02 MAX=186.30 STDDEV=1.62)
  - buffered reads avg. 3.64MB/sec (MIN=3.64 MAX=3.65 STDDEV=0)
- For the bad case
  - cached reads avg. 62.00MB/sec (MIN=21.98 MAX=111.55 STDDEV=27.97)
  - buffered reads avg. 1.35MB/sec (MIN=1.09 MAX=1.86 STDDEV=0.20)
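
A minimal sketch of how summary figures like these could be collected from repeated runs. It assumes the usual "Timing ... reads: ... = N MB/sec" lines printed by hdparm -Tt; the sample text and its values are made up for illustration and are not the runs reported above. It is not stated whether the figures above use the sample or the population standard deviation; the sketch uses the population form.

```python
import re
import statistics

# Hypothetical output of three "hdparm -Tt /dev/sda" runs (illustrative values only).
SAMPLE_OUTPUT = """\
/dev/sda:
 Timing cached reads:   370 MB in  2.00 seconds = 185.00 MB/sec
 Timing buffered disk reads:  12 MB in  3.30 seconds =   3.64 MB/sec
/dev/sda:
 Timing cached reads:   366 MB in  2.00 seconds = 183.00 MB/sec
 Timing buffered disk reads:  12 MB in  3.29 seconds =   3.65 MB/sec
/dev/sda:
 Timing cached reads:   362 MB in  2.00 seconds = 181.00 MB/sec
 Timing buffered disk reads:  12 MB in  3.30 seconds =   3.64 MB/sec
"""

# Captures the kind of read and the reported throughput from each timing line.
RATE_RE = re.compile(r"Timing (cached|buffered disk) reads:.*=\s*([\d.]+)\s*MB/sec")


def collect_rates(text):
    """Return the MB/sec values per read type, in the order they appear."""
    rates = {"cached": [], "buffered disk": []}
    for kind, value in RATE_RE.findall(text):
        rates[kind].append(float(value))
    return rates


def summarise(values):
    """Average, minimum, maximum and population standard deviation."""
    return {
        "avg": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "stddev": statistics.pstdev(values),
    }


for kind, values in collect_rates(SAMPLE_OUTPUT).items():
    s = summarise(values)
    print(f"{kind} reads avg. {s['avg']:.2f}MB/sec "
          f"(MIN={s['min']:.2f} MAX={s['max']:.2f} STDDEV={s['stddev']:.2f})")
```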

So not only is the SMP case slower, there is also a high variation in
the numbers. Whatever is happening, it is not a linear slowdown.
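
To put a number on "high variation", one option is to relate each standard deviation to its average (the coefficient of variation) and to compute the slowdown factor between the two cases. The sketch below only does arithmetic on the averages and standard deviations quoted above; the dictionary layout and function names are my own.

```python
# Figures copied from the hdparm summary above: (average, stddev) in MB/sec.
REPORTED = {
    "cached": {"nosmp": (184.61, 1.62), "smp": (62.00, 27.97)},
    "buffered": {"nosmp": (3.64, 0.0), "smp": (1.35, 0.20)},
}


def coefficient_of_variation(avg, stddev):
    """Standard deviation expressed as a fraction of the average."""
    return stddev / avg


def slowdown(nosmp_avg, smp_avg):
    """How many times faster the nosmp case is than the SMP case."""
    return nosmp_avg / smp_avg


for kind, cases in REPORTED.items():
    nosmp_avg, nosmp_sd = cases["nosmp"]
    smp_avg, smp_sd = cases["smp"]
    print(f"{kind}: slowdown x{slowdown(nosmp_avg, smp_avg):.2f}, "
          f"CV nosmp {coefficient_of_variation(nosmp_avg, nosmp_sd):.1%}, "
          f"CV smp {coefficient_of_variation(smp_avg, smp_sd):.1%}")
```

For cached reads this works out to a spread of roughly 45% of the average in the SMP case against under 1% with nosmp, with the SMP average about a third of the nosmp one.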

When trying various options, maxcpus=1 has the same effect as nosmp
(disabling a core later does not improve things). There was one option
(which right now slips my memory) that caused a higher rate of timer
interrupts, which led to even worse buffered read performance (as did
disabling the irqbalance daemon). Using nohlt seemed to have no effect.

When looking at the interrupts, the main difference seemed to be that
gp_timer is used when booted with nosmp or maxcpus=1, and LOCal timers
in the other case. Booting with 2 CPUs and disabling one did not change
that. I cannot recall whether the IPIs stopped being incremented
completely or only on the other CPU. But LOC was definitely still used
for CPU#0.
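
For comparing the two boot modes without relying on memory, a small parser over saved copies of /proc/interrupts might help. This is only a sketch: the two sample snapshots below are invented (IRQ number, controller name and all counts are placeholders, not taken from the Panda), and it assumes the common layout of a CPU header row followed by "label: per-CPU counts, description" rows.

```python
# Hypothetical /proc/interrupts snapshots; all numbers are placeholders.
NOSMP_SAMPLE = """\
           CPU0
 37:      52000       GIC  gp_timer
"""

SMP_SAMPLE = """\
           CPU0       CPU1
 37:          0          0       GIC  gp_timer
IPI:       1200       1100
LOC:      50000      48000
"""


def parse_interrupts(text):
    """Map each row label to its per-CPU counts and trailing description."""
    lines = text.strip("\n").splitlines()
    ncpus = len(lines[0].split())  # header row lists one name per CPU
    parsed = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "label: ..." row
        fields = rest.split()
        per_cpu = []
        for field in fields[:ncpus]:
            if not field.isdigit():
                break
            per_cpu.append(int(field))
        description = " ".join(fields[len(per_cpu):])
        parsed[label.strip()] = {"per_cpu": per_cpu, "description": description}
    return parsed


def active_timer_sources(parsed):
    """Names of the timer rows (gp_timer or LOC) with a non-zero total count."""
    active = []
    for label, row in parsed.items():
        if sum(row["per_cpu"]) == 0:
            continue
        if label == "LOC":
            active.append("LOC")
        elif row["description"].endswith("gp_timer"):
            active.append("gp_timer")
    return sorted(active)


print("nosmp:", active_timer_sources(parse_interrupts(NOSMP_SAMPLE)))
print("smp:  ", active_timer_sources(parse_interrupts(SMP_SAMPLE)))
```

Taking two snapshots a few seconds apart and subtracting the per_cpu lists would also show whether the IPI and LOC counters are still incrementing on each CPU after one core has been disabled.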

-- 
You received this bug notification because you are a member of Linaro
Release Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/709245

Title:
  ARM SMP scheduler performance bug

Status in Linaro Ubuntu Evaluation Builds:
  Confirmed
Status in Linaro Linux:
  Confirmed
Status in “linux-ti-omap4” package in Ubuntu:
  Confirmed
Status in “linux-ti-omap4” source package in Maverick:
  Confirmed
Status in “linux-ti-omap4” source package in Natty:
  Confirmed
Status in “linux-ti-omap4” source package in Oneiric:
  Confirmed

Bug description:
  Original Bug name: "panda: USB disk IO slow"

  My Panda's USB seems to be significantly slower than a Beagle C4.

  hdparm shows buffered reads at ~12MB/s on the Panda and ~20-25MB/s on a Beagle C4, using the same
  external LaCie USB disk.

  Kernel is 2.6.37-1002-linaro-omap

  Disk shows as:

  [    5.170440] scsi 0:0:0:0: Direct-Access     LaCie    d2 quadra             PQ: 0 ANSI: 4
  [    5.172546] sd 0:0:0:0: Attached scsi generic sg0 type 0
  [    5.175415] sd 0:0:0:0: [sda] 976773168 512-byte logical blocks: (500 GB/465 GiB)

  The board is otherwise idle during the test.

  Running perf_2.6.37-12 record -a dd if=/dev/sda of=/dev/null bs=4096 count=100000

  shows:
      81.41%         swapper  [kernel.kallsyms]     [k] default_idle
       6.33%              dd  [kernel.kallsyms]     [k] __copy_to_user
       0.94%         swapper  [kernel.kallsyms]     [k] cpu_idle
       0.51%              dd  [kernel.kallsyms]     [k] __make_request
       0.51%  perf_2.6.37-12  [kernel.kallsyms]     [k] __copy_from_user

  which suggests it's not CPU constrained.
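
  The reasoning behind that conclusion can be checked by summing the perf percentages per command. The sketch below parses the five lines quoted above (only the top entries of the profile, so the totals do not reach 100%); the function names are my own and the column layout is assumed to be as shown.

```python
from collections import defaultdict

# The perf lines quoted in the bug description above.
PERF_LINES = """\
    81.41%         swapper  [kernel.kallsyms]     [k] default_idle
     6.33%              dd  [kernel.kallsyms]     [k] __copy_to_user
     0.94%         swapper  [kernel.kallsyms]     [k] cpu_idle
     0.51%              dd  [kernel.kallsyms]     [k] __make_request
     0.51%  perf_2.6.37-12  [kernel.kallsyms]     [k] __copy_from_user
"""


def parse_perf(text):
    """Return (percent, command, symbol) tuples for each well-formed row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5 or not fields[0].endswith("%"):
            continue  # skip anything that is not a five-column report row
        rows.append((float(fields[0].rstrip("%")), fields[1], fields[4]))
    return rows


def share_by_command(rows):
    """Sum the sampled percentage attributed to each command."""
    totals = defaultdict(float)
    for percent, command, _symbol in rows:
        totals[command] += percent
    return dict(totals)


for command, percent in share_by_command(parse_perf(PERF_LINES)).items():
    print(f"{command:>16}  {percent:6.2f}%")
```

  The idle task (swapper) accounts for about 82% of the samples and dd for about 7%, which is consistent with the transfer waiting on something other than the CPU.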

  Dave

To manage notifications about this bug go to:
https://bugs.launchpad.net/linaro-ubuntu/+bug/709245/+subscriptions