kernel-packages team mailing list archive

[Bug 1564949] Re: Severe latency/skew on AMD Opteron processor

 

apport information

** Tags added: apport-collected uec-images xenial

** Description changed:

  Discovered this while doing pre-release certification testing for 16.04
  on an HP ProLiant DL385p Gen8 with an AMD Opteron 6320 8-core CPU.
  
  Note that I am NOT a C programmer: I know enough C to read it and do
  some minor things, and I once knew C++ fairly well, about 10 years
  ago.  I have some code that essentially does this:
  
  gettimeofday(&tval_start, NULL);
  sleep(sleeptime);
  gettimeofday(&tval_stop, NULL);
  
  where tval_start and tval_stop are timeval structs and sleeptime is 60.
  
  Once it has the start and stop times, it computes the difference
  between them minus the sleep time.
  
  In a perfect world, for example, start time would be 123456.123 and
  end time would be 123516.123 and the delta between them minus the 60
  seconds of sleep would be 0.
  
  Of course, that isn't how it works in reality so the delta may be a
  few microseconds here and there depending on what else the kernel is
  doing at any given moment.  The following, however, are on essentially
  idle Xenial systems (only processes running are whatever Ubuntu Server
  runs by default, nothing really taxing going on).
  
  On my Skylake i7 with Xenial (kernel 4.4.0-15.31), the time differences
  are never more than a few ten-thousandths of a second:
  Testing clock direction for 5 minutes...
  PASSED: Iteration 0 delta: 0.000109
  PASSED: Iteration 1 delta: 0.000068
  PASSED: Iteration 2 delta: 0.000107
  PASSED: Iteration 3 delta: 0.000216
  PASSED: Iteration 4 delta: 0.000089
  
  On a zVM instance (kernel 4.4.0-16.32) it's even better:
  PASSED: Iteration 0 delta: 0.000058
  PASSED: Iteration 1 delta: 0.000058
  PASSED: Iteration 2 delta: 0.000074
  PASSED: Iteration 3 delta: 0.000052
  PASSED: Iteration 4 delta: 0.000062
  
  But on an AMD CPU with Xenial (the only AMD CPU I have access to), the
  difference is always in the tenths of a second and sometimes several
  seconds; I've seen up to a 7.9 second delta with this code.  Here's
  one run that shows 3 seconds in one iteration (kernel 4.4.0-15.31):
  FAILED: Iteration 0 delta: 3.057980
  FAILED: Iteration 1 delta: 0.225712
  FAILED: Iteration 2 delta: 0.241468
  FAILED: Iteration 3 delta: 0.229084
  FAILED: Iteration 4 delta: 0.223933
  
  On a second run on the AMD CPU, the latency was all over the place:
  FAILED: Iteration 0 delta: 9.302149
  FAILED: Iteration 1 delta: 0.624466
  FAILED: Iteration 2 delta: 1.644834
  FAILED: Iteration 3 delta: 1.011474
  FAILED: Iteration 4 delta: 0.923033
  
  According to a discussion with cking and apw, deviations like those
  seen on the Intel and s390 CPUs are about what we should expect to
  see, depending on what the system is doing at the moment
  gettimeofday() is executed.  However, on the AMD CPU, differences of
  up to 9 seconds or more are NOT expected and are highly irregular.
  
  Colin said he tested this on an AMD C60 CPU and got numbers in line
  with the Skylake and s390 chips, and could not reproduce the times I
  am seeing on the Opteron.
  
  $ cat /proc/version_signature 
  Ubuntu 4.4.0-15.31-generic 4.4.6
+ --- 
+ AlsaDevices:
+  total 0
+  crw-rw---- 1 root audio 116,  1 Mar 31 17:21 seq
+  crw-rw---- 1 root audio 116, 33 Mar 31 17:21 timer
+ AplayDevices: Error: [Errno 2] No such file or directory
+ ApportVersion: 2.20-0ubuntu3
+ Architecture: amd64
+ ArecordDevices: Error: [Errno 2] No such file or directory
+ AudioDevicesInUse: Error: [Errno 2] No such file or directory
+ DistroRelease: Ubuntu 16.04
+ HibernationDevice: RESUME=UUID=b6a44b05-ebe0-4d1c-a525-69d4748960f8
+ IwConfig: Error: [Errno 2] No such file or directory
+ MachineType: HP ProLiant DL385p Gen8
+ Package: linux (not installed)
+ PciMultimedia:
+  
+ ProcEnviron:
+  TERM=xterm
+  PATH=(custom, no user)
+  XDG_RUNTIME_DIR=<set>
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
+ ProcFB:
+  
+ ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-15-generic root=UUID=70808172-98ba-4129-a185-20a112bdc4fe ro rootdelay=60
+ ProcVersionSignature: Ubuntu 4.4.0-15.31-generic 4.4.6
+ RelatedPackageVersions:
+  linux-restricted-modules-4.4.0-15-generic N/A
+  linux-backports-modules-4.4.0-15-generic  N/A
+  linux-firmware                            1.157
+ RfKill: Error: [Errno 2] No such file or directory
+ Tags:  xenial uec-images
+ Uname: Linux 4.4.0-15-generic x86_64
+ UpgradeStatus: No upgrade log present (probably fresh install)
+ UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
+ _MarkForUpload: True
+ dmi.bios.date: 02/06/2014
+ dmi.bios.vendor: HP
+ dmi.bios.version: A28
+ dmi.chassis.type: 23
+ dmi.chassis.vendor: HP
+ dmi.modalias: dmi:bvnHP:bvrA28:bd02/06/2014:svnHP:pnProLiantDL385pGen8:pvr:cvnHP:ct23:cvr:
+ dmi.product.name: ProLiant DL385p Gen8
+ dmi.sys.vendor: HP

** Attachment added: "CRDA.txt"
   https://bugs.launchpad.net/bugs/1564949/+attachment/4619525/+files/CRDA.txt

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1564949

Title:
  Severe latency/skew on AMD Opteron processor

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Discovered this while doing pre-release certification testing for
  16.04 on an HP ProLiant DL385p Gen8 with an AMD Opteron 6320 8-core
  CPU.

  Note that I am NOT a C programmer: I know enough C to read it and do
  some minor things, and I once knew C++ fairly well, about 10 years
  ago.  I have some code that essentially does this:

  gettimeofday(&tval_start, NULL);
  sleep(sleeptime);
  gettimeofday(&tval_stop, NULL);

  where tval_start and tval_stop are timeval structs and sleeptime is
  60.

  Once it has the start and stop times, it computes the difference
  between them minus the sleep time.

  In a perfect world, for example, start time would be 123456.123 and
  end time would be 123516.123 and the delta between them minus the 60
  seconds of sleep would be 0.
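
  The measurement described above can be sketched roughly as follows.
  This is only an illustration, not the actual test code: the helper
  names skew_seconds() and measure_skew() are hypothetical, and the
  sleep time is passed in as a parameter instead of being fixed at 60.

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wall-clock time elapsed between start and stop,
 * minus the requested sleep time, in seconds.  The whole-second part is
 * computed with integers first so the perfect-world case gives exactly 0. */
static double skew_seconds(const struct timeval *start,
                           const struct timeval *stop,
                           long sleeptime)
{
    long sec = (long)(stop->tv_sec - start->tv_sec) - sleeptime;
    long usec = (long)(stop->tv_usec - start->tv_usec);

    return (double)sec + (double)usec / 1e6;
}

/* Hypothetical helper: take a timestamp, sleep, take a second timestamp,
 * and store the leftover delta in *delta.
 * Returns 0 on success, -1 if gettimeofday() fails. */
static int measure_skew(unsigned int sleeptime, double *delta)
{
    struct timeval tval_start, tval_stop;

    if (gettimeofday(&tval_start, NULL) != 0)
        return -1;
    sleep(sleeptime);
    if (gettimeofday(&tval_stop, NULL) != 0)
        return -1;

    *delta = skew_seconds(&tval_start, &tval_stop, (long)sleeptime);
    return 0;
}
```

  Calling measure_skew(60, &delta) once per iteration and printing delta
  would give one "Iteration N delta" value per call, like the output
  lines shown in this report.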

  Of course, that isn't how it works in reality so the delta may be a
  few microseconds here and there depending on what else the kernel is
  doing at any given moment.  The following, however, are on essentially
  idle Xenial systems (only processes running are whatever Ubuntu Server
  runs by default, nothing really taxing going on).

  On my Skylake i7 with Xenial (kernel 4.4.0-15.31), the time differences
  are never more than a few ten-thousandths of a second:
  Testing clock direction for 5 minutes...
  PASSED: Iteration 0 delta: 0.000109
  PASSED: Iteration 1 delta: 0.000068
  PASSED: Iteration 2 delta: 0.000107
  PASSED: Iteration 3 delta: 0.000216
  PASSED: Iteration 4 delta: 0.000089

  On a zVM instance (kernel 4.4.0-16.32) it's even better:
  PASSED: Iteration 0 delta: 0.000058
  PASSED: Iteration 1 delta: 0.000058
  PASSED: Iteration 2 delta: 0.000074
  PASSED: Iteration 3 delta: 0.000052
  PASSED: Iteration 4 delta: 0.000062

  But on an AMD CPU with Xenial (the only AMD CPU I have access to), the
  difference is always in the tenths of a second and sometimes several
  seconds; I've seen up to a 7.9 second delta with this code.  Here's
  one run that shows 3 seconds in one iteration (kernel 4.4.0-15.31):
  FAILED: Iteration 0 delta: 3.057980
  FAILED: Iteration 1 delta: 0.225712
  FAILED: Iteration 2 delta: 0.241468
  FAILED: Iteration 3 delta: 0.229084
  FAILED: Iteration 4 delta: 0.223933

  On a second run on the AMD CPU, the latency was all over the place:
  FAILED: Iteration 0 delta: 9.302149
  FAILED: Iteration 1 delta: 0.624466
  FAILED: Iteration 2 delta: 1.644834
  FAILED: Iteration 3 delta: 1.011474
  FAILED: Iteration 4 delta: 0.923033

  According to a discussion with cking and apw, deviations like those
  seen on the Intel and s390 CPUs are about what we should expect to
  see, depending on what the system is doing at the moment
  gettimeofday() is executed.  However, on the AMD CPU, differences of
  up to 9 seconds or more are NOT expected and are highly irregular.

  Colin said he tested this on an AMD C60 CPU and got numbers in line
  with the Skylake and s390 chips, and could not reproduce the times I
  am seeing on the Opteron.

  $ cat /proc/version_signature 
  Ubuntu 4.4.0-15.31-generic 4.4.6
  --- 
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Mar 31 17:21 seq
   crw-rw---- 1 root audio 116, 33 Mar 31 17:21 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.20-0ubuntu3
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 16.04
  HibernationDevice: RESUME=UUID=b6a44b05-ebe0-4d1c-a525-69d4748960f8
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: HP ProLiant DL385p Gen8
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=xterm
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.4.0-15-generic root=UUID=70808172-98ba-4129-a185-20a112bdc4fe ro rootdelay=60
  ProcVersionSignature: Ubuntu 4.4.0-15.31-generic 4.4.6
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-15-generic N/A
   linux-backports-modules-4.4.0-15-generic  N/A
   linux-firmware                            1.157
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  xenial uec-images
  Uname: Linux 4.4.0-15-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups: adm cdrom dip lpadmin lxd plugdev sambashare sudo
  _MarkForUpload: True
  dmi.bios.date: 02/06/2014
  dmi.bios.vendor: HP
  dmi.bios.version: A28
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrA28:bd02/06/2014:svnHP:pnProLiantDL385pGen8:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL385p Gen8
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1564949/+subscriptions

