linaro-release team mailing list archive

[Bug 855050] Re: Needs to cope with transient failure better

 

I agree there are ideas for how to handle or improve this, but I'd like
to point out that in trying to do so we may make things worse instead. I
don't think the current rate of random failures, weighed against the
possible benefits and the possible risks, calls for any action. Let's use
this ticket to collect stats on such failures; so far the rate has been
very low.
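
As a rough illustration of collecting such stats, the sketch below counts
the fetch error messages quoted in the bug description in a saved console
log. The function name and the choice to treat these three messages as
signs of a transient failure are assumptions made for illustration; none
of this is part of the existing build tools.

```python
import re
from collections import Counter

# Error messages copied from the failed build's console output quoted in
# this bug. Treating them as transient-failure signatures is an assumption.
TRANSIENT_PATTERNS = {
    "remote_hung_up": re.compile(r"fatal: The remote end hung up unexpectedly"),
    "bad_pack_header": re.compile(r"fatal: protocol error: bad pack header"),
    "cannot_fetch": re.compile(r"error: Cannot fetch \S+"),
}


def count_transient_failures(log_text):
    """Return a Counter of how many log lines match each known signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in TRANSIENT_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Run over each failed build's console log, the counts could be summed per
week to see whether the failure rate stays low.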


** Changed in: linaro-android-mirror
   Importance: Undecided => Low

** Project changed: linaro-android-mirror => linaro-android-build-tools

-- 
You received this bug notification because you are a member of
linaro-infrastructure-drivers, which is the registrant for
linaro-android-mirror.
https://bugs.launchpad.net/bugs/855050

Title:
  Needs to cope with transient failure better

Status in Linaro Android Build Tools:
  New

Bug description:
  Hi

  Filing this here so it's on a Linaro project, but we may fix it
  elsewhere.

  https://android-build.linaro.org/jenkins/job/linaro-android_staging-omap4460/2/parsed_console/?

  failed with

  error: Ref refs/remotes/origin/linaro_android_2.3.5 is at c1c999294e24a97466b4a863a96dc09d8c389088 but expected 67102998c3a453a1bd173ed3eec4bfafe077ba7a
  From git://android.git.linaro.org/platform/manifest
   ! 6710299..c1c9992 linaro_android_2.3.5 -> origin/linaro_android_2.3.5 (unable to update local ref)
  From git://android.git.linaro.org/platform/manifest
   6710299..c1c9992 linaro_android_2.3.5 -> linaro_android_2.3.5
  From git://android.git.linaro.org/platform/external/busybox
   + 17afabe...5c6ba6c master -> master (forced update)

  Initializing project kernel/omap-omapzoom ...
  fatal: The remote end hung up unexpectedly
  fatal: protocol error: bad pack header
  error: Cannot fetch kernel/omap-omapzoom

  error: Exited sync due to fetch errors

  which was apparently the connection with android.git.linaro.org
  closing prematurely.

  The next build (immediately following) worked fine.

  It may be that transient errors are handled, and this downtime just exceeded
  a threshold, but it's likely that they aren't handled at all.

  In order to have a reliable service, short-term transient errors should be
  transparently handled.

  We could do this by running repo again if it fails. Given that repo takes
  10 minutes even when there is nothing to do, that may be rather expensive
  for a 10-second outage.

  We could have repo re-run git if it fails, which would be more direct.

  Or git could try reconnecting.

  Or all of the above.
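
The first option above (running repo again if it fails) could be sketched
roughly as follows. This is a minimal illustration, not the build tools'
actual code: the function name, the number of attempts, and the delays are
all assumptions, and it retries on any non-zero exit status rather than
only on errors known to be transient.

```python
import subprocess
import time


def run_with_retry(cmd, attempts=3, base_delay=5.0,
                   run=subprocess.call, sleep=time.sleep):
    """Run cmd, retrying with exponential backoff while it exits non-zero.

    Returns the exit status of the last attempt. The run and sleep
    parameters exist so the function can be exercised without actually
    running a command or waiting.
    """
    status = 1
    for attempt in range(attempts):
        status = run(cmd)
        if status == 0:
            return status
        if attempt < attempts - 1:
            # Back off 5 s, 10 s, 20 s, ... so a short outage can clear
            # before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return status


# Hypothetical usage; the real build scripts may pass other options:
#     status = run_with_retry(["repo", "sync"])
```

Because each retry repeats the whole sync, this carries the cost noted
above; retrying inside repo or git would be cheaper but needs changes to
those tools.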

  Thanks,

  James

To manage notifications about this bug go to:
https://bugs.launchpad.net/linaro-android-build-tools/+bug/855050/+subscriptions

