tiomap-dev team mailing list archive

[Bug 918412] Re: OMAP Panda does not boot on Linux-linaro 3.2 branch

 

Andrey, I'd appreciate it if you can test it next week.  The time I had
to spend on 3.2 has already all gone on fighting this.

However, this bug is not critical for TILT, and probably not for
"Linaro Linux" either.

TI have a need for us to spend a little time on 3.1 stuff and a major,
pressing need for us to work on tracking (3.3-rc1+) with them.  3.2 is
not featuring at all in their plans with us.

Linaro Ubuntu builds will not ship with 3.2 either, since there's a
syslink regression again.  Although both Jassi and I spent some days on
it, fixing it is something TI explicitly don't want us to focus on, since
they want syslink 3 / rpmsg work now.  Ricardo is au fait with that and
plans to continue to ship 3.1 (which, because of TI's need for 3.1, we
might be able to support a little).  Again, the time we had for that 3.2
issue is all gone.

Linaro Android is shipping our kernel in a workable state for ICS, but
it's the tracking kernel.  Well, they ship the 3.0 AOSP kernel too.

Further, after I gave up on basing on linux-linaro-tracking for now (and
the fact I moved my base to it shows I really wanted to be using it
ongoing, fwiw) and reverted all my topics that had been adapted to it, I
started rebasing on 3.3-rc1 and found that the content in
tracking-topic-pm has all gone upstream in improved form, including the
barrier implementation, which is radically changed.  Assuming the
upstream version works with the newer stuff from linux-linaro-tracking,
trying to fix the rotted 3.2 pm code is wasted effort.

Therefore, unless someone like Andrey can send me a working merge that I
just have to pull and push, I'm not going to create a linux-linaro-3.2
this time around.  I'll work on tracking; when that is in reasonable
shape I will try again, fresh, to separately merge (not base on)
linux-linaro-tracking, and we can discuss the results of that earlier in
the kernel cycle for 3.3.

What are the lessons?

1) This was the introduction of linux-linaro-tracking, and it happened
late in the cycle.  I think it's the right path, right methodology, right
content, and Andrey is doing fine.

2) Between 9th Jan, when I posted on linaro-kernel with this issue, and a
few days ago, when npitre pointed out the root cause of the initial bug
(which doesn't solve the barrier issue), this issue did not get treated
as 'critical' at all; I was the only person working on it.  Calling it
critical when major consumers in Linaro don't care and there's no time
left is arguably too narrow a focus on what actually is critical.  With
the above in mind, for us this is even more wishlist than low.

3) Because linux-linaro-tracking intentionally carries 'nexty' content,
there is every reason to expect the merge will conflict and the result
will burn.  It's even reasonable to expect that sometimes the sparks from
the burning code will uncover real, large issues that need discussion
upstream for a particular arch, with solutions that are not close by.
However, the approach is really good, since it means we get to look at,
and hopefully solve, upcoming issues faster than with pure Linus HEAD.
But that pursuit is something to do when we have spare cycles, and right
now at -rc1 we are underwater.

-- 
You received this bug notification because you are a member of TI OMAP
Developers, which is subscribed to linaro-landing-team-ti.
https://bugs.launchpad.net/bugs/918412

Title:
  OMAP Panda does not boot on Linux-linaro 3.2 branch

Status in Linaro Texas Instruments Landing Team:
  New
Status in Linaro Linux:
  Triaged

Bug description:
  Reported by Andy Green:

  Quite early in boot, it hits a new BUG() in the code around
  iotable_init().  I added some debug and see it's blowing up when trying
  to define io memory in omap_barriers_init().  A bunch of earlier
  machine-defined iotable_init() calls go OK.

  [    0.000000] iotable_init: adding addr=DA000000 size=02000000 phys_addr=9A000000
  [    0.000000] iotable_init: adding addr=F8000000 size=00100000 phys_addr=44000000
  [    0.000000] iotable_init: adding addr=FC000000 size=00400000 phys_addr=4A000000
  [    0.000000] iotable_init: adding addr=F9000000 size=00100000 phys_addr=50000000
  [    0.000000] iotable_init: adding addr=FD100000 size=00100000 phys_addr=4C000000
  [    0.000000] iotable_init: adding addr=FD200000 size=00100000 phys_addr=4D000000
  [    0.000000] iotable_init: adding addr=FD300000 size=00100000 phys_addr=4E000000
  [    0.000000] iotable_init: adding addr=FA000000 size=00400000 phys_addr=48000000
  [    0.000000] iotable_init: adding addr=FE800000 size=00800000 phys_addr=54000000
  ...
  [    0.538024] omap_barriers_init: calling iotable_init
  [    0.543243] iotable_init: adding addr=FE600000 size=00100000 phys_addr=AF600000
  [    0.550872] ------------[ cut here ]------------
  [    0.555725] Kernel BUG at c087c74c [verbose debug info unavailable]
  [    0.562255] Internal error: Oops - undefined instruction: 0 [#1] PREEMPT SMP
  [    0.569610] Modules linked in:
  [    0.572845] CPU: 1    Tainted: G        W (3.2.0-panda_tracking-topic-syslink-k3u+ #6)
  [    0.581451] PC is at vm_area_add_early+0x20/0x84
  [    0.586303] LR is at iotable_init+0xa8/0xbc
  [    0.590698] pc : [<c087c74c>]    lr : [<c086cd48>]    psr: 20000113
  [    0.590698] sp : ef78ff54  ip : ef78fe10  fp : 00000000
  [    0.602691] r10: c086cca0  r9 : 00000000  r8 : 40000001
  [    0.608154] r7 : ef7ffee0  r6 : ef78ff90  r5 : ef7ffec0  r4 : 00000001
  [    0.614959] r3 : 00000001  r2 : 00000000  r1 : 00000001  r0 : ef7ffec0
  [    0.621765] Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM Segment kernel
  [    0.629394] Control: 10c5387d  Table: 8000404a  DAC: 00000015
  [    0.635406] Process swapper/0 (pid: 1, stack limit = 0xef78e2f8)
  [    0.641662] Stack: (0xef78ff54 to 0xef790000)
  [    0.646240] ff40: 00000001 00000000 af600000
  [    0.654754] ff60: c09299e0 ef78e000 00000000 00000000 c0870f8c c0871070 c08d7f9c c08a7bf8
  [    0.663269] ff80: fe600000 000af600 00100000 0000000e c08a7bfc c0008870 0000009f c009f590
  [    0.671783] ffa0: 0000009f c0870f8c c0014640 00393531 00000000 c0150000 00000000 c08e5604
  [    0.680297] ffc0: 0000019a c08a7bfc c08a842c c0014640 00000013 00000000 00000000 00000000
  [    0.688812] ffe0: 00000000 c08668bc 00000000 00000000 c0866830 c0014640 55555555 45515555
  [    0.697326] [<c087c74c>] (vm_area_add_early+0x20/0x84) from [<00000000>] (  (null))
  [    0.705322] Code: 059f3068 01a0c003 05933000 0a000010 (e7f001f2)
  [    0.711700] ---[ end trace 1b75b31a2719ed1d ]---
  [    0.716552] Kernel panic - not syncing: Attempted to kill init!
  [    0.722747] CPU0: stopping
  [    0.725646] [<c001a930>] (unwind_backtrace+0x0/0xf8) from [<c0018cf8>] (handle_IPI+0x114/0x140)
  [    0.734710] [<c0018cf8>] (handle_IPI+0x114/0x140) from [<c00086c0>] (gic_handle_irq+0x88/0xac)
  [    0.743682] [<c00086c0>] (gic_handle_irq+0x88/0xac) from [<c05fa0c0>] (__irq_svc+0x40/0x70)
  [    0.752380] Exception stack(0xc08adf70 to 0xc08adfb8)
  [    0.757659] df60:                                     ffffffed 00000000 c08adfb8 00000000
  [    0.766174] df80: c08ac000 c0929aa8 c0603f50 c08c9bd8 c08c9d98 412fc09a 00000000 00000000
  [    0.774688] dfa0: 00000000 c08adfb8 c00146bc c00146c0 60000013 ffffffff
  [    0.781616] [<c05fa0c0>] (__irq_svc+0x40/0x70) from [<c00146c0>] (default_idle+0x24/0x28)
  [    0.790130] [<c00146c0>] (default_idle+0x24/0x28) from [<c001493c>] (cpu_idle+0xfc/0x11c)
  [    0.798645] [<c001493c>] (cpu_idle+0xfc/0x11c) from [<c08667e0>] (start_kernel+0x260/0x2b0)
  [    0.807342] [<c08667e0>] (start_kernel+0x260/0x2b0) from [<80008044>] (0x80008044)

  You can see what's going on in omap_barriers_init() here:

  http://git.linaro.org/gitweb?p=landing-teams/working/ti/kernel.git;a=blob;f=arch/arm/mach-omap2/omap4-common.c;h=ad8c30d610736dde454bfe81f764f4a98b902dbf;hb=refs/heads/temp#l97
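
  For anyone comparing the mappings in the log above, here is a rough
  sketch of a script that tabulates the iotable_init debug lines and picks
  out the ones logged with a non-zero timestamp.  The function names are
  made up for illustration, and the line format is simply what the added
  debug printk in this report emits, not anything in mainline.  A non-zero
  timestamp is only a hint that the call came later in boot than the
  machine map_io calls; it does not by itself establish the cause.

```python
import re

# Matches one iotable_init debug line, tolerating mail line-wrapping
# between the size= and phys_addr= fields. The format is taken from the
# debug output in this bug report (an assumption, not a mainline format).
ENTRY = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+iotable_init: adding\s+"
    r"addr=(?P<virt>[0-9A-Fa-f]+)\s+size=(?P<size>[0-9A-Fa-f]+)\s+"
    r"phys_addr=(?P<phys>[0-9A-Fa-f]+)"
)


def parse_iotable_log(text):
    """Return a list of (timestamp, virt, size, phys) tuples from a log."""
    return [
        (
            float(m.group("ts")),
            int(m.group("virt"), 16),
            int(m.group("size"), 16),
            int(m.group("phys"), 16),
        )
        for m in ENTRY.finditer(text)
    ]


def late_mappings(entries):
    """Return the entries whose log timestamp is greater than zero."""
    return [e for e in entries if e[0] > 0.0]


# Two entries copied from the log in this report, still line-wrapped.
SAMPLE = """\
[    0.000000] iotable_init: adding addr=FE800000 size=00800000
phys_addr=54000000
[    0.543243] iotable_init: adding addr=FE600000 size=00100000
phys_addr=AF600000
"""

if __name__ == "__main__":
    for ts, virt, size, phys in late_mappings(parse_iotable_log(SAMPLE)):
        print(f"{ts:.6f} virt={virt:08X} size={size:08X} phys={phys:08X}")
```

  On the two sample lines this reports only the FE600000 mapping requested
  from omap_barriers_init().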

  From Andrey K.:

  
  I haven't dug deep into that, but the description of rmk's commit
  716a3dc20084 says that
  "Several platforms are now using the memblock_alloc+memblock_free+
  memblock_remove trick to obtain memory which won't be mapped in the
  kernel's page tables.  Most platforms do this (correctly) in the
  ->reserve callback.  However, OMAP has started to call these functions
  outside of this callback, and this is extremely unsafe - memory will
  not be unmapped, and could well be given out after memblock is no
  longer responsible for its management."

  A vanilla v3.2 kernel patched with "ARM: OMAP4: Fix errata i688 with MPU
  interconnect barriers" passes omap_barriers_init() OK (but crashes later
  in my setup).  But a "v3.2 merged with rmk's devel-stable" kernel hits a
  BUG() inside the newly introduced omap_barriers_init() (the log is in
  the original report above).  It looks like "ARM: OMAP4: Fix errata i688
  with MPU interconnect barriers" needs to be fixed anyway, but there is
  probably something in rmk's devel-stable tree that exposes the problem
  of calling the memblock_alloc+memblock_free+memblock_remove functions
  outside of the ->reserve callback.
