kernel-packages team mailing list archive

[Bug 1402141] [NEW] Ubuntu 14.10 freezes when use smt-enabled=off as kernel argument

 

You have been subscribed to a public bug:

== Comment: #0 - Paulo Flabiano Smorigo <pfsmorigo@xxxxxxxxxx> - 2014-11-18 12:28:42 ==
With Ubuntu as the host, if you add smt-enabled=off as a kernel argument, the system boots only as far as the "Freeing initrd memory" line and then hangs:
...
[    1.371729] vgaarb: loaded
[    1.372989] SCSI subsystem initialized
[    1.373977] libata version 3.00 loaded.
[    1.374158] usbcore: registered new interface driver usbfs
[    1.374246] usbcore: registered new interface driver hub
[    1.374382] usbcore: registered new device driver usb
[    1.374505] pps_core: LinuxPPS API ver. 1 registered
[    1.374563] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@xxxxxxxx>
[    1.374671] PTP clock support registered
[    1.377135] NetLabel: Initializing
[    1.377218] NetLabel:  domain hash size = 128
[    1.377328] NetLabel:  protocols = UNLABELED CIPSOv4
[    1.377472] NetLabel:  unlabeled traffic allowed by default
[    1.377983] Switched to clocksource timebase
[    1.395029] AppArmor: AppArmor Filesystem Enabled
[    1.402044] NET: Registered protocol family 2
[    1.403795] TCP established hash table entries: 524288 (order: 6, 4194304 bytes)
[    1.408343] TCP bind hash table entries: 65536 (order: 4, 1048576 bytes)
[    1.409301] TCP: Hash tables configured (established 524288 bind 65536)
[    1.409490] TCP: reno registered
[    1.409645] UDP hash table entries: 65536 (order: 5, 2097152 bytes)
[    1.411943] UDP-Lite hash table entries: 65536 (order: 5, 2097152 bytes)
[    1.415409] NET: Registered protocol family 1
[    1.415753] PCI: CLS 128 bytes, default 128
[    1.415962] Trying to unpack rootfs image as initramfs...
[    2.250464] Freeing initrd memory: 21952K (c000000003820000 - c000000004d90000)


Machine Type = Power 8 (S822L)

== Comment: #1 - Thadeu Lima De Souza Cascardo <thadeul@xxxxxxxxxx> - 2014-11-18 13:42:37 ==
What is the firmware version?

Cascardo.

== Comment: #2 - Paulo Flabiano Smorigo <pfsmorigo@xxxxxxxxxx> - 2014-11-19 07:13:35 ==
It is currently FW810.02 (SV810_061). I will update it today.

Smorigo.

== Comment: #3 - Paulo Flabiano Smorigo <pfsmorigo@xxxxxxxxxx> - 2014-11-19 12:47:24 ==
Updated to FW810.20 (SV810_101). Nothing changed.

== Comment: #4 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-11-20 05:47:45 ==
I reproduced it on an S824 with the FW810.20 (TV810_101) firmware, running 14.04.2 "alpha" (kernel 3.16.0-25). The issue doesn't show up with kernel 3.13.0-39. I will try mainline and do a bisect.

== Comment: #5 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-11-20 13:31:03 ==
FYI issue is upstream.

== Comment: #6 - Breno Henrique Leitao <brenohl@xxxxxxxxxx> - 2014-11-24 11:23:04 ==
(In reply to comment #5)
> FYI issue is upstream.

Greg, are you working to solve this issue?

== Comment: #7 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-11-24 12:08:33 ==
(In reply to comment #6)
> (In reply to comment #5)
> > FYI issue is upstream.
> 
> Greg, are you working to solve this issue?

Yes I am.

== Comment: #8 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-12-01 04:56:07 ==
The hang occurs because all running threads are looping in the split core code:

static void wait_for_sync_step(int step)
{
	int i, cpu = smp_processor_id();

	for (i = cpu + 1; i < cpu + threads_per_core; i++)
		while(per_cpu(split_state, i).step < step)
			barrier();


The problem is that the split core code needs all possible threads to participate... if the kernel is booted with smt-enabled set to something different than the maximum value, some threads are missing and this ruins the sync.
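
To make the failure mode concrete, here is a minimal user-space sketch of that wait loop. It is not kernel code: `struct split_state_model`, `wait_for_sync_step_bounded` and the `max_spins` limit are hypothetical names introduced only for illustration, and the model is single-threaded, so a sibling's `step` never changes while we wait. The spin is bounded so that it reports the first sibling that never reached the step instead of hanging the way the real loop does.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-CPU split_state (assumption). */
struct split_state_model {
	int step;
};

/*
 * Bounded model of the wait loop quoted above.
 * Returns -1 if every sibling thread of `cpu` has reached `step`,
 * otherwise the index of the first sibling that had not reached it
 * after `max_spins` polls. The kernel loop has no such limit, so a
 * sibling that was never started makes it spin forever.
 */
static int wait_for_sync_step_bounded(const volatile struct split_state_model *state,
				      int cpu, int threads_per_core,
				      int step, long max_spins)
{
	for (int i = cpu + 1; i < cpu + threads_per_core; i++) {
		long spins = 0;

		while (state[i].step < step) {
			if (++spins >= max_spins)
				return i;	/* this sibling never showed up */
		}
	}
	return -1;
}
```

With all siblings at the requested step the function returns -1. If the siblings were left out at boot (as with smt-enabled=off) their `step` stays at 0 and the function returns the first missing sibling, which corresponds to the point where the real loop never terminates.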

== Comment: #9 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-12-01 05:24:28 ==
The current implementation for smt-enabled= is a hack: it simply leaves hw threads looping where they happen to be (firmware probably)... This isn't acceptable in a production environment.

An "acceptable" fix would be to start all threads anyway and offline the
ones that need to be to honour the requested SMT mode AFTER subcores
init. This requires a non-trivial patch.

Since changing SMT mode from userspace when the system is booted is
really straightforward, Michael Ellerman suggests we simply drop that
smt-enabled= feature.

Smorigo,

Why were you using smt-enabled= ? Is there a reason not to do it after the system is booted, with
ppc64_cpu --smt or by writing directly to /sys/devices/system/cpu/cpu*/online ?
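
As a rough sketch of the sysfs route mentioned above, the following C helpers decide which logical CPUs stay online for a requested SMT mode and write "0" or "1" to each CPU's `online` file. The function names, the `sysfs_cpu_dir` parameter, and the assumption that the threads of a core are numbered consecutively (for example 8 threads per core on POWER8) are illustrative and not taken from the bug report. Writing to these files requires root, and this is not a replacement for ppc64_cpu.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * True if logical CPU `cpu` should stay online for the requested SMT mode.
 * Assumes the threads of one core have consecutive CPU numbers, so the
 * first `smt_mode` threads of every core are kept.
 */
static bool cpu_should_be_online(int cpu, int threads_per_core, int smt_mode)
{
	return (cpu % threads_per_core) < smt_mode;
}

/*
 * Write "1" or "0" to <sysfs_cpu_dir>/cpu<N>/online.
 * Returns 0 on success, -1 on any failure.
 */
static int set_cpu_online(const char *sysfs_cpu_dir, int cpu, bool online)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/cpu%d/online", sysfs_cpu_dir, cpu);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	FILE *f = fopen(path, "w");
	if (!f)
		return -1;

	int rc = (fputs(online ? "1" : "0", f) == EOF) ? -1 : 0;
	if (fclose(f) == EOF)
		rc = -1;
	return rc;
}

/* Apply an SMT mode to CPUs 0..nr_cpus-1; returns the number of failed writes. */
static int apply_smt_mode(const char *sysfs_cpu_dir, int nr_cpus,
			  int threads_per_core, int smt_mode)
{
	int failures = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		bool online = cpu_should_be_online(cpu, threads_per_core, smt_mode);

		if (set_cpu_online(sysfs_cpu_dir, cpu, online) != 0)
			failures++;
	}
	return failures;
}
```

Called from a small boot-time helper as `apply_smt_mode("/sys/devices/system/cpu", nr_cpus, 8, 1)`, this would aim for the same effect as smt-enabled=off, but only after all threads have been started by the kernel.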

== Comment: #10 - Paulo Flabiano Smorigo <pfsmorigo@xxxxxxxxxx> - 2014-12-01 06:23:34 ==
I used smt-enabled= because for me it was the easiest way to disable it: just add this parameter to GRUB_CMDLINE_LINUX and done. :)

I'll check if there is a problem to drop it.

== Comment: #11 - Paulo Flabiano Smorigo <pfsmorigo@xxxxxxxxxx> - 2014-12-01 08:30:55 ==
Greg, are you saying we should drop it for good? Maybe we can add that as a feature request for next year. Btw, I'm ok with dropping it for now.

== Comment: #12 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-12-01 09:30:00 ==
(In reply to comment #11)
> Greg, are you saying we should drop it for good? Maybe we can add that as a
> feature request for next year. Btw, I'm ok with dropping it for now.

Yes, drop it for good as suggested by Michael Ellerman...

<mpe> groug: that smt-enabled stuff is just a hack. It leaves the cpu executing wherever it happens to be, possibly in firmware, possibly busy looping somewhere, it's really no good for use in production
<mpe> the only way we could make it usable I think is to have the cpu come up, and then we offline it
<mpe> but I'm really inclined to say that should just be done in userspace
<groug> mpe, yeah... I had thought of something similar (starting and then offlining) but I agree it should be handled from userspace
<mpe> I'll talk to benh and anton about it tomorrow, but I think we just rip it out

The point is that it is already extremely easy to change SMT mode from
an init script and you get the same result... compared to the hassle of
doing it in the kernel without breaking things. Not even worth a feature
request I would say.

== Comment: #13 - Greg Kurz <KURZGREG@xxxxxxxxxx> - 2014-12-12 08:50:25 ==
I've sent a patch:

powerpc/powernv: force all CPUs to be bootable

http://patchwork.ozlabs.org/patch/420440/

** Affects: linux (Ubuntu)
     Importance: High
         Status: Triaged


** Tags: architecture-ppc64le bugnameltc-119051 severity-medium targetmilestone-inin---
-- 
Ubuntu 14.10 freezes when use smt-enabled=off as kernel argument
https://bugs.launchpad.net/bugs/1402141
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.