firmware-testing-team team mailing list archive

Re: Fwd: Re: Using FWTS for Certification

 

Hi

On 10/04/2012 01:55 AM, Alex Hung wrote:
> On 10/04/2012 12:30 AM, Brendan Donegan wrote:
>> On 24/09/12 14:11, Brendan Donegan wrote:
>>> On 20/09/12 16:57, Jeff Lane wrote:
>>>>
>>>>
>>>>
>>>> -------- Original Message --------
>>>> Subject: Re: [Firmware-testing-team] Using FWTS for Certification
>>>> Date: Wed, 19 Sep 2012 17:27:37 +0100
>>>> From: Colin Ian King <colin.king@xxxxxxxxxxxxx>
>>>> To: firmware-testing-team@xxxxxxxxxxxxxxxxxxx
>>>>
>>>> On 19/09/12 16:19, Brendan Donegan wrote:
>>>>> (I hope this is the right mailing list)
>>>>>
>>>>> Hi,
>>>>>
>>>>> In Hardware Certification we have adopted FWTS for testing power
>>>>> management (amongst other things), including S3, shutdown and
>>>>> reboot. At
>>>>> the moment it is giving us a few issues because FWTS is quite strict
>>>>> about how it checks for kernel warnings and other issues. For Hardware
>>>>> Certification we need to be less strict, but without ignoring any
>>>>> problems which might be worthy of preventing the systems from being
>>>>> certified. The purpose of this email is to kick off a dialogue about
>>>>> which errors we really need to pay attention to for certification.
>>>>
>>>> Can you elaborate a little more about the features that are causing you
>>>> problems and which ones are useful?  Maybe some examples too?
>>> I'm working my way through all the failures we've seen, but just to
>>> get the ball rolling, I offer this one as an example:
>>>
>>>      00054 summary         High failures: 1
>>>      00055 summary          klog test, at 1 log line: 15
>>>      00056 summary           "HIGH Kernel message: [ 1918.822903]
>>>      [Firmware Bug]: ACPI: No _BQC method, cannot determine initial
>>>      brightness"
> 
> _BQC is always interesting because the ACPI spec does not explicitly say
> it is required for brightness but only sort of hints at it; however, the
> Linux kernel treats its absence as a bug and I don't think that is
> incorrect. We always suggest that ODM/OEM BIOS vendors include _BQC in
> order to fix this error message.
> 
>>>
>>>
>>> Seen when running reboot tests. I assume the net effect of this bug
>>> would be to reset the brightness upon every boot? Or would it break
>>> the brightness control completely?
>> Here is another one we have come across on a Dell Optiplex XE during
>> poweroff and reboot tests:
>>
>> FAILED [CRITICAL] KlogPciACPIOSCRequestFailedreturned: Test 1, CRITICAL
>> Kernel message: [ 0.530647] pci0000:00: ACPI _OSC request failed
>> (AE_NOT_FOUND), returned control mask: 0x1d
>>
>> ADVICE: The _OSC method evaluation failed, which will result in
>> disabling PCIe functionality, for example, the Linux kernel has to
>> disable Active State Power Management (ASPM) which means that PCIe
>> power management is not optimally configured.
>>
>> Can we get an idea of whether we can actually ignore these errors, why
>> they are considered critical and start a discussion about what we can do
>> to mitigate these failures?
>>
> 
> AE_NOT_FOUND happens when a "Name" (i.e. a variable) that is declared
> cannot be accessed. I have seen it a few times and this needs BIOS fixes
> most, if not all, of the time.
> 
> More analysis can be done by reading dmesg and acpidump.
> 
> This failure is critical because an ACPI method (i.e. _OSC in this case)
> fails halfway and there may be function loss. This, however, may not
> have stability impacts because the kernel may treat it as "function not
> supported" - i.e. as if _OSC did not exist.
> 
> For instance, _OSC is necessary for PCIE but not for PCI, and the new
> features in PCIE may simply not be enabled. Having said that, not all
> PCIE-related drivers depend on the evaluation of _OSC, and not all
> features will be lost - this is a good example that the BIOS only
> "suggests" how the OS behaves, not necessarily controls it.

Thanks Alex -- I think what Brendan is after here is more determinism in
the failures, so that the Cert Team knows when a bug should be opened. It
sounds like in this case the fix may be as simple as changing an option in
the BIOS, or it may be a genuine BIOS issue. Perhaps we should think about
a new sub-classification of CRITICAL to make these easier to parse,
CRITICAL-NOTE or something similar, where the issue should be noted but may
not be a bug that can be fixed via the OS and may require a BIOS config
change and/or update.
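
To make the idea concrete, here is a rough sketch of how the certification side might apply such a label when post-processing failure lines. This is only an illustration: the CRITICAL-NOTE suffix, the `triage` function and the pattern list are hypothetical and are not existing FWTS features. The two patterns are taken from the example messages quoted earlier in this thread.

```python
import re

# Hypothetical list of messages that point at firmware rather than the OS.
# Both patterns come from the failures quoted in this thread; a real list
# would need to be agreed between the firmware and certification teams.
FIRMWARE_NOTE_PATTERNS = [
    re.compile(r"\[Firmware Bug\]"),
    re.compile(r"ACPI _OSC request failed"),
]

# Severity keywords as they appear in the quoted failure lines.
SEVERITY_RE = re.compile(r"\b(CRITICAL|HIGH|MEDIUM|LOW)\b")


def triage(line):
    """Return a triage label for one failure line, or None if the line
    carries no upper-case severity keyword."""
    match = SEVERITY_RE.search(line)
    if match is None:
        return None
    severity = match.group(1)
    # Assumption: a firmware-related message keeps its severity but gains
    # a "-NOTE" suffix, so it is recorded without blocking certification.
    if any(p.search(line) for p in FIRMWARE_NOTE_PATTERNS):
        return severity + "-NOTE"
    return severity
```

With this sketch, the _OSC failure above would come out as CRITICAL-NOTE and the _BQC message as HIGH-NOTE, while a critical message that matches none of the patterns would stay CRITICAL.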

Thoughts?

>> Thanks,
>>
>>>>
>>>> Colin
>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Mailing list: https://launchpad.net/~firmware-testing-team
>>>>> Post to     : firmware-testing-team@xxxxxxxxxxxxxxxxxxx
>>>>> Unsubscribe : https://launchpad.net/~firmware-testing-team
>>>>> More help   : https://help.launchpad.net/ListHelp


