
kernel-packages team mailing list archive

[Bug 1269777] Re: [ Ubuntu 14.04 - Lenovo W510 - SSD Samsung 840 PRO] Sudden Read-Only Filesystem

 

##########################################################

DIARY since Wed, 15 Jan 2014
-----------------------------------------
Please see the log attached to this post for the last 6 days.

##########################################################
CONCLUSION (mainline kernel)
----------------------------
Wed, 22 Jan 2014 10:58:57 +0100
 10:58:57 up 23:45,  4 users,  load average: 0,11, 0,12, 0,19

Finally got a fairly stable system, with some errors remaining.


A. What DID "solve" the problem:
----------------------------------

1. Disabling POWER: PCIe / PCI power management in the BIOS.
It seemed to work, but did not in the end. After a while a hard reset occurred and the system froze.

That is why additional steps are necessary:

2. The tlp package seems to force PCIe power management, at least with my settings (see the previously attached files).
Since freezes also occurred with Ubuntu 12.04, the problem is probably not the tlp package itself but the machine.
Removing tlp seems to work.

3. Additionally disabling PCIe power management by adding the boot
parameter libata.noacpi=1 seems to be crucial.

4. Adding "options libata noacpi=1" in "/etc/modprobe.conf" is also
crucial for getting a stable system after suspend or resume.

5. I also applied the Thinkpad dock / undock scripts from comment #34 (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1269777/comments/34).
Without this, I encountered various problems when docking / undocking the machine.
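
One way to confirm after a reboot that the boot parameter from step 3 is really active is to inspect the kernel command line. The following is a minimal Python sketch; the helper names parse_cmdline and noacpi_enabled are made up for illustration, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into a {name: value} mapping.

    Parameters without '=' (for example 'quiet') map to None.
    Quoted values containing spaces are not handled; this is only a sketch.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def noacpi_enabled(cmdline: str) -> bool:
    """Return True if libata.noacpi=1 appears on the command line."""
    return parse_cmdline(cmdline).get("libata.noacpi") == "1"


if __name__ == "__main__":
    # On a running system the command line can be read from /proc/cmdline.
    try:
        with open("/proc/cmdline") as fh:
            current = fh.read()
    except OSError:
        current = ""
    print("libata.noacpi=1 active:", noacpi_enabled(current))
```

This only shows what the kernel was booted with; it says nothing about the module option set via modprobe.conf.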


B. What did NOT solve the problem:
-------------------------------

1. Using a different SSD (Samsung 840 Pro instead of 840 EVO) made no difference.
The same problems occurred with both drives.

2. Using the mainline, the standard or an older kernel led to the same
problems.

3. Adding the libata.noncq=1 kernel option is not necessary and not advised for performance reasons (see comment #44).
Nevertheless, the error messages in dmesg "failed to get NCQ Send/Recv Log Emask 0x1" remain.

4. Using the SATA/Compatibility mode in bios.

5. I also applied the kernel parameters "libata.force=1:3.0Gbps,2:1,5Gbps,2:80c modprobe.blacklist=pata_acpi", which led to the same results: ata1 was limited to 1.5 Gbps after a while.
I also created an /etc/modprobe.conf with this content: "options libata noacpi=1 force=1:3.0Gbps,2:1.5Gbps,2:40c"
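
A side note on the two parameter strings in item 5: the kernel documentation describes libata.force as a comma-separated list of [ID:]VAL entries, so a decimal comma as in "1,5Gbps" would presumably be read as two separate entries. The Python sketch below only mimics that comma splitting; it is not the kernel's parser, and split_force is a hypothetical helper name.

```python
def split_force(value: str) -> list:
    """Split a libata.force value into (id, val) pairs.

    Assumption: entries are separated by commas, and an optional ID is
    separated from the value by the first colon. An entry without an ID
    is returned with id None.
    """
    entries = []
    for item in value.split(","):
        ident, sep, val = item.partition(":")
        if sep:
            entries.append((ident, val))
        else:
            entries.append((None, item))
    return entries


if __name__ == "__main__":
    for text in ("1:3.0Gbps,2:1,5Gbps,2:80c", "1:3.0Gbps,2:1.5Gbps,2:40c"):
        print(text, "->", split_force(text))
```

Under this assumption the comma variant yields four entries, with "2:1" followed by a bare "5Gbps", while the dotted variant yields the three intended entries. Whether the kernel accepted or ignored the split entries was not checked here.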

C. What still causes problems:
-------------------------------

1. Hibernation seems to trigger hard resets. In the end this
forces the kernel to limit the ata1 device to 1.5 Gbps (when
libata.noacpi=1 is applied).

2. After a while ata1 is limited to 1.5 Gbps.

3. Setting modprobe.conf and kernel parameters is far from perfect.
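
To see when the link gets downgraded, the kernel log can be scanned for the link messages quoted in the bug description. A minimal Python sketch, assuming the message wording shown there ("limiting SATA link speed to ..." and "SATA link up ..."); link_events is an illustrative name, not an existing tool.

```python
import re

# Patterns follow the dmesg lines quoted in the bug description.
LINK_RE = re.compile(
    r"(?P<port>ata\d+): (?P<kind>limiting SATA link speed to|SATA link up) "
    r"(?P<speed>\d+(?:\.\d+)?) Gbps"
)


def link_events(log_text: str) -> list:
    """Return (port, kind, speed_gbps) tuples in order of appearance."""
    events = []
    for line in log_text.splitlines():
        match = LINK_RE.search(line)
        if match:
            kind = "limit" if match["kind"].startswith("limiting") else "up"
            events.append((match["port"], kind, float(match["speed"])))
    return events


if __name__ == "__main__":
    sample = """\
ata1: limiting SATA link speed to 1.5 Gbps
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
"""
    for event in link_events(sample):
        print(event)
```

Fed with the text of dmesg, this gives a short timeline of speed limits and link-up events per port.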


D. FURTHER TESTING
-------------------

1. Fit the SSD from another Thinkpad (X201 / Samsung EVO 830) into this machine for testing purposes,
just to see whether this is really a problem with the Samsung 840 series SSD.

2. Test the "safe" settings with the Trusty mainline kernel.

3. Talk to the maintainer of the tlp package - this might resolve
the interference with PCIe power management.

4. Test a normal hard disk with PCIe power management on, without the libata kernel boot
options and without tlp, and see if it works.

Please be patient, this could take a few days, if not weeks.


E. SUMMARY
----------

The problem occurs with PCIe power management ENABLED in the BIOS and WITHOUT the libata.noacpi=1 kernel parameter.
Also, the tlp package seems to force the use of PCIe power management and should not be used with this machine / BIOS.

On January 15th I had a running system (see the beginning of the log attached to this post) with kernel 3.12.7.
It used exactly the same settings as in [1. - 3.], except with the 3.12.7 kernel.
So I think these are the minimum requirements for a stable system.

F. QUESTIONS
---------

1. Is there any chance that this might be a problem that can be fixed by
the kernel?

2. Does anybody know how to report this to Lenovo?

3. Could this be a hardware problem, not just firmware / BIOS
related?


G. ATTACHMENTS:
------------

Attached you'll find dmesg.log for the current uptime since the last boot on
Sun, 19 Jan 2014 12:32:45 +0100.


** Attachment added: "dmesg_egrep_ata_scsi_BOOT.log"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1269777/+attachment/3954221/+files/dmesg_egrep_ata_scsi_BOOT.log

-- 
https://bugs.launchpad.net/bugs/1269777

Title:
  [ Ubuntu 14.04 - Lenovo W510 - SSD Samsung 840 PRO] Sudden Read-Only
  Filesystem

Status in “linux” package in Ubuntu:
  Confirmed

Bug description:
  As mentioned in bug #1266305 (and before that in bug #1265309), the
  following problem occurred:

  1. System
  --------------------------

  - Ubuntu 14.04 LTS (current development branch)
  - linux-image-generic v. 3.13.x, mainline kernel 3.13.x (first testing the standard kernel), last 3.12 mainline kernel
  - lvm with trim support (batched discard)
  - tlp (currently disabled for testing purposes)

  2. Hardware:
  ---------------------------
  - Lenovo W510 i7 quad core
  - SSD Samsung 840 Pro (formerly Samsung 840 EVO; it was swapped in the hope that the problem would disappear)
  - nvidia graphics

  BIOS (defaults F9) and:

  - BIOS Mode Standard 1: SATA: AHCI, POWER: pci and pcie power management AUTOMATIC
  - BIOS Mode Standard 2: SATA: AHCI, POWER: pci and pcie power management OFF

  - BIOS Mode Compat 1: SATA: Compatibility Mode, POWER: pci and pcie power management AUTOMATIC
  - BIOS Mode Compat 2: SATA: Compatibility Mode, POWER: pci and pcie power management OFF

  3. Problem description
  -----------------------------------------------

  The SSD randomly freezes with the following errors:
  ata1: EH complete
  ata1: limiting SATA link speed to 1.5 Gbps
  ata1.00: exception Emask 0x52 SAct 0x1 SErr 0x1a80d00 action 0x6 frozen
  ata1.00: irq_stat 0x08000000, interface fatal error
  ata1: SError: { UnrecovData Proto HostInt 10B8B BadCRC LinkSeq TrStaTrns }
  ata1.00: failed command: READ FPDMA QUEUED
  ata1.00: cmd 60/08:00:d8:b9:27/00:00:05:00:00/40 tag 0 ncq 4096 in
  ata1.00: status: { DRDY }
  ata1: hard resetting link
  ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
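
  The SErr value in the log above is a bit mask, and the names in the "SError: { ... }" line correspond to the bits that are set. The following Python sketch decodes such a mask. The bit positions are written down from the SATA SError register layout and should be double-checked against the libata sources before relying on them; decode_serror is an illustrative name.

```python
# SError bit positions and the short names printed in the kernel log.
# Assumption: layout as in the SATA SError register; verify before relying on it.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value: int) -> list:
    """Return the names of all bits set in an SError value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    # Value taken from the "exception Emask 0x52 ... SErr 0x1a80d00" line.
    print(" ".join(decode_serror(0x1A80D00)))
```

  For 0x1a80d00 this gives the same seven names as the SError line in the log, several of which (10B8B, BadCRC, LinkSeq) relate to the link itself rather than to the data on the drive.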

  Probably affects other Thinkpad models too:

  - http://www.sevenforums.com/bsod-help-support/309288-bsod-after-few-minutes-5-55-minutes-samsung-ssd-840-evo.html
  - [GERMAN] http://thinkpad-forum.de/threads/168841-Freezes-nach-Einbau-einer-neuen-SSD?p=1698795#post1698795
  - http://www.howtoeverything.net/linux/hardware/ubuntu-freeze-issue-after-ssd-upgrade

  4. Troubleshooting so far
  ------------------------------------------------
  4.1. Kernel boot parameters:

  libata.force=1:3.0G,2:1,5G libata.force=noncq

  -> not working. DRDY, hard resetting link.

  4.2. BIOS Mode Compat 2 (Compat, PCI power management OFF), no
  additional kernel boot parameters.

  The system ran with the standard generic kernel 3.13.x for over 12 hours of use without any problems.
  I did some suspend / resume cycles: no problems.

  5. Further testing:
  --------------------------------------------------------
  Testing scenarios will be applied in the following structure (if necessary):

  5.1. Testing the standard trusty kernel (linux-image-generic)
  5.1.1.  with all BIOS modes.
  5.1.2.  possibly with kernel boot parameters

  5.2. Testing the latest mainline kernel
  5.2.1.  with all BIOS modes.
  5.2.2.  possibly with kernel boot parameters
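
  The plan above amounts to a small matrix of kernels, BIOS modes and boot parameters. A Python sketch that enumerates it, using the BIOS mode names from section 2; the labels and the name build_matrix are only illustrative, and in practice the boot-parameter runs would only be done where necessary.

```python
from itertools import product

# Labels are illustrative; the BIOS mode names follow section 2 of this report.
KERNELS = ["linux-image-generic (trusty)", "latest mainline"]
BIOS_MODES = ["Standard 1", "Standard 2", "Compat 1", "Compat 2"]
BOOT_PARAMS = ["no extra boot parameters", "with kernel boot parameters"]


def build_matrix() -> list:
    """Return every (kernel, bios_mode, boot_params) combination."""
    return list(product(KERNELS, BIOS_MODES, BOOT_PARAMS))


if __name__ == "__main__":
    for n, (kernel, bios, params) in enumerate(build_matrix(), start=1):
        print(f"{n:2d}. {kernel} | BIOS Mode {bios} | {params}")
```

  That is 2 x 4 x 2 = 16 runs if every combination is tested.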

  This could take a few days, so be patient.

  Keeping fingers crossed ;-)

  P.S.: Many thanks to Christopher M. Penalver for his excellent
  assistance so far!

  ######################################################################

  Wed, 15 Jan 2014 18:00:00 +0100

  - installed mainline kernel 3.13.0-031300rc8-generic
  - BIOS:
        SATA: Compatibility,
        POWER: pci / pcie power management OFF (!)
  - no tlp

  No problems until the next day (switched POWER management for pci and
  pcie on)

  ######################################################################

  Thu, 16 Jan 2014 13:18:45 +0100

  - installed linux-image-generic -> 3.13.0-3-generic
  - BIOS:
        SATA: AHCI,
        POWER:  pci/pcie power management AUTOMATIC
  - Reboot with new standard kernel
  - no tlp yet!

  ######################################################################

  ProblemType: Bug
  DistroRelease: Ubuntu 14.04
  Package: linux-image-generic 3.13.0.3.6
  ProcVersionSignature: Ubuntu 3.13.0-3.18-generic 3.13.0-rc8
  Uname: Linux 3.13.0-3-generic x86_64
  NonfreeKernelModules: nvidia
  ApportVersion: 2.13.1-0ubuntu1
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/controlC1:  apos       3069 F.... pulseaudio
   /dev/snd/controlC0:  apos       3069 F.... pulseaudio
  CurrentDesktop: Unity
  Date: Thu Jan 16 12:35:13 2014
  HibernationDevice: RESUME=UUID=7290992b-11df-4d5c-a9bc-579dafe5eb61
  InstallationDate: Installed on 2014-01-08 (7 days ago)
  InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Alpha amd64 (20140105)
  MachineType: LENOVO 4391E46
  ProcFB:

  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.13.0-3-generic root=/dev/mapper/ubuntu--vg-root ro quiet splash
  RelatedPackageVersions:
   linux-restricted-modules-3.13.0-3-generic N/A
   linux-backports-modules-3.13.0-3-generic  N/A
   linux-firmware                            1.121
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 10/03/2012
  dmi.bios.vendor: LENOVO
  dmi.bios.version: 6NET84WW (1.45 )
  dmi.board.name: 4391E46
  dmi.board.vendor: LENOVO
  dmi.board.version: Not Available
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Not Available
  dmi.modalias: dmi:bvnLENOVO:bvr6NET84WW(1.45):bd10/03/2012:svnLENOVO:pn4391E46:pvrThinkPadW510:rvnLENOVO:rn4391E46:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
  dmi.product.name: 4391E46
  dmi.product.version: ThinkPad W510
  dmi.sys.vendor: LENOVO
