Message #36927
[Bug 1756315] Re: fstrim and discard operations take too long to complete - Ubuntu 16.04
Hello Alexandre,
I tried to reproduce this bug, and I believe it has been fixed.
I started an i3.4xlarge instance on AWS, with Xenial as the distro:
$ uname -rv
4.4.0-1112-aws #124-Ubuntu SMP Fri Jul 24 11:10:25 UTC 2020
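To confirm the exact kernel package behind that version string (a routine check, not part of the transcript above):
$ dpkg -l 'linux-image-*-aws' | grep ^ii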
From there, I checked the NVMe disks:
$ lsblk
NAME      MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda      202:0    0    8G  0 disk
`-xvda1   202:1    0    8G  0 part /
nvme0n1   259:0    0  1.7T  0 disk
nvme1n1   259:1    0  1.7T  0 disk
I made a RAID 0 array:
$ sudo mdadm --create --verbose --level=0 /dev/md0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
mdadm: chunk size defaults to 512K
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started
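Before formatting, it is worth confirming that the array assembled with both members (a standard sanity check, not shown in the transcript):
$ cat /proc/mdstat
$ sudo mdadm --detail /dev/md0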
And formatted it:
$ time sudo mkfs.xfs /dev/md0
meta-data=/dev/md0               isize=512    agcount=32, agsize=28989568 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0
data     =                       bsize=4096   blocks=927666176, imaxpct=5
         =                       sunit=128    swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=452968, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
real 0m24.414s
user 0m0.000s
sys 0m7.664s
I also tested fstrim:
$ sudo mkdir /mnt/disk
$ sudo mount /dev/md0 /mnt/disk
$ sudo fstrim /mnt/disk
$ time sudo fstrim /mnt/disk
real 0m22.083s
user 0m0.000s
sys 0m7.560s
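For completeness, discard support at each layer can be verified with lsblk (not shown above); a nonzero DISC-GRAN/DISC-MAX on md0 means the md layer is passing discards through to the NVMe devices:
$ lsblk --discard /dev/md0 /dev/nvme0n1 /dev/nvme1n1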
Things seem okay; this appears to have been fixed somewhere between 4.4.0-1052-aws
and 4.4.0-1112-aws.
I'm going to mark this bug as Fix Released, but let me know if you still have
problems.
** Also affects: linux (Ubuntu)
Importance: Undecided
Status: New
** No longer affects: linux-aws (Ubuntu)
** Also affects: linux (Ubuntu Xenial)
Importance: Undecided
Status: New
** Changed in: linux (Ubuntu)
Status: New => Fix Released
** Changed in: linux (Ubuntu Xenial)
Status: New => Fix Released
--
You received this bug notification because you are a member of नेपाली
भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1756315
Title:
fstrim and discard operations take too long to complete - Ubuntu 16.04
Status in linux package in Ubuntu:
Fix Released
Status in linux source package in Xenial:
Fix Released
Bug description:
1) Ubuntu Release: Ubuntu 16.04 LTS
2) linux-image-4.4.0.1052.54-aws
3) mkfs.xfs and fstrim -v on a raid0 array using nvme and md should
not take more than a few seconds to complete.
4) Formatting the raid0 array with xfs took around 2 hours. Running
fstrim -v on the mount point mounted on top of the raid array also took
around 2 hours.
How to reproduce the issue:
- Launch an i3.4xlarge instance on Amazon AWS using an Ubuntu 16.04 AMI (ami-78d2be01 on EU-WEST-1). This creates an instance with one 8GB EBS root volume and two 1.9T SSD drives that are presented to the instance through the nvme driver.
- Compose a raid0 array with the following command :
# mdadm --create --verbose --level=0 /dev/md0 --raid-devices=2
/dev/nvme0n1 /dev/nvme1n1
- Formatting the raid0 array ( /dev/md0 ) with xfs takes around 2
hours to complete. I tried other AMIs like RHEL7, CentOS7, and Ubuntu
18.04, and the time needed was around 2 seconds.
root@ip-172-31-30-133:~# time mkfs.xfs /dev/md0
real 120m45.725s
user 0m0.000s
sys 0m18.248s
- Running fstrim -v on a filesystem mounted on top of /dev/md0 can
take around 2 hours to complete. With other AMIs like RHEL7, CentOS7,
and Ubuntu 18.04, the time needed was around 2 seconds.
- When I try the same with either of the nvme SSD devices alone, say
/dev/nvme0n1, the issue doesn't happen (see the reconstruction just
below).
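For reference, the single-device control test was presumably along
these lines (my reconstruction, not the reporter's exact commands; the
mount point is illustrative):
# time mkfs.xfs -f /dev/nvme0n1
# mount /dev/nvme0n1 /mnt
# time fstrim -v /mnt
# umount /mnt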
- I tried to replicate this issue using LVM with striping; fstrim and
mkfs.xfs complete without taking hours:
root@ip-172-31-27-69:~# pvcreate /dev/nvme0n1
Physical volume "/dev/nvme0n1" successfully created
root@ip-172-31-27-69:~# pvcreate /dev/nvme1n1
Physical volume "/dev/nvme1n1" successfully created
root@ip-172-31-27-69:~# vgcreate raid0 /dev/nvme0n1 /dev/nvme1n1
Volume group "raid0" successfully created
root@ip-172-31-27-69:~# lvcreate --type striped --stripes 2 --extents 100%FREE raid0 /dev/nvme0n1 /dev/nvme1n1
Using default stripesize 64.00 KiB.
Logical volume "lvol0" created.
root@ip-172-31-27-69:~# vgchange -ay
1 logical volume(s) in volume group "raid0" now active
root@ip-172-31-27-69:~# lvchange -ay /dev/raid0/lvol0
root@ip-172-31-27-69:~# lvs -a /dev/raid0/lvol0
LV    VG    Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
lvol0 raid0 -wi-a----- 3.46t
root@ip-172-31-27-69:~# time mkfs.xfs /dev/raid0/lvol0
meta-data=/dev/raid0/lvol0       isize=512    agcount=32, agsize=28991664 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0
data     =                       bsize=4096   blocks=927733248, imaxpct=5
         =                       sunit=16     swidth=32 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=453008, version=2
         =                       sectsz=512   sunit=16 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
real 0m2.926s
user 0m0.180s
sys 0m0.000s
root@ip-172-31-27-69:~# mount /dev/raid0/lvol0 /mnt
root@ip-172-31-27-69:~# time fstrim -v /mnt
/mnt: 3.5 TiB (3798138650624 bytes) trimmed
real 0m1.794s
user 0m0.000s
sys 0m0.000s
So the issue only happens when using nvme and md to compose the raid0
array.
Below is some information that may be useful. I started formatting the
md array with mkfs.xfs; the process looks hung:
root@ip-172-31-24-66:~# ps aux | grep -i mkfs.xfs
root 1693 12.0 0.0 12728 968 pts/1 D+ 07:54 0:03 mkfs.xfs /dev/md0
PID 1693 is in uninterruptible sleep (D).
Looking at /proc/1693/stack:
root@ip-172-31-24-66:~# cat /proc/1693/stack
[<ffffffff8134d8c2>] blkdev_issue_discard+0x232/0x2a0
[<ffffffff813524bd>] blkdev_ioctl+0x61d/0x7d0
[<ffffffff811ff6f1>] block_ioctl+0x41/0x50
[<ffffffff811d89b3>] do_vfs_ioctl+0x2e3/0x4d0
[<ffffffff811d8c21>] SyS_ioctl+0x81/0xa0
[<ffffffff81748030>] system_call_fastpath+0x1a/0x1f
[<ffffffffffffffff>] 0xffffffffffffffff
Looking at the stack, the process appears to be hung on a discard operation:
root@ip-172-31-24-66:~# ps -flp 1693
F S UID    PID  PPID C PRI NI ADDR SZ   WCHAN  STIME TTY   TIME     CMD
4 D root  1693  1682 2  80  0 -    3182 blkdev 07:54 pts/1 00:00:03 mkfs.xfs /dev/md0
root@ip-172-31-24-66:~# cat /proc/1693/wchan
blkdev_issue_discard
The process is stuck in the function blkdev_issue_discard.
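As a general technique (not from the original report), every process in uninterruptible sleep and its kernel wait channel can be listed in one pass:
# ps -eo pid,stat,wchan:32,comm | awk '$2 ~ /D/'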
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1756315/+subscriptions