kernel-packages team mailing list archive
Message #84249
[Bug 1371591] Re: file not initialized to 0s under some conditions on VMWare
Hi Chris,
This is Arvind Kumar from VMware. Recently the issue discussed in this
bug was brought to VMware's attention. We looked at the patch
(https://lkml.org/lkml/2014/9/23/509) that was done to address the
issue. Since the patch is in the mptsas driver, it addresses the issue
only on the lsilogic controller; if the user uses some other
controller, e.g. pvscsi or buslogic, the issue remains. Moreover, the
patch disables WRITE SAME completely on lsilogic, which implies that
VMware will never be able to support WRITE SAME on lsilogic. As I
understand from the bug, it was concluded that WRITE SAME is not
properly implemented by VMware. In fact, we don't support WRITE SAME
at all.
We investigated the issue internally, and per our understanding it is
not VMware specific; rather, it appears to be in the kernel, and could
very well happen on real hardware too if the disk doesn't support the
WRITE SAME command. Below are the details of the investigation by
Petr Vandrovec.
--
In blk-lib.c, line 294 checks whether the bdev supports write_same.
With LVM, bdev here is dm-0. It says yes, write_same is supported, and
so write_same is invoked (note that the check is racy in case the
device loses write_same capability between the test and the moment the
bio is issued):
291 int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector,
292 sector_t nr_sects, gfp_t gfp_mask)
293 {
294 if (bdev_write_same(bdev)) {
295 unsigned char bdn[BDEVNAME_SIZE];
296
297 if (!blkdev_issue_write_same(bdev, sector, nr_sects, gfp_mask,
298 ZERO_PAGE(0)))
299 return 0;
300
301 bdevname(bdev, bdn);
302 pr_err("%s: WRITE SAME failed. Manually zeroing.\n", bdn);
303 }
304
305 return __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask);
306 }
307 EXPORT_SYMBOL(blkdev_issue_zeroout);
Then it gets to LVM, and LVM forwards the request to sda. When it
fails, the kernel clears bdev_write_same() on sda and returns -121
(EREMOTEIO).
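(For context on where that clearing happens: the SCSI disk driver
reacts to the target rejecting the command. The following is a
paraphrased sketch of the sd.c completion path from a kernel of
roughly this era, not the exact source; the helper name
rejected_as_unsupported() is illustrative.)
    /* sd_done(), paraphrased: the target rejected WRITE SAME with
     * ILLEGAL REQUEST, so mark the scsi_device as not supporting it.
     * sd_config_write_same() then zeroes max_write_same_sectors on
     * sda's queue; nothing updates the stacked dm-0 queue above it. */
    case ILLEGAL_REQUEST:
        if (rejected_as_unsupported(&sshdr)) { /* illustrative helper */
            sdkp->device->no_write_same = 1;
            sd_config_write_same(sdkp);
        }
        break;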
Now the next request comes. Nobody cleared bdev_write_same() on dm-0
(it was cleared only on sda), so the request gets to LVM, which
forwards it to sda. There it hits a snag in blk-core.c:
1824 if (bio->bi_rw & REQ_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) {
1825 err = -EOPNOTSUPP;
1826 goto end_io;
1827 }
bi_bdev here is sda, and the I/O fails with EOPNOTSUPP, without
WRITE_SAME ever being issued. Then it hits completion code that treats
EOPNOTSUPP as success:
18 static void bio_batch_end_io(struct bio *bio, int err)
19 {
20 struct bio_batch *bb = bio->bi_private;
21
22 if (err && (err != -EOPNOTSUPP))
23 clear_bit(BIO_UPTODATE, &bb->flags);
24 if (atomic_dec_and_test(&bb->done))
25 complete(bb->wait);
26 bio_put(bio);
27 }
So everybody outside of blkdev_issue_write_same() thinks the I/O
succeeded, while in reality the kernel never even issued the request!
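(For completeness, here is why the caller reports success. The tail of
blkdev_issue_write_same(), paraphrased from the same era of blk-lib.c,
consults only BIO_UPTODATE, which bio_batch_end_io() above
deliberately leaves set on -EOPNOTSUPP:)
    /* Paraphrased tail of blkdev_issue_write_same(): BIO_UPTODATE is
     * the only failure signal consulted, and it was never cleared, so
     * the function returns 0 and the caller believes the zeroing
     * succeeded. */
    if (!test_bit(BIO_UPTODATE, &bb.flags))
        ret = -ENOTSUPP;

    return ret;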
The fix should:
1. Use a different error code when a WRITE_SAME request is thrown
away, or remove the special EOPNOTSUPP handling from end_io. I assume
EOPNOTSUPP is meant to ignore failures from discarded commands, but in
that case nobody else should be using EOPNOTSUPP.
2. Propagate the WRITE_SAME failure from sda to dm-0.
A rough sketch of both is below.
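To make the two points concrete, here is a rough, untested sketch of
what they could look like; this is illustrative only and is not the
attached patch. The first hunk modifies the blk-core.c check quoted
above (-EREMOTEIO is one possible choice of error code); the second
assumes a dm-level hook (the function name disable_write_same() and
the use of dm_get_queue_limits() are assumptions about the dm core):
    /* (1) blk-core.c: fail a dropped WRITE_SAME bio with an error
     * that bio_batch_end_io() will not mistake for success. */
    if (bio->bi_rw & REQ_WRITE_SAME && !bdev_write_same(bio->bi_bdev)) {
        err = -EREMOTEIO; /* was -EOPNOTSUPP, which end_io ignores */
        goto end_io;
    }

    /* (2) drivers/md/dm.c: when a cloned WRITE_SAME bio fails on the
     * underlying device, clear the capability on the dm device too,
     * so bdev_write_same(dm-0) stops reporting support. */
    static void disable_write_same(struct mapped_device *md)
    {
        struct queue_limits *limits = dm_get_queue_limits(md);

        /* the underlying device cannot do WRITE SAME; disable it */
        limits->max_write_same_sectors = 0;
    }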
--
Our understanding is that we should revert the fix in the mptsas
driver and instead do the right fix as described above. I am attaching
the patch from Petr, who did the investigation, and CC'ing all
involved people from VMware. Could you please evaluate the patch and
suggest further steps?
Thanks!
Arvind
** Patch added: "0001-Do-not-silently-discard-WRITE_SAME-requests.patch"
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1371591/+attachment/4230953/+files/0001-Do-not-silently-discard-WRITE_SAME-requests.patch
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1371591
Title:
file not initialized to 0s under some conditions on VMWare
Status in “linux” package in Ubuntu:
Fix Released
Status in “linux” source package in Trusty:
Fix Committed
Bug description:
Under some conditions, after fallocate() the file is observed not to
be completely initialized to 0s: some 4KB pages have left-over data
from previous files that occupied those pages. Note that in addition
to causing functional problems for applications expecting files to be
initialized to 0s, this is a security issue, because it allows data to
"leak" from one file to another, bypassing file access controls.
The problem has been seen running under the following VMware-based virtual environments:
Fusion 6.0.2
ESXi 5.1.0
And under the following versions of Ubuntu:
Ubuntu 12.04, 3.11.0-26-generic
Ubuntu 14.04.1, 3.13.0-32-generic
Ubuntu 14.04.1, 3.13.0-35-generic
But did not reproduce under the following version:
Ubuntu 10.04, 2.6.32-38-server
The problem reproduced under LVM, but did not reproduce without LVM.
I reproduced the problem as follows under VMware Fusion:
* set up a custom VM with the default disk size (20 GB) and memory size (1 GB)
* attach the Ubuntu 14.04.1 ISO to the CDROM, set it as the boot device, boot up
* select all defaults during installation, _including_ LVM
* install gcc
* unpack the attached repro.tgz
* run repro.sh
What repro.sh does:
* fills the disk with a file containing bytes of 0xcc, then deletes it
* repeatedly runs the repro program, which creates two files and accesses them in a certain pattern
* checks the file f0 with hexdump; it should contain all 0s, but if pages 0x1000-0x7000 contain 0xcc, you have reproduced the problem
If the problem does not appear to reproduce, please try waiting a bit
and checking the f0 file with hexdump again. This behavior was
observed by a customer reproducing the problem under ESXi. I have
since added a sync after running the repro binary, which I think will
fix that.
If you still can't reproduce the problem, please let me know if
there's anything I can do to help. For example, can we trace the disk
accesses at the SCSI level to verify whether the appropriate SCSI
commands are being sent? This may help determine whether the problem
is in Linux or in VMware.
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1371591/+subscriptions