
ecryptfs-devel team mailing list archive

Re: [Patch 0/1] Add support for file names that are too long after encryption

 

Gah, just some updated thoughts, corrections, etc., so look at the previous
email for the full reply.


On 02/06/2011 07:43 AM, Dustin Kirkland wrote:
> On Sun, Feb 6, 2011 at 12:55 AM, John Johansen
> <john.johansen@xxxxxxxxxxxxx> wrote:
>> On 02/05/2011 08:13 PM, Dustin Kirkland wrote:
>>> On Tue, Feb 1, 2011 at 9:57 AM, John Johansen
>>> <john.johansen@xxxxxxxxxxxxx> wrote:
>>>> The following patch is a first pass at addressing the bug
>>>>  "file name too long when creating new file"
>>>>  https://bugs.launchpad.net/ecryptfs/+bug/344878

<<snip>>

>>> Ideally, we would use the header of the file to store this long
>>> filename, which would provide the great advantage of providing
>>> increased portability across filesystems.  Without requiring xattrs,
>>> it's trivial, for instance, to backup the lower encrypted files to a
>>> VFAT filesystem on a USB stick.
>>>
>> Right, and I am still looking at this.  It can be added without too
>> much effort, and the two methods can coexist, if so desired.
> 
> Oh, one other use case that comes to mind beyond USB sticks -- it's
> also somewhat common to back up data to CDRs and DVDRs, for which VFAT
> is pretty much required too, I think.
> 
>>> However, in such circumstances, VFAT does lose some metadata about the
>>> file, such as permissions.  Furthermore, I have come to understand the
>>> complexity of supporting elemental UNIX/Linux functionality such as
>>> hardlinks using the file header alone.  For these reasons, I will
>>> accept that your xattr approach is a reasonable solution to this
>>> problem, with the one very hard requirement that the short name you
>>> provide when the xattr is not available is both a) unique, and b) as
>>> absolutely descriptive as possible.
>>>
>> Right so currently we are just failing if we can't store the longname
>> in an xattr.  Its possible we could provide a very descriptive
>> shortname for vfat type failures, something along the lines of
>>  ECRYPTFS.SHORTENED.XXXX~md5
>> where XXX represent as much of the actual name as we can possibly fit
>> in the space provided.
> 
> It would be nice if you could provide us a few examples from your
> running code of what upper and lower filenames look like when they're
> too long.
> 
Currently, for the filename
abcdefghijklmnopqrstuvwxyz01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789.txt

it would get a short name of
ECRYPTFS_SHORTNAME.sLGvy74p.yUYzsRcMIdh3---

which would be encrypted and encoded for the underlying fs as
ECRYPTFS_FNEK_ENCRYPTED.FYYwryMXdOJKVUQZfN26kvrVp30Yif-XV8gAGMafbWFD2uwvXzuJyuKwSKBQYqTjFs445JHAx4upNGsTAAwVi.dfuvY03TrVLnYw

The proposed descriptive shortname would exploit the full space available
for a name, and would essentially provide two different names to access a
file by: the original long name and the special short name.  The original
name, if used on lookup, would get hashed to the shortname, so it would be
available to applications referencing it that way.  And it would also be
available for direct lookup via the short name.

Of course this breaks for anything trying to find the original name in the
directory listing, but it would work for programs doing a stat on the
original name.  I just don't have any idea what the failure rate for this
would be.

So that would look something like
ECRYPTFS_SHORTNAME.abcdefghijklmnopqrstuvwxyz01234567890123456789012345678901234567890123456789012345678901234567890.sLGvy74p.yUYzsRcMIdh3---

or if the encoding tried to save the trailing .txt
ECRYPTFS_SHORTNAME.abcdefghijklmnopqrstuvwxyz0123456789012345678901234567890123456789012345678901234567890123456.sLGvy74p.yUYzsRcMIdh3---.txt
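
To make the two layouts above concrete, here is a rough sketch of how such
a descriptive short name could be assembled.  It is not the code from the
patch: the hash tag below (md5, base64 encoded) is a hypothetical stand-in
for whatever the real short name uses, the function names are made up, and
it assumes ASCII names so that characters and bytes are the same thing.
The 143 byte limit is the one discussed further down in this mail.

```python
import base64
import hashlib

PREFIX = "ECRYPTFS_SHORTNAME."
MAX_PLAIN_NAME = 143  # longest name that still fits once encrypted and encoded


def short_tag(long_name: str) -> str:
    """Hypothetical hash tag; the real patch may derive this differently."""
    digest = hashlib.md5(long_name.encode("utf-8")).digest()
    # urlsafe alphabet avoids '/', which cannot appear in a file name
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def descriptive_short_name(long_name: str, keep_ext: bool = False) -> str:
    """Build PREFIX + as much of the name as fits + '.' + tag [+ extension]."""
    tag = short_tag(long_name)
    stem, ext = long_name, ""
    if keep_ext and "." in long_name:
        stem, dot, suffix = long_name.rpartition(".")
        ext = dot + suffix
    # Space left for the readable part after prefix, separator, tag, extension
    room = MAX_PLAIN_NAME - len(PREFIX) - 1 - len(tag) - len(ext)
    if room < 1:
        raise ValueError("no room left for a descriptive part")
    return f"{PREFIX}{stem[:room]}.{tag}{ext}"


def name_for_lookup(name: str) -> str:
    """Names that fit are used as-is; long names map to their short name."""
    if len(name) <= MAX_PLAIN_NAME:
        return name
    return descriptive_short_name(name)
```

Because the tag is derived only from the long name, a lookup by the
original long name always lands on the same short name, and the short name
itself is short enough to pass through name_for_lookup() unchanged, which
is what gives the two ways of reaching the file.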


I don't see such descriptive short name encoding as the solution, except
for file systems (or parts of them) that don't support another method.
That would leave us supporting two methods, with one being a fallback.

Well, in actual fact it could be more than that; we could do any combination
of the methods, e.g.
  Encode long names in headers for files, but fall back to xattrs or
  descriptive short names for directories and symlinks.

Supporting any combination of options is possible, but I think if we do
something like this we should limit the combinations so that we can control
the complexity and document it well.


<<snip>>

>>> As an aside (and feel free to break this off into a separate thread,
>>> if you like)...  Tyler and I have discussed several times shortening
>>> the long-and-clunky preamble on encrypted filenames.  This does eat
>>> ~23 out of 255, so roughly 10% of the total available file name
>>> length.  It would be pretty rare to have encrypted and non-encrypted
>>> files next to one another in the same directory, so this preamble
>>> seems unnecessarily long, to me.  Any thoughts about trimming this
>>> down some?  Tyler?  Mike?  John?
>>>
>> Biggest downside to trimming is losing some backwards compatibility, and only
>> gaining a few bytes for it.
> 
> Okay, thanks.  Yeah, not worth it.
> 
My use of backwards compatibility was wrong here; we wouldn't lose that.
We would lose compatibility with older versions of ecryptfs, but could still
support files with the older naming format in newer versions.  We could even
make the newer naming format a flag, so that its use would be optional and
both formats could live within the same mount, much like plain text pass
through.  So it's not a big deal, but it is an annoyance for dual boot and
recovery.

Now to what can be saved?
At most 24 bytes if the prefix (including trailing .) is eliminated entirely(1).
With the current prefix the longest name before requiring longname support is
143 bytes (252 bytes when encrypted and encoded).  So getting rid of the
prefix would enable 167 byte names.

Getting rid of or changing the prefix still requires some form of long name
support, so I see this as a separate issue.  However, consider that the
savings are an all-or-nothing situation.  If we did change the prefix, it
would need to be between 0 and 3 bytes long to see any savings, due to how
the encrypt and encode padding works (24 byte steps).
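
The "0 to 3 bytes" figure can be checked from the numbers above.  This is
only a back-of-the-envelope sketch: it assumes the lower file system allows
255 bytes per name and takes the 24 byte step and the 252 byte encoded
length from this mail at face value, rather than deriving them from the
actual encrypt and encode code.

```python
NAME_MAX = 255                       # assumed per-name limit of the lower fs
PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."  # current prefix, including trailing '.'
STEP = 24                            # encoded size step, as stated above
LONGEST_ENCODED = 252                # encoded length of a 143 byte name

prefix_len = len(PREFIX)                      # bytes the prefix costs today
body_len = LONGEST_ENCODED - prefix_len       # encoded part without prefix
next_body_len = body_len + STEP               # encoded part one step larger
max_useful_prefix = NAME_MAX - next_body_len  # room left for any new prefix

print(prefix_len, body_len, next_body_len, max_useful_prefix)
```

So a shorter prefix only helps if the next larger encoded name plus the
prefix still fits in the limit; anything longer than max_useful_prefix
saves nothing.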




