ecryptfs-devel team mailing list archive

Re: [Patch 0/1] Add support for file names that are too long after encryption

 

On 02/05/2011 08:13 PM, Dustin Kirkland wrote:
> On Tue, Feb 1, 2011 at 9:57 AM, John Johansen
> <john.johansen@xxxxxxxxxxxxx> wrote:
>> The following patch is a first pass at addressing the bug
>>  "file name too long when creating new file"
>>  https://bugs.launchpad.net/ecryptfs/+bug/344878
>>
>> Which occurs when a file is created with a file name that would be valid
>> before encrypting and encoding but after being encrypted and encoded is
>> too long for the underlying filesystem.
> 
> First and foremost, thank you, John, for tackling this ~2 year old
> problem.  It's something that we knew would be an issue when we
> embarked on encrypted (or, as I prefer, obfuscated) filenames.  We
> didn't really have any idea how big of a problem it might be.  I
> remember doing a deep find on all of my ~10 Ubuntu systems, and did
> not have any particular file that was >200 characters, nor path that
> was >2000 characters.  However, different strokes for different folks,
> and eventually people did start having issues.
> 
>> Overview:
>> To support file names that are too long when encrypted and encoded the patch
>> stores the long file name (longname) in an xattr on the file and creates a
>> "unique" short file name (shortname) which is stored in the underlying
>> filesystem.  The shortname is never seen when accessing files from the
>> ecryptfs view, but it is what will be found when accessing the lower
>> filesystem directly.
>>
>> While the patch currently uses xattrs it is possible to convert to storing
>> the longname in the ecryptfs file header (see below for some notes about
>> advantages and disadvantages), or even allow for both options.
> 
> So we've discussed this 1:1, but I'll just re-state here...
> 
> Ideally, we would use the header of the file to store this long
> filename, which would provide the great advantage of providing
> increased portability across filesystems.  Without requiring xattrs,
> it's trivial, for instance, to backup the lower encrypted files to a
> VFAT filesystem on a USB stick.
> 
Right, and I am still looking at this.  It can be added without too
much effort, and the two methods can coexist, if so desired.

> However, in such circumstances, VFAT does lose some metadata about the
> file, such as permissions.  Furthermore, I have come to understand the
> complexity of supporting elemental UNIX/Linux functionality such as
> hardlinks using the file header alone.  For these reasons, I will
> accept that your xattr approach is a reasonable solution to this
> problem, with the one very hard requirement that the short name you
> provide when the xattr is not available is both a) unique, and b) as
> absolutely descriptive as possible.
> 
Right, so currently we are just failing if we can't store the longname
in an xattr.  It's possible we could provide a very descriptive
shortname for vfat type failures, something along the lines of
  ECRYPTFS.SHORTENED.XXXX~md5
where XXXX represents as much of the actual name as we can possibly fit
in the space provided.
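
As a rough illustration of the ECRYPTFS.SHORTENED.XXXX~md5 idea, here is a
minimal user-space sketch in Python.  It is not code from the patch: the
function name, the 255-byte limit, and the choice of a hex md5 digest are
assumptions made for the example.

```python
import hashlib

PREFIX = "ECRYPTFS.SHORTENED."   # marker proposed in the message above
NAME_MAX = 255                   # assumed lower filesystem limit, in bytes


def descriptive_shortname(longname: str, name_max: int = NAME_MAX) -> str:
    """Build PREFIX + <as much of the name as fits> + '~' + md5 hex digest."""
    raw = longname.encode("utf-8")
    digest = hashlib.md5(raw).hexdigest()            # 32 hex characters
    room = name_max - len(PREFIX) - 1 - len(digest)  # 1 byte for the '~'
    if room < 0:
        raise ValueError("name_max too small for prefix and digest")
    # Truncate on bytes, then drop any partial UTF-8 sequence at the cut.
    kept = raw[:room].decode("utf-8", errors="ignore")
    return f"{PREFIX}{kept}~{digest}"
```

The digest covers the whole original name, so two long names that share
the same leading characters still map to different shortnames.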

Alternately we can continue looking at storing in the ecryptfs header
of a special ecryptfs "dentry" file.

>> Current State:
>> - Use xattrs to store longname on the file
>> - Detects xattr support at mount time
>> - Uses a mount flag for longname support
>>  - currently the mount flag is inverted.  Longname support is enabled
>>    by default and the flag is used to disable it.
>>  - current method is somewhat hacky in that it was assumed this
>>    would be inverted, back to requiring a flag but if not this can
>>    be cleaned up.
> 
> This is okay by me.  I can add some support code in ecryptfs-utils
> which would allow for this to be configured on a per-user basis in
> ~/.ecryptfs with some flag file, perhaps
> ~/.ecryptfs/disable-long-names.
> 
Right, either way works.  It's just a matter of choosing which is the
best for ecryptfs and moving forward.  I think either way will need
some support code.

>> - Currently the code does not have a Kconfig to disable at compile
>>  time.  Is this desired?
> 
> Not desired by me, but I think others may have dissenting opinions and
> valid reasons.
> 
>> - the longname xattr is stored in the trusted namespace using the
>>  trusted.ecryptfs. prefix
>> - the longname is encrypted using the same tag70 packet encoding as any
>>  other encrypted file name.  It is not encoded to reduce the size of the
>>  xattr.
>> - a file can have multiple longnames (hardlinks)
> 
> Cool.
> 
>> - each longname is stored as a single xattr name, value pair.
>>  - the xattr name is based off of the encrypted and encoded shortname
>>    without the ECRYPTFS_FNEK prefix
>>    eg.
>>       if the encrypted and encoded shortname is
>>          ECRYPTFS_FNEK_ENCRYPTED.FZYwryMXdKVUQZfN26kvrVp30Yif
>>       then the xattr name will be
>>          trusted.ecryptfs.FZYwryMXdKVUQZfN26kvrVp30Yif
> 
> Okay, that sounds fine to me.
> 
> As an aside (and feel free to break this off into a separate thread,
> if you like)...  Tyler and I have discussed several times shortening
> the long-and-clunky preamble on encrypted filenames.  This does eat
> ~23 out of 255, so roughly 10% of the total available file name
> length.  It would be pretty rare to have encrypted and non-encrypted
> files next to one another in the same directory, so this preamble
> seems unnecessarily long, to me.  Any thoughts about trimming this
> down some?  Tyler?  Mike?  John?
> 
Biggest downside to trimming is losing some backwards compatibility, and only
gaining a few bytes for it.

>>    + it would be possible to reduce the size of the xattr name if it was
>>      based on the unencrypted and unencoded shortname
>>  - the value contains the encrypted long filename
>> - if the expected longname is missing, the current code falls back to
>>  using the shortname.
> 
> Good.  I think we should really create some automated test cases
> around here, generating thousands of files, testing and reporting on
> the long names and shortnames, the mappings, etc.
> 
yeah I have been messing with this a bit, and need to improve the tests
I have, and kick them out.
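
For test purposes, the xattr naming and the fallback behaviour described
above can be modelled in a few lines of Python.  This is only a sketch: a
plain dict stands in for the xattrs on the lower inode, and both function
names are made up for the example.

```python
FNEK_PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."
XATTR_PREFIX = "trusted.ecryptfs."


def longname_xattr_name(lower_name: str) -> str:
    """Map an encrypted and encoded shortname to its longname xattr name."""
    if not lower_name.startswith(FNEK_PREFIX):
        raise ValueError("not an encrypted eCryptfs file name")
    return XATTR_PREFIX + lower_name[len(FNEK_PREFIX):]


def stored_name(lower_name: str, xattrs: dict) -> tuple:
    """Return (name_bytes, is_longname).

    If the expected longname xattr is missing, fall back to the shortname
    instead of failing, as the current code is described as doing.
    """
    value = xattrs.get(longname_xattr_name(lower_name))
    if value is None:
        return lower_name.encode("ascii"), False
    return value, True
```

A test harness could create many files, then compare what this model
predicts against the xattrs actually found on the lower filesystem.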

>>  + a mount option could be added to force failure instead of trying to
>>    gracefully fallback
>> + the patch extends the ecryptfs private dentry field with a longname flag
>>  that is used to indicate that the underlying dentry has a longname
>> - a unique shortname is used as a place holder for the long file name in
>>  the lower filesystem.
>>  + the current encoding of the shortname will most likely change at least some
> 
> How so?  Can you elaborate on this?
> 
Well, I think it will include something of the directory either before or in the
hash.  This fixes the name collision of two hardlinks having the same name but
being in different directories.  While this isn't a problem per se for the
obfuscated name, it is a problem for the shortname, as it prevents the second
hardlink from being created.

That said, we have to be careful with it and not just use the directory name,
as we don't want the names to break if the directory a file is in is renamed.
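
One way to picture "something of the directory in the hash" is the Python
sketch below.  It is purely illustrative: the thread does not say which
directory property would be used, so the `dir_id` parameter (for example an
inode number, which survives a rename of the directory) is an assumption.

```python
import hashlib


def shortname_digest(longname: bytes, dir_id: int) -> str:
    """md5 over a directory identifier plus the long name.

    dir_id is a hypothetical stable identifier for the parent directory.
    Using the directory *name* instead would break the mapping whenever
    the directory is renamed.
    """
    h = hashlib.md5()
    h.update(dir_id.to_bytes(8, "little"))  # fixed width keeps the input unambiguous
    h.update(longname)
    return h.hexdigest()
```

An inode number is only one candidate; it would not survive copying the
lower files to a different filesystem, so the choice needs more thought.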

>>  + the shortname generated is always the same for the same name, this
>>    leaks more information than it should and can result in collisions
>>    if the same name is used from different directories.
> 
> That's no different than we have now for all encrypted filenames.
> This is why I prefer to call this feature "file name obfuscation"
> rather than encryption.  The scheme for encrypting file contents is
> particularly strong in eCryptfs, with each individual file being
> encrypted with a unique, random key.  This is clearly not the case for
> filenames, and this is due entirely to the performance demands
> necessary.
> 
> In any case, this is in no way a blocker.  There's plenty of meta
> information about an encrypted file which is already available --
> permissions, ownerships, atimes, mtimes, ctimes.  The filename is
> merely an extension of this.  Filename obfuscation is merely a subtle
> layer of abstraction that makes the real filename simply non-obvious.
> I maintain that the real value of eCryptfs is providing strong
> security for the contents of each file at rest.
> 
right, the only issue for me here is the hardlink name collision mentioned
above.

>>  + the current shortname generation doesn't deal with potential collision
>>    between encrypted and encoded file names (this seems pretty unlikely),
>>    nor with name collisions of filenames that hash to the same md5 (again
>>    unlikely)
> 
> Yeah, no worries here, by me.  Someone would really have to try and
> cause collisions for this to be a problem.  This isn't a matter of
> accidentally touching the stove.  This is more like sticking your hand
> in a blender.  We don't recommend it.  No, really; don't.
> 
>>  - currently the shortname is created from combining the
>>    ECRYPTFS_FNEK_ENCRYPTED. prefix with the encoded md5 hash of the long
>>    file name.
>>    eg.
>>      ECRYPTFS_FNEK_ENCRYPTED.sdfjyo34n2lkh2lknlkafa--
>>  - the shortname is encrypted and encoded just like any other filename
>>  - both the shortname and the encrypted and encoded shortname must have
>>    the ECRYPTFS_FNEK_ENCRYPTED. prefix for a file name to be considered a valid
>>    shortname
>>  - This design allows for the shortname to "work" to some degree, with
>>    older versions of ecryptfs.  Name lookups based off of the long file
>>    name won't work but the shortname can be used so that files can
>>    be copied/moved without losing data.
> 
> Hmm, okay.  If I understand you correctly, I think I agree with this
> approach.  I will want to play with it a bit and see how it actually
> behaves in practice.  And we will want to establish some solid
> documentation around this.
> 
Yeah, we will want to document the hell out of it because, while it's nice
that it can work on an older version of ecryptfs, you risk "losing" the
longname information if you rename files.
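
For documentation purposes, the shortname construction quoted above
(prefix plus encoded md5 of the long name) can be sketched in Python as
follows.  The encoding here is a stand-in: URL-safe base64 with the padding
stripped is used only because it is filename-safe, and it is not claimed to
match the encoding ecryptfs itself uses.

```python
import base64
import hashlib

FNEK_PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."


def plain_shortname(longname: bytes) -> str:
    """Prefix + filename-safe encoding of md5(longname), before encryption."""
    digest = hashlib.md5(longname).digest()                  # 16 raw bytes
    encoded = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return FNEK_PREFIX + encoded


def looks_like_shortname(name: str) -> bool:
    """Cheap validity check: the name must carry the prefix."""
    return name.startswith(FNEK_PREFIX)
```

Because the result depends only on the long name, the same long name
always yields the same shortname, which is the information leak and
collision risk noted earlier in the thread.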

>> - only the symlink name can be given a long name currently.  The
>>  symlink target encryption hasn't changed.
>>  - this means symlinks don't use the shortname when being accessed
>>    by older versions of ecryptfs.  So even if the long name file
>>    they reference exists they won't resolve to a long name file.
>>    - it is possible to have the target use shortnames
>>  - it is possible to add support for long name targets that, after
>>    encrypting and encoding, are too long, by using short names and
>>    an extra xattr for the long target name on the symlink.
> 
> Yeah, this does sound desirable.  I'd think symlinks should be able to
> function by pointing to either a long name, or a short name, and that
> eCryptfs would correctly handle both.
> 
Hrmmm, yeah that would be easy enough to do.  I'll update for that.

>> = Supporting long file names =
>>
>> Since encrypting and encoding expand the length of the dentry, we need to
>> either cancel out the expansion or store the extra information for the
>> long name elsewhere.  This also necessitates putting a shorter
>> placeholder name as the name in the file system.
>>
>> Each method of dealing with long names has its own advantages and
>> disadvantages.
>>
>> == compression ==
>> Little gain, certainly not enough for all possible long file names.  Several
>> applications make random large file names, etc.  Would also have to cope
>> with language encoding etc.
> 
> Yeah, agreed.  This was the very first thing I thought about when this
> problem surfaced.  As I started looking at error reports, it became
> clear that the (annoying?) programs that would systematically create
> 200+ character filenames were often randomly generated.  For this
> reason, compression would never really guarantee us a working
> solution.
> 
>> == reducing ECRYPTFS_FNEK_ENCRYPTED prefix ==
>> Some gain in size, but loses any potential backwards compatibility.  Also
>> doesn't deal with expansion caused by encoding, nor the tag70 packet header
>> expanding the encrypted value.
> 
> Okay, strike the paragraph I wrote above asking about this ;-)  (I
> won't bother deleting it myself from this mail, as this response is
> quite stream-of-consciousness at this point).
> 
>> == long file names with xattrs ==
>>
>> Disadvantages
>> - requires lower file system to support xattrs
> 
> This is a bummer.  We need to handle this as gracefully as possible.
> I'm thinking something like the old W95 approach of filena~1.txt for
> fat32 -> fat16.  Obviously, we'd have somewhere around 200 characters
> to play with so hopefully that should suffice.  Beyond that, we just
> need to document the heck out of this.
> 
Hrmm, if you mean to encrypt and encode, it's less than that, closer to ~150
characters if I remember correctly.
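
The gap between 255 and roughly 150 can be seen with some back-of-envelope
arithmetic, sketched here in Python.  Two things are assumptions: that the
filename encoding emits 4 characters for every 3 input bytes, and the
`packet_overhead` parameter, which is a placeholder because the exact
tag70 header and padding sizes are not given in this thread.

```python
FNEK_PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."


def max_packet_bytes(name_max: int = 255) -> int:
    """Encrypted packet bytes that fit after paying for prefix and encoding."""
    encoded_room = name_max - len(FNEK_PREFIX)   # characters left for encoded data
    return (encoded_room // 4) * 3               # assumed 4 chars per 3 bytes


def max_plain_name(name_max: int = 255, packet_overhead: int = 0) -> int:
    """Upper bound on the plaintext name length for a given packet overhead."""
    return max(0, max_packet_bytes(name_max) - packet_overhead)
```

With no packet overhead at all the bound is 171 bytes, so any realistic
header and padding cost brings the usable length down from there.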

>> - long file name information can be lost by copies, tarring, backups, etc
>>  made on the lower file system that are unaware of xattrs
> 
> Again, documentation will be required.
> 
> I just checked the manpage of tar in Ubuntu and was surprised that we
> don't have the xattr support that RHEL patches in.  We might need to
> consider pulling that into Ubuntu?   I also couldn't find a star
> package.  This is something I'll need to chase down.  We will want to
> make sure that Ubuntu (and other distros) have some method for
> archiving files which supports xattrs, and we'll want to make it
> perfectly clear that that's what eCryptfs recommends for backups.
> 
yes, /me is puzzled that this hasn't been done yet.

>> - xattrs can be manipulated directly through the lower file system
> 
> Hmm...  So can filenames, permissions, ownerships, and timestamps.  I
> guess I'm not clear on the disadvantage here...
> 
Well, it's just a little easier to mess with the filename than when it's
stored in the file, but you're right, going underneath for either lets
you screw with everything, so not much of a disadvantage.

>> Advantages
>> - supports multiple names with space only limited by xattrs limits
>>
>> - no extra code to manage name value pairs, if multiple long names are
>>  to be supported.
>>
>> - provides for partial backwards compatibility
>>
>>  The ecryptfs header doesn't need to be modified, so previous versions
>>  can still read/write the file data.  However versions that don't support
>>  long names via xattrs, will see the short name, and will not update
>>  the long name xattrs.
> 
> This is very important to me.  Thanks.
> 
>> - allows for long directory and symlink names
> 
> Oh good.
> 
This and the above point were actually the reason for me to choose xattrs
(at least for the first pass).

>> - can allow for long symlink targets
>>  If the encoded symlink target is too long an extra xattr containing the
>>  target can be stored, and a short name style encoding can be performed
>>  on the symlink target data.
> 
> Very nice.
> 
This isn't currently done but shouldn't be hard to add.  With symlink targets
generally having more space than for just a name (think using ../) I am not
sure it's worth adding.  But maybe that is just my bias.

>> == long file names in the ecryptfs header ==
>>
>> Disadvantages
>> - the space to store long file names is more limited than with xattrs.
>>
>>  In practice this shouldn't be a problem as just supporting a single long
>>  file name would cover the majority of use cases, if multiple shorter
>>  name links are allowed.
>>
>>  Even when storing multiple long names, being able to store 2 or 3
>>  should cover almost all use cases.
>>
>> - requires extra code to manage name value pairs, if multiple long names
>>  are to be supported.
>>
>>  This is just a matter of code. Xattrs provide support for name value pairs,
>>  and supporting multiple long file names in the ecryptfs header would
>>  require creating some additional code.
>>
>>  If however only a single long file name is supported then there is
>>  no extra code required.  Though storing the long name as a name value
>>  pair is still advisable as it will allow catching rename operations
>>  that are done on the lower filesystem and leave the stored long name
>>  out of date.
> 
> Okay, well, all things being equal, I would prefer seeing all of this
> solved in the header itself, but I can see that it's non trivial.
> 
Honestly, if this were the only issue with using the header I would do it in a
heartbeat.

> I understand that you've sent this design information to the
> ecryptfs-devel@ list first, for initial feedback.  I think when you
> send this to the Linux Filesystem list, you'll probably get much more
> expert feedback on these issues.
> 
>> - is not backwards compatible.
>>
>>  Storing long file names in xattrs allows for some degraded backwards
>>  compatibility with older versions of ecryptfs.  But storing long names in
>>  the ecryptfs header will prevent older versions from being able to
>>  access the stored data.
>>
>>  How important is this?  Not very, while being able to access the data
>>  with an older version may be nice for data recovery it also risks
>>  losing the specially stored longer names.
> 
> Agreed.
> 
>> - requires header to be updated, for renames or hardlinks with long names
>>
>>  This is mostly a non issue.  It may even be faster than storing an xattr.
>>
>> - can not be used for directories, symlinks
>>
>>  Storing the long file name in the ecryptfs header will only work for
>>  encrypted files, it won't work for directories or symlinks as they
>>  don't have a header.
> 
> Dang.  Okay, yeah, that's a dealbreaker, and big +1 for xattrs, IMO.
> 
>> - can not work around the symlink target being too long.
>>
>>  This is fs dependent but if the name for the symlink target is too long
>>  after encrypting and encoding, creation of the symlink may fail, and
>>  since symlinks have no header there is no place to store the extra
>>  information.
>>
>> Advantages
>> - the lower filesystem does not require xattr support
>>
>> - long name information will not be lost by copies, tarring, or backups
>>  made on the lower file system that don't store xattrs.
>>
>> == special dentry file ==
>>
>> XAttrs Notes
>> - requires fs have xattr support
>> - 4 namespaces (security, system, trusted, user)
>>  - security: used by smack/selinux not appropriate to use
>>  - system: is tied to acls for some filesystems, so affected by mount flags
>>  - user: can't be trusted, can't be set on symlinks, device files
>>  - trusted: need cap_sys_admin to see/set
>> - not the same space restrictions as ecryptfs header, can use multiple xattrs
>> - xattr can be encrypted separately from the file, so an error in name
>>  encryption leaks the name instead of data.  Does this matter if relying
>>  on current encryption?
>> - having longname xattr leaks that the file has a longname
>>  - is this any worse than what directory walking would leak
> 
> Non-issue, as I stated above.  Filenames, like other meta data, are
> merely obfuscated, and not encrypted.
> 
>> - use trusted.ecryptfs.<name>
> 
> Hmm, I guess I'm confused about the statement above, requiring
> cap_sys_admin ...  Will every user have to have cap_sys_admin to use
> these long filenames?
> 
No, ecryptfs gets around the regular permission checks by calling
the underlying filesystem xattr routine directly.  This could be an issue
except it only happens after the file checks so the user has already been
validated to the file, and we are storing "system" information in the
trusted xattr.

For a user to be able to see or manipulate the trusted xattr directly they
will require cap_sys_admin.

There is some potential issue with this (that just came to me), this may
prevent users from backing up the xattrs, as they can't see or read them :(
This is problematic as the other xattrs aren't really suited.
  security - is not the right place to stick this information
  system - has been too tightly tied to acls, in most fs implementations,
           and may not even be available without a mount flag
  user - would work except it's not available without a mount flag, which
         hasn't been on by default in Ubuntu

I'll work on getting the next revision up and try to get a post to fs devel
out early next week.


