ecryptfs-devel team mailing list archive
Message #00134
Re: [Patch 0/1] Add support for file names that are too long after encryption
On Tue, Feb 1, 2011 at 9:57 AM, John Johansen
<john.johansen@xxxxxxxxxxxxx> wrote:
> The following patch is a first pass at addressing the bug
> "file name too long when creating new file"
> https://bugs.launchpad.net/ecryptfs/+bug/344878
>
> Which occurs when a file is created with a file name that would be valid
> before encrypting and encoding but after being encrypted and encoded is
> too long for the underlying filesystem.
First and foremost, thank you, John, for tackling this ~2 year old
problem. It's something that we knew would be an issue when we
embarked on encrypted (or, as I prefer, obfuscated) filenames. We
didn't really have any idea how big of a problem it might be. I
remember doing a deep find on all of my ~10 Ubuntu systems, and did
not have any particular file that was >200 characters, nor path that
was >2000 characters. However, different strokes for different folks,
and eventually people did start having issues.
> Overview:
> To support file names that are too long when encrypted and encoded the patch
> stores the long file name (longname) in an xattr on the file and creates a
> "unique" short file name (shortname) which is stored in the underlying
> filesystem. The shortname is never seen when accessing files from the
> ecryptfs view, but it is what will be found when accessing the lower
> filesystem directly.
>
> While the patch currently uses xattrs it is possible to convert to storing
> the longname in the ecryptfs file header (see below for some notes about
> advantages and disadvantages), or even allow for both options.
So we've discussed this 1:1, but I'll just re-state here...
Ideally, we would use the header of the file to store this long
filename, which would provide the great advantage of providing
increased portability across filesystems. Without requiring xattrs,
it's trivial, for instance, to backup the lower encrypted files to a
VFAT filesystem on a USB stick.
However, in such circumstances, VFAT does lose some metadata about the
file, such as permissions. Furthermore, I have come to understand the
complexity of supporting elemental UNIX/Linux functionality such as
hardlinks using the file header alone. For these reasons, I will
accept that your xattr approach is a reasonable solution to this
problem, with the one very hard requirement that the short name you
provide when the xattr is not available is both a) unique, and b) as
absolutely descriptive as possible.
> Current State:
> - Use xattrs to store longname on the file
> - Detects xattr support at mount time
> - Uses a mount flag for longname support
> - currently the mount flag is inverted. Longname support is enabled
> by default and the flag is used to disable it.
> - current method is somewhat hacky in that it was assumed this
> would be inverted, back to requiring a flag but if not this can
> be cleaned up.
This is okay by me. I can add some support code in ecryptfs-utils
which would allow for this to be configured on a per-user basis in
~/.ecryptfs with some flag file, perhaps
~/.ecryptfs/disable-long-names.
> - Currently the code does not have a Kconfig option to disable this at compile
> time. Is this desired?
Not desired by me, but I think others may have dissenting opinions and
valid reasons.
> - the longname xattr is stored in the trusted namespace using the
> trusted.ecryptfs. prefix
> - the longname is encrypted using the same tag70 packet encoding as any
> other encrypted file name. It is not encoded to reduce the size of the
> xattr.
> - a file can have multiple longnames (hardlinks)
Cool.
> - each longname is stored as a single xattr name, value pair.
> - the xattr name is based off of the encrypted and encoded shortname
> without the ECRYPTFS_FNEK prefix
> eg.
> if the encrypted and encoded shortname is
> ECRYPTFS_FNEK_ENCRYPTED.FZYwryMXdKVUQZfN26kvrVp30Yif
> then the xattr name will be
> trusted.ecryptfs.FZYwryMXdKVUQZfN26kvrVp30Yif
Okay, that sounds fine to me.
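To make the naming rule concrete, here is a minimal sketch of the mapping as I read it from the description above. It is not code from the patch; the function name is hypothetical, and the two prefixes are copied from the example in this message.

```python
# Prefixes as they appear in the example above.
FNEK_PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."
XATTR_PREFIX = "trusted.ecryptfs."


def longname_xattr_name(lower_name: str) -> str:
    """Map an encrypted-and-encoded shortname to its longname xattr name.

    Hypothetical helper: strips the FNEK prefix and prepends the
    trusted.ecryptfs. namespace prefix.
    """
    if not lower_name.startswith(FNEK_PREFIX):
        raise ValueError("not an encrypted-and-encoded file name")
    return XATTR_PREFIX + lower_name[len(FNEK_PREFIX):]


# prints trusted.ecryptfs.FZYwryMXdKVUQZfN26kvrVp30Yif
print(longname_xattr_name("ECRYPTFS_FNEK_ENCRYPTED.FZYwryMXdKVUQZfN26kvrVp30Yif"))
```

Dropping the prefix from the xattr name keeps each name a little shorter, which matters if a file carries several longnames.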
As an aside (and feel free to break this off into a separate thread,
if you like)... Tyler and I have discussed several times shortening
the long-and-clunky preamble on encrypted filenames. This does eat
~23 out of 255, so roughly 10% of the total available file name
length. It would be pretty rare to have encrypted and non-encrypted
files next to one another in the same directory, so this preamble
seems unnecessarily long, to me. Any thoughts about trimming this
down some? Tyler? Mike? John?
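For reference, the arithmetic behind that estimate can be checked quickly. This is only a back-of-the-envelope sketch: the prefix string is the one shown in the example above, and the 255 limit is the figure I quoted, assumed to be the lower filesystem's maximum name length.

```python
# Prefix as shown in the shortname example in this thread.
PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."
NAME_MAX = 255  # assumed name-length limit of the lower filesystem

prefix_len = len(PREFIX)          # includes the trailing dot
share = prefix_len / NAME_MAX     # fraction of the name consumed by the prefix

print(f"prefix uses {prefix_len} of {NAME_MAX} characters ({share:.1%})")
```

Counting the trailing dot, the prefix comes to 24 characters, a little under a tenth of the available name length.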
> + it would be possible to reduce the size of the xattr name if it was
> based on the unencrypted and unencoded shortname
> - the value contains the encrypted long filename
> - if the expected longname is missing, the current code falls back to
> using the shortname.
Good. I think we should really create some automated test cases
around here, generating thousands of files, testing and reporting on
the long names and shortnames, the mappings, etc.
> + a mount option could be added to force failure instead of trying to
> gracefully fallback
> + the patch extends the ecryptfs private dentry field with a longname flag
> that is used to indicate that the underlying dentry has a longname
> - a unique shortname is used as a place holder for the long file name in
> the lower filesystem.
> + the current encoding of the shortname will most likely change at least somewhat
How so? Can you elaborate on this?
> + the shortname generated is always the same for the same name, this
> leaks more information than it should and can result in collisions
> if the same name is used from different directories.
That's no different than we have now for all encrypted filenames.
This is why I prefer to call this feature "file name obfuscation"
rather than encryption. The scheme for encrypting file contents is
particularly strong in eCryptfs, with each individual file being
encrypted with a unique, random key. This is clearly not the case for
filenames, and this is due entirely to the performance demands
necessary.
In any case, this is in no way a blocker. There's plenty of meta
information about an encrypted file which is already available --
permissions, ownerships, atimes, mtimes, ctimes. The filename is
merely an extension of this. Filename obfuscation is merely a subtle
layer of abstraction that makes the real filename simply non-obvious.
I maintain that the real value of eCryptfs is providing strong
security for the contents of each file at rest.
> + the current shortname generation doesn't deal with potential collision
> between encrypted and encoded file names (this seems pretty unlikely),
> nor with name collisions of filenames that hash to the same md5 (again
> unlikely)
Yeah, no worries here, by me. Someone would really have to try and
cause collisions for this to be a problem. This isn't a matter of
accidentally touching the stove. This is more like sticking your hand
in a blender. We don't recommend it. No, really; don't.
> - currently the shortname is created from combining the
> ECRYPTFS_FNEK_ENCRYPTED. prefix with the encoded md5 hash of the long
> file name.
> eg.
> ECRYPTFS_FNEK_ENCRYPTED.sdfjyo34n2lkh2lknlkafa--
> - the shortname is encrypted and encoded just like any other filename
> - both the shortname and the encrypted and encoded shortname must have
> the ECRYPTFS_FNEK_ENCRYPTED. for a file name to be considered a valid
> shortname
> - This design allows for the shortname to "work" to some degree, with
> older versions of ecryptfs. Name lookups based off of the long file
> name won't work but the shortname can be used so that files can
> be copied/moved without losing data.
Hmm, okay. If I understand you correctly, I think I agree with this
approach. I will want to play with it a bit and see how it actually
behaves in practice. And we will want to establish some solid
documentation around this.
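As a starting point for that experimentation, here is a rough sketch of how a shortname of this shape could be generated before it is encrypted and encoded. It is an illustration under stated assumptions, not the patch's code: the function name is hypothetical, and URL-safe base64 is used as a stand-in for eCryptfs's own filename-safe encoding, so the resulting characters will not match what the kernel produces.

```python
import base64
import hashlib

FNEK_PREFIX = "ECRYPTFS_FNEK_ENCRYPTED."


def make_shortname(longname: str) -> str:
    """Build a placeholder shortname from the md5 of the long file name.

    Hypothetical helper. The encoding below is a stand-in (URL-safe
    base64 with padding removed), not the eCryptfs encoding.
    """
    digest = hashlib.md5(longname.encode("utf-8")).digest()  # 16 bytes
    encoded = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return FNEK_PREFIX + encoded


long_name = "x" * 240 + ".txt"
print(make_shortname(long_name))
# The same long name always yields the same shortname, which is the
# determinism (and information leak) John describes above.
print(make_shortname(long_name) == make_shortname(long_name))
```

Because the hash has a fixed size, the shortname has a fixed length no matter how long the original name is, which is what lets it fit in the lower filesystem.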
> - only the symlink name can be given a long name currently. The
> symlink target encryption hasn't changed.
> - this means symlinks don't use the shortname when being accessed
> by older versions of ecryptfs. So even if the long name file
> they reference exists they won't resolve to a long name file.
> - it is possible to have the target to use shortnames
> - it is possible to add support for long name targets, that after
> encrypting and encoding are too long. By using short names and
> an extra xattr for the long target name on the symlink.
Yeah, this does sound desirable. I'd think symlinks should be able to
function by pointing to either a long name, or a short name, and that
eCryptfs would correctly handle both.
> = Supporting long file names =
>
> Since encrypting and encoding expand the length of the dentry, we need to
> either cancel out the expansion or store the extra information for the
> long name elsewhere. This also necessitates putting a shorter place
> holder name as the name in the file system.
>
> Each method of dealing with long names has its own advantages and
> disadvantages.
>
> == compression ==
> Little gain, certainly not enough for all possible long file names. Several
> applications make random large file names, etc. Would also have to cope
> with language encoding etc.
Yeah, agreed. This was the very first thing I thought about when this
problem surfaced. As I started looking at error reports, it became
clear that the (annoying?) programs that would systematically create
200+ character filenames were often randomly generated. For this
reason, compression would never really guarantee us a working
solution.
> == reducing ECRYPTFS_FNEK_ENCRYPTED prefix ==
> Some gain in size, but loses any potential backwards compatibility. Also
> doesn't deal with expansion caused by encoding, nor the tag70 packet header
> expanding the encrypted value.
Okay, strike the paragraph I wrote above asking about this ;-) (I
won't bother deleting it myself from this mail, as this response is
quite stream-of-consciousness at this point).
> == long file names with xattrs ==
>
> Disadvantages
> - requires lower file system to support xattrs
This is a bummer. We need to handle this as gracefully as possible.
I'm thinking something like the old W95 approach of filena~1.txt for
fat32 -> fat16. Obviously, we'd have somewhere around 200 characters
to play with so hopefully that should suffice. Beyond that, we just
need to document the heck out of this.
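To illustrate the kind of fallback I have in mind, here is a minimal sketch of a W95-style truncation that keeps as much of the original name as fits and appends a ~N counter for uniqueness. Everything here is hypothetical (function name, the `taken` set standing in for a directory listing, the default limit of 200), and it counts characters rather than bytes, which a real implementation could not assume.

```python
def descriptive_shortname(longname: str, taken: set, limit: int = 200) -> str:
    """Return a unique, truncated name of at most `limit` characters.

    Hypothetical helper: keeps the start of the name and its extension,
    and inserts ~N before the extension, like filena~1.txt.
    """
    stem, dot, ext = longname.rpartition(".")
    if not stem:  # no extension, or a name that starts with a dot
        stem, dot, ext = longname, "", ""
    n = 1
    while True:
        tail = f"~{n}{dot}{ext}"
        keep = limit - len(tail)  # characters of the stem that still fit
        if keep < 1:
            raise ValueError("limit too small for this name")
        candidate = stem[:keep] + tail
        if candidate not in taken:
            taken.add(candidate)
            return candidate
        n += 1


names_in_dir = set()
print(descriptive_shortname("filename.txt", names_in_dir, limit=12))  # filena~1.txt
print(descriptive_shortname("filename.txt", names_in_dir, limit=12))  # filena~2.txt
```

With roughly 200 characters available, most of a long name would survive truncation, so the result stays descriptive while the counter keeps it unique within a directory.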
> - long file name information can be lost by copies, tarring, backups, etc
> made on the lower file system that are unaware of xattrs
Again, documentation will be required.
I just checked the manpage of tar in Ubuntu and was surprised that we
don't have the xattr support that RHEL patches in. We might need to
consider pulling that into Ubuntu? I also couldn't find a star
package. This is something I'll need to chase down. We will want to
make sure that Ubuntu (and other distros) have some method for
archiving files which supports xattrs, and we'll want to make it
perfectly clear that that's what eCryptfs recommends for backups.
> - xattrs can be manipulated directly through the lower file system
Hmm... So can filenames, permissions, ownerships, and timestamps. I
guess I'm not clear on the disadvantage here...
> Advantages
> - supports multiple names with space only limited by xattrs limits
>
> - no extra code to manage name value pairs, if multiple long names are
> to be supported.
>
> - provides for partial backwards compatibility
>
> The ecryptfs header doesn't need to be modified, so previous versions
> can still read/write the file data. However versions that don't support
> long names via xattrs, will see the short name, and will not update
> the long name xattrs.
This is very important to me. Thanks.
> - allows for long directory and symlink names
Oh good.
> - can allow for long symlink targets
> If the encoded symlink target is too long, an extra xattr containing the
> target can be stored, and a short name style encoding can be performed
> on the symlink target data.
Very nice.
> == long file names in the ecryptfs header ==
>
> Disadvantages
> - the space to store long file names is more limited than with xattrs.
>
> In practice this shouldn't be a problem as just supporting a single long
> file name would cover the majority of use cases, if multiple shorter
> name links are allowed.
>
> Even when storing multiple long names, being able to store 2 or 3
> should cover almost all use cases.
>
> - requires extra code to manage name value pairs, if multiple long names
> are to be supported.
>
> This is just a matter of code. Xattrs provide support for name value pairs,
> and supporting multiple long file names in the ecryptfs header would
> require creating some additional code.
>
> If however only a single long file name is supported then there is
> no extra code required. Though storing the long name as a name value
> pair is still advisable as it will allow catching rename operations
> that are done on the lower filesystem and leave the stored long name
> out of date.
Okay, well, all things being equal, I would prefer seeing all of this
solved in the header itself, but I can see that it's non trivial.
I understand that you've sent this design information to the
ecryptfs-devel@ list first, for initial feedback. I think when you
send this to the Linux Filesystem list, you'll probably get much more
expert feedback on these issues.
> - is not backwards compatible.
>
> Storing long file names in xattrs allows for some degraded backwards
> compatibility with older versions of ecryptfs. But storing long names in
> the ecryptfs header will prevent older versions from being able to
> access the stored data.
>
> How important is this? Not very. While being able to access the data
> with an older version may be nice for data recovery, it also risks
> losing the specially stored longer names.
Agreed.
> - requires header to be updated, for renames or hardlinks with long names
>
> This is mostly a non issue. It may even be faster than storing an xattr.
>
> - can not be used for directories, symlinks
>
> Storing the long file name in the ecryptfs header will only work for
> encrypted files, it won't work for directories or symlinks as they
> don't have a header.
Dang. Okay, yeah, that's a dealbreaker, and big +1 for xattrs, IMO.
> - can not work around the symlink target being too long.
>
> This is fs dependent but if the name for the symlink target is too long
> after encrypting and encoding, creation of the symlink may fail, and
> since symlinks have no header there is no place to store the extra
> information.
>
> Advantages
> - the lower filesystem does not require xattr support
>
> - long name information will not be lost by copies, tarring, or backups
> made on the lower file system that don't store xattrs.
>
> == special dentry file ==
>
> XAttrs Notes
> - requires fs have xattr support
> - 4 namespaces (security, system, trusted, user)
> - security: used by smack/selinux not appropriate to use
> - system: is tied to acls for some filesystems, so affected by mount flags
> - user: can't be trusted, can't be set on symlinks, device files
> - trusted: need cap_sys_admin to see/set
> - not the same space restrictions as ecryptfs header, can use multiple xattrs
> - xattr can be encrypted separately from the file, so an error in name encryption leaks
> name instead of data. Does this matter if relying on current encryption?
> - having longname xattr leaks that the file has a longname
> - is this any worse than what directory walking would leak?
Non-issue, as I stated above. Filenames, like other meta data, are
merely obfuscated, and not encrypted.
> - use trusted.ecryptfs.<name>
Hmm, I guess I'm confused about the statement above, requiring
cap_sys_admin ... Will every user have to have cap_sys_admin to use
these long filenames?
Once again, thanks for tackling this mammoth problem, John!
--
:-Dustin
Dustin Kirkland
Ubuntu Core Developer