
syncany-team team mailing list archive

Re: Illegal file names

 

Hi,

On 12/12/2013 18:11, Philipp Heckel wrote:
> Encoding is yet another issue, true. I'll try to address it in the options
> below. I'm not quite sure how to solve it yet. Here are a few
> thoughts/options:
> 
> *A) Ignoring stuff in the "up" operation/indexer (least common denominator)*
> 1. Illegal filenames: Ignore all files with any characters that are illegal
> on Windows, Mac or Linux (ignore in "Indexer")

That would be super annoying.
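
For reference, the check that option A.1 implies could look roughly like the sketch below. The class and method names are made up for illustration and are not existing Syncany code. The rule set is my understanding of the Windows restrictions (reserved characters, control characters, device names, trailing dot or space), which already cover the '/' of Linux and the ':' of Mac.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Syncany: a "least common denominator" name check.
final class PortableNames {
    // Characters rejected by Windows; '/' and ':' also cover the Linux and Mac restrictions.
    private static final String ILLEGAL_CHARS = "<>:\"/\\|?*";

    // Device names reserved on Windows, with or without an extension.
    private static final Set<String> RESERVED = Set.of(
        "CON", "PRN", "AUX", "NUL",
        "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
        "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9");

    /** Returns true if a single path component should be legal on Windows, Mac and Linux. */
    static boolean isPortable(String name) {
        // Windows does not accept empty names or names ending in a space or a dot.
        if (name.isEmpty() || name.endsWith(" ") || name.endsWith(".")) {
            return false;
        }
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c < 0x20 || ILLEGAL_CHARS.indexOf(c) >= 0) {
                return false;
            }
        }
        // "con.txt" is as reserved as "CON": compare the part before the first dot.
        int dot = name.indexOf('.');
        String base = (dot < 0 ? name : name.substring(0, dot)).toUpperCase(Locale.ROOT);
        return !RESERVED.contains(base);
    }
}
```

With such a check in the Indexer, a Linux user who names a file "a:b.txt" would silently lose it from the sync, which is the annoying part.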

> 2. Case conflicts: If a case conflict is detected (while indexing, in
> "Indexer"), ignore the "new" file. (if "file" was there first, "FILE" will
> be ignored)

Also super annoying.
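
To make the "first one wins" rule concrete, here is a minimal sketch of how such a detection could work. The names are hypothetical, and folding with toLowerCase only approximates what a real case-insensitive filesystem does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Syncany.
final class CaseConflicts {
    /** Returns the paths that would be ignored because an earlier path differs only by case. */
    static List<String> findIgnored(List<String> pathsInIndexOrder) {
        Map<String, String> firstSeen = new HashMap<>();
        List<String> ignored = new ArrayList<>();
        for (String path : pathsInIndexOrder) {
            // Approximation: fold case with the root locale to build the comparison key.
            String key = path.toLowerCase(Locale.ROOT);
            String winner = firstSeen.putIfAbsent(key, path);
            if (winner != null && !winner.equals(path)) {
                ignored.add(path); // an earlier spelling already owns this key
            }
        }
        return ignored;
    }
}
```

Indexing "file" and then "FILE" would report "FILE" as ignored, even on a machine where both names are perfectly legal.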

> 3. Encoding issues: This is impossible to catch because in the "up"
> operation, the target encodings are not known; so there is no way to detect
> encoding issues; e.g. if I index on ext3 with UTF8 filenames, I cannot know
> if someone with a FAT FS with latin1 filenames will perform a "down"
> later.

You have to at least record the encoding used by the creator of the
repository.

> Pros/Cons:
> + Probably easy to implement
> - Not so cool for Linux<->Linux sync

Indeed.

> - No encoding solution
> 
> Open Issues:
> - How does this affect filename encoding?!

What do you mean?

> - Unclear how this behaves when "FILE" and "file" are created on two
> different machines and then sync'd...

Based on the winning strategy implemented in the Reconciliator, I guess
one version will win and prevent the other one from ever being synced.

> *B) Upload, but refuse to reconstruct illegal files/conflicts (don't care
> approach)*
> 1. Illegal filenames: Index and upload files with filenames that are/might
> be illegal on other systems, but refuse to "reconstruct" these files if
> they conflict (-> this could get ugly, because those files might be
> considered "deleted" in the next "up" operation; how to solve this?)

Each local database should include local-only information that is not
synchronized. This will be a major win for handling fileKey (aka inode)
in modification tracking. Once you have the infrastructure to do that,
you can do fancy tricks such as:
- saying you don't want to care about a file locally or remotely (mark
this file as unsynchronized from the remote or to the remote)
- mapping a file name to something that can be handled locally (such as
slugifying a unix filename into a windows one)
- implementing whatever per file transformation you want via plugins.
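
A minimal sketch of the name mapping idea, assuming the local database can store a remote name to local name table that is never uploaded. Everything here (class name, the '_' replacement, the "~n" suffix for collisions) is invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical local-only table, not part of Syncany: remote names <-> names usable on this machine.
final class LocalNameMap {
    private final Map<String, String> remoteToLocal = new HashMap<>();
    private final Map<String, String> localToRemote = new HashMap<>();

    /** Replaces characters that Windows rejects with '_'. */
    static String slugify(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean illegal = c < 0x20 || "<>:\"/\\|?*".indexOf(c) >= 0;
            sb.append(illegal ? '_' : c);
        }
        return sb.toString();
    }

    /** Returns a stable local name for a remote name, creating the mapping on first use. */
    String localNameFor(String remoteName) {
        String existing = remoteToLocal.get(remoteName);
        if (existing != null) {
            return existing;
        }
        String base = slugify(remoteName);
        String candidate = base;
        int n = 1;
        // Two remote names may slugify to the same string: add a suffix until unique.
        while (localToRemote.containsKey(candidate)) {
            candidate = base + "~" + n++;
        }
        remoteToLocal.put(remoteName, candidate);
        localToRemote.put(candidate, remoteName);
        return candidate;
    }

    /** Maps a local name back to its remote name; unmapped names are returned unchanged. */
    String remoteNameFor(String localName) {
        return localToRemote.getOrDefault(localName, localName);
    }
}
```

Because the table is consulted in both directions, the next "up" sees the slugified file as the original remote file instead of treating the original as deleted.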

> 2. Case conflicts: Like in (1)
> 3. Encoding issues: If a file was uploaded and the encoding of the target
> filesystem (e.g. FAT32 with latin1 filenames) does not support the
> filename, behave like (1)

Or slugify.
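
On the "down" side the target encoding is known, so the check becomes possible. A rough sketch using the standard java.nio.charset API follows; the helper names are hypothetical, and how to find the charset a given filesystem really uses for names is left open.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper, not part of Syncany.
final class NameEncoding {
    /** Returns true if every character of the name can be encoded in the target charset. */
    static boolean representable(String name, Charset target) {
        return target.newEncoder().canEncode(name);
    }

    /** Slugify fallback: replaces each code point the target charset cannot encode with '_'. */
    static String replaceUnencodable(String name, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        StringBuilder sb = new StringBuilder(name.length());
        name.codePoints().forEach(cp -> {
            String s = new String(Character.toChars(cp));
            sb.append(encoder.canEncode(s) ? s : "_");
        });
        return sb.toString();
    }
}
```

For instance, a name containing CJK characters is not representable in ISO-8859-1, so the client could either skip the file as in (1) or write it under the replaced name.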

> Pros/Cons:
> + Syncs more files
> + No issues with Linux<->Linux sync
> + Also handles encoding issues.
> 
> Open issues:
> - How to handle files that were not reconstructed?!
> 
> Any other options?

Like I said in my previous email but with more details:
- in 0.1:
  - record the os, encoding, etc. of the creator of a repository
  - refuse to sync with an incompatible OS unless the user forces you to
  - allow a windows mode (not the default one!) which corresponds more
or less to your first solution (and thus does not handle encoding)
- later: have per-file local attributes which enable name translation
with slugifying.
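
The 0.1 part could be sketched as follows. This is only an illustration: the class, its fields and the compatibility rule (same OS family and same encoding) are assumptions, and Charset.defaultCharset() is merely a stand-in for whatever encoding the filesystem actually uses for names.

```java
import java.nio.charset.Charset;
import java.util.Locale;

// Hypothetical repository metadata, not part of Syncany.
final class RepoOrigin {
    final String osFamily;     // "windows", "mac" or "unix"
    final String nameEncoding; // e.g. "UTF-8"

    RepoOrigin(String osFamily, String nameEncoding) {
        this.osFamily = osFamily.toLowerCase(Locale.ROOT);
        this.nameEncoding = nameEncoding.toUpperCase(Locale.ROOT);
    }

    /** Describes the current machine; the default charset only approximates the name encoding. */
    static RepoOrigin ofThisMachine() {
        String os = System.getProperty("os.name", "unknown").toLowerCase(Locale.ROOT);
        String family = os.startsWith("windows") ? "windows"
                      : os.startsWith("mac") ? "mac" : "unix";
        return new RepoOrigin(family, Charset.defaultCharset().name());
    }

    /** Assumed rule: compatible means same OS family and same encoding. */
    boolean compatibleWith(RepoOrigin other) {
        return osFamily.equals(other.osFamily) && nameEncoding.equals(other.nameEncoding);
    }

    /** Refuses to sync with an incompatible client unless the user forces it. */
    static void checkCanSync(RepoOrigin repo, RepoOrigin client, boolean force) {
        if (!force && !repo.compatibleWith(client)) {
            throw new IllegalStateException("Repository created on " + repo.osFamily + "/"
                + repo.nameEncoding + ", this client is " + client.osFamily + "/"
                + client.nameEncoding + "; use force to sync anyway");
        }
    }
}
```

The creator would store its RepoOrigin in the repository at init time, and every client would call checkCanSync before its first "up" or "down".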

In all cases, notify the user of problems (see my other email ;-)

Cheers,

Fabrice



