
pytagsfs team mailing list archive

Re: Really large collections? Other Fusefs?

 

Hi,

On Sat, Oct 30, 2010 at 01:49:07PM -0700, Telejester wrote:
> I have a really big music collection (320K files) that already lives
> on a fuse fs (mhddfs, not surprisingly). I have a few questions
> about operational limits and one about features:
> 
> 1. Do you see any potential issues with pytagsfs over a set this
> size? Or on top of another fusefs?

The inotify limits you mentioned are one issue.  Apart from that:

* Startup time is going to be awful.  Scanned metadata is not saved to disk when
  you unmount, so we have to rescan the entire collection on every mount.  This
  issue is the most commonly reported shortcoming of pytagsfs.

* Memory usage may become significant.  The mapping from virtual paths to real
  paths is stored entirely in memory (see pathstore.pytypes).  This performs
  well, but with an extremely large collection memory usage may climb quite high
  (maybe several hundred MB?).  An alternate pathstore implementation that
  stores this mapping on disk would solve this problem, at the expense of some
  performance.
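
To make the disk-backed idea concrete, here is a minimal sketch of a mapping
from virtual paths to real paths kept in SQLite rather than in memory.  The
class name DiskPathStore, its methods, and the one-real-path-per-virtual-path
schema are all hypothetical illustrations; this is not the pytagsfs pathstore
interface, and a real implementation would have to match whatever that
interface requires.

```python
import sqlite3


class DiskPathStore:
    """Hypothetical on-disk path mapping (NOT the pytagsfs pathstore API)."""

    def __init__(self, db_path):
        # db_path may be a file name, or ":memory:" for quick experiments.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS paths ("
            "virtual_path TEXT PRIMARY KEY, "
            "real_path TEXT NOT NULL)"
        )
        # Index on real_path so that removing a source file stays cheap.
        self.conn.execute(
            "CREATE INDEX IF NOT EXISTS idx_real_path ON paths (real_path)"
        )

    def set(self, virtual_path, real_path):
        # Assumption: each virtual path maps to exactly one real path.
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO paths VALUES (?, ?)",
                (virtual_path, real_path),
            )

    def get(self, virtual_path):
        row = self.conn.execute(
            "SELECT real_path FROM paths WHERE virtual_path = ?",
            (virtual_path,),
        ).fetchone()
        return row[0] if row else None

    def remove_real(self, real_path):
        # Drop every virtual path that points at the given source file.
        with self.conn:
            self.conn.execute(
                "DELETE FROM paths WHERE real_path = ?", (real_path,)
            )
```

Every lookup becomes a database query instead of a dictionary access, which is
the performance cost mentioned above.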

BTW, I'm not sure that inotify will always work as expected if your source
filesystem is a FUSE filesystem.  The contents of a FUSE mount can change
without any operation going through the mount itself.  For instance, the
contents of a pytagsfs mount change when you retag source files directly.  As
far as I'm aware, there is no mechanism by which the FUSE filesystem can notify
the kernel that this has happened, so inotify events may never be generated for
such changes.

> 2. This collection is needless to say still growing, so how would
> you recommend sizing max_user_watches?

pytagsfs needs one watch per source directory (including the root), so it
depends on how your files are organized.  For instance, if you have a directory
containing three albums, each of which is stored in its own directory, you will
need four watches.
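
Since the number of watches equals the number of source directories, a short
script can estimate it for an existing collection.  The sketch below counts
directories with os.walk and reads the current limit from
/proc/sys/fs/inotify/max_user_watches on Linux.  The function names are made up
for illustration and are not part of pytagsfs.

```python
import os


def count_watches_needed(source_root):
    """Count directories under source_root, including the root itself."""
    total = 0
    # os.walk yields one entry per directory and does not follow symlinks
    # to directories by default.
    for _dirpath, _dirnames, _filenames in os.walk(source_root):
        total += 1
    return total


def read_max_user_watches(path="/proc/sys/fs/inotify/max_user_watches"):
    """Return the current inotify watch limit, or None if it is unavailable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None
```

Calling count_watches_needed on the source root and comparing the result with
read_max_user_watches() shows how much room is left.  Keep in mind that the
limit is per user and is shared with any other programs using inotify, so a
growing collection needs some margin on top of the raw directory count.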

> 3. Any guesses (or ideally, knowledge) about the practical upper
> limit for user watches or how big the relevant struct is?

Hm, good question, and not one I can answer.  A lot of people seem to have much
bigger music collections than I do. ;)

> 4. I have images and metadata files that go along with the music
> files in most directories. Cues, logs, some m3u files, the odd ffs,
> some text files and even html in a few cases. Is there any way that
> I can treat the contents of a given directory as a Managed Release
> or some other abstraction of an album? If I retag the music, I don't
> want it to leave the art and metadata behind in a (presumably badly
> named) directory.

pytagsfs doesn't move your source files; it only retags them.  When you move
files around in the (virtual) pytagsfs mount, that implicitly applies new
metadata to the files.  But the source files don't get renamed.

Of course, you can use pytagsfs to rename files by mounting the collection and
then copying the virtual files to a new on-disk location, but that gives you a
new copy of your collection (with new file and directory names).  If you went
through this process, it's true that the non-media files would not exist in the
new copy.  You may be able to work around this by cleverly using conditional
expressions in the format string.  The manpage discusses conditional
expressions, but feel free to ask questions.

Hope this is helpful.

Thanks,
Forest
-- 
Forest Bond
http://www.alittletooquiet.net
http://www.pytagsfs.org
