[Bug 1685049] A change has been merged
Reviewed: https://reviews.mahara.org/8132
Committed: https://git.mahara.org/mahara/mahara/commit/78c87713a4fafb06419843b7cac2e887e1ac0c82
Submitter: Robert Lyon (robertl@xxxxxxxxxxxxxxx)
Branch: master
commit 78c87713a4fafb06419843b7cac2e887e1ac0c82
Author: Ilya Tregubov <ilya@xxxxxxxxxxxxxxx>
Date: Thu Apr 6 10:27:13 2017 +1000
Bug 1685049: Remote file system modification
behatnotneeded
Enables Mahara to save files to an external file system
- object storage (such as AWS's S3) -
which can reduce the cost of storage
Change-Id: I76822612f2922ba0ef2a0b7a4efb9cd2b96979a6
--
You received this bug notification because you are a member of Mahara
Contributors, which is subscribed to Mahara.
Matching subscriptions: Subscription for all Mahara Contributors -- please ask on #mahara-dev or mahara.org forum before editing or unsubscribing it!
https://bugs.launchpad.net/bugs/1685049
Title:
Modifications to filesystem to allow object storage
Status in Mahara:
Fix Committed
Bug description:
To use object storage (like S3) several modifications are needed; for
example, get_path should return either the local path to a file or a
remote path when the file is stored remotely.
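As a rough sketch of that idea (the class and method names here are
illustrative, not the actual Mahara code), a file artefact's get_path
could fall back to the remote store when the file is not on local disk:

    // Illustrative sketch only -- names do not match the actual patch.
    class FileArtefact {
        private $localpath;
        private $remotefs;

        public function __construct($localpath, $remotefs) {
            $this->localpath = $localpath;
            $this->remotefs  = $remotefs;
        }

        public function get_path() {
            // Prefer the local copy when it exists on disk.
            if (file_exists($this->localpath)) {
                return $this->localpath;
            }
            // Offloaded: ask the remote file system for a usable path
            // (it may pull the object back to local disk first).
            return $this->remotefs->get_path($this);
        }
    }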
Use cases:
Offloading large and old files to save money
Disk can be expensive, so a simple use case is moving some of the
largest and oldest files off local disk to somewhere cheaper. But we
still want the convenience and performance of having the majority of
files local, especially when hosting on-prem, where the latency or
bandwidth to the remote filesystem may not be great.
Sharing files across maharas to save disk
Clients can have multiple Mahara instances, and there is much
duplicated content across instances. By pointing multiple Maharas at
the same remote filesystem and not allowing deletes, large amounts of
content can be de-duplicated.
Sharing files across environments to save time
We can have multiple environments for various types of testing, and
often have ad hoc environments created on demand. Not only do we not
want to have to store duplicated files, but we also want refreshing
data to new environments to be as fast as possible.
Using this plugin we can configure production to have full read/write
access to the remote filesystem and store the vast bulk of content
remotely. In this setup latency and bandwidth aren't an issue because
the two are colocated. The local filedir on disk would only contain
small or fast-churning files. A refresh of the production data back to
a staging environment can be much quicker now, as we skip the sitedir
clone completely and staging is simply configured with read-only
access to the production filesystem. Any files it creates are written
only to its local filesystem, which can then be discarded when next
refreshed.
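For illustration only, such a split might look like the following in
each site's config.php (these setting names are hypothetical, not
necessarily what the patch introduces):

    // Production: full read/write to the shared remote filesystem.
    $cfg->remotefilesystem       = 'aws';  // hypothetical setting names
    $cfg->remotefilesystemwrite  = true;   // push new files to S3
    $cfg->remotefilesystemdelete = true;   // allow reaping local copies

    // Staging: read-only access to the same bucket, so a data refresh
    // can skip cloning the sitedir entirely.
    $cfg->remotefilesystem       = 'aws';
    $cfg->remotefilesystemwrite  = false;  // never writes back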
Currently it only works with AWS S3 storage. Support for more object
stores is planned, in particular enabling OpenStack deployments.
To use the plugin you will need to create an Amazon S3 bucket for your
Mahara instance. You will also need the AWS SDK to make this plugin
work.
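For example, with the AWS SDK for PHP installed via Composer, pushing
one file from sitedata into a bucket looks roughly like this (the
bucket name, region, and object key are placeholders, and credentials
are resolved from the SDK's default credential chain):

    require 'vendor/autoload.php';

    use Aws\S3\S3Client;

    // Placeholder region -- use your bucket's own.
    $s3 = new S3Client(array(
        'version' => 'latest',
        'region'  => 'ap-southeast-2',
    ));

    $s3->putObject(array(
        'Bucket'     => 'my-mahara-sitedata',            // placeholder
        'Key'        => 'artefact/file/originals/0/100', // placeholder
        'SourceFile' => '/path_to_sitedata/artefact/file/originals/0/100',
    ));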
Currently the following directory is being saved to S3: /path_to_sitedata/artefact/file/originals
There is a cron task that checks for new files that are not yet in S3 (depending on settings, it only pushes files of a certain age and size). Once files are duplicated, there is an option to delete the local copy to save space (another cron task checks duplicated files and, again depending on settings, only deletes some of them, e.g. large files). Certain operations can only deal with local files, for example downloading content as a zip archive from a content page: when the download button is hit, all the files are first pulled back from S3 to local storage.
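As a sketch of that cron logic (the setting and helper names below are
made up for illustration, not taken from the patch):

    // Illustrative only: push files that are old and large enough and
    // are not yet in S3.
    $minage  = get_config('objectstore_minimumage');   // seconds
    $minsize = get_config('objectstore_minimumsize');  // bytes
    foreach (find_local_files_not_in_s3($minsize, time() - $minage) as $file) {
        push_to_remote($file);
    }

    // A separate pass can then reclaim disk by deleting local copies
    // of files already duplicated to S3, subject to similar settings.
    foreach (find_duplicated_files($minsize) as $file) {
        delete_local_copy($file);
    }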
To manage notifications about this bug go to:
https://bugs.launchpad.net/mahara/+bug/1685049/+subscriptions