
holland-discuss team mailing list archive

Re: Generic copy backup plugin?

 

You can pass a stream straight into tarfile.open.  I wasn't sure this
would work, but apparently the holland compression implementation is
better than I remembered :P

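import tarfile
from holland.lib.compression import open_stream

# Open a gzip-compressed output stream (level=1 favors speed over ratio)
stream = open_stream('backup.tar.gz', 'w', method='gzip', level=1)

# Write the tar archive directly into the compressed stream
tar = tarfile.open(mode='w', fileobj=stream)

tar.add('/mnt/mysql-lvm/data')
# Close the archive first so its trailer is flushed, then the stream
tar.close()
stream.close()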
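
For anyone who wants to try the same pattern without Holland installed, here is a minimal sketch using only the standard library. It assumes gzip.open as a stand-in for holland's open_stream; the function name tar_to_gzip_stream and its arguments are made up for illustration and are not part of Holland.

```python
import gzip
import os
import tarfile


def tar_to_gzip_stream(src_dir, dest_path, level=1):
    """Archive src_dir into dest_path by writing tar data into a gzip file object.

    Hypothetical helper: gzip.open stands in for holland's open_stream.
    """
    stream = gzip.open(dest_path, 'wb', compresslevel=level)
    try:
        # tarfile accepts any writable file-like object via fileobj
        tar = tarfile.open(mode='w', fileobj=stream)
        try:
            # Store members under the directory's base name, not the full path
            arcname = os.path.basename(src_dir.rstrip(os.sep))
            tar.add(src_dir, arcname=arcname)
        finally:
            tar.close()
    finally:
        stream.close()
    return dest_path
```

The nested try/finally blocks only make sure the archive is closed before the compressed stream, even if tar.add fails partway through.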

When you check with du, are you including the original (uncompressed)
tarfile as well?  That would explain the 2x.
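
If that turns out not to be the cause, one way to narrow it down is to compare byte counts (what 'ls' shows) against allocated blocks (what 'du' counts) for both trees. A rough sketch, assuming a POSIX system where st_blocks is reported in 512-byte units; tree_sizes is a made-up helper name, and it ignores the blocks used by directories themselves and counts hard-linked files more than once:

```python
import os


def tree_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    Hypothetical helper; assumes POSIX st_blocks in 512-byte units.
    """
    apparent = 0
    allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            apparent += st.st_size          # byte count, as 'ls -l' reports
            allocated += st.st_blocks * 512  # on-disk usage, as 'du' counts
    return apparent, allocated
```

Running it on the original directory and on the untarred copy should show whether the apparent sizes match while the allocated sizes differ.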

~Andy

On Fri, Jan 14, 2011 at 9:32 AM, Tim Soderstrom
<tim@xxxxxxxxxxxxxxxxxxxxx> wrote:
> A crappy little proof of concept is in my GH branch. It only does tar right now - compression is totally ignored :) I was wanting to use Holland's built-in stuff for that so I'm guessing I need to use pipes for that.
>
> One weird result is that the on-disk size (according to 'du') after untarring the backup is 2x the size of the original directory. I checked and all the byte-counts of the files are the same (according to 'ls'). I could see it being off by a little but 2x seems a bit excessive :P
>
> After I get this thing going I'll likely want to work on the raw plugin we talked about at some point (not totally sure when depending on my normal work stuff) since we should likely have that in Holland as a nice catch-all type of plugin.
>
> On Jan 14, 2011, at 7:38 AM, Tim Soderstrom wrote:
>
>> Ah plus I just coined a cool name for the dir + tar plugin: DirT :) I wanted to call it DirTY but couldn't think of what the 'Y' should be for. :P Useless, I know, but as you recall, I'm pretty easily amused.
>>
>> On Jan 14, 2011, at 7:29 AM, Tim Soderstrom wrote:
>>
>>>
>>> On Jan 14, 2011, at 4:19 AM, Andrew Garner wrote:
>>>
>>>> On Thu, Jan 13, 2011 at 4:28 PM, Tim Soderstrom
>>>> <tim@xxxxxxxxxxxxxxxxxxxxx> wrote:
>>>>> Curious as to thoughts here - what about a generic copy/tar plugin for Holland?
>>>>
>>>> I would suggest instead a general "script/command" plugin. This would
>>>> run an arbitrary provided command and fail if the command exits
>>>> non-zero.   You can pass the directory to stash files via variable
>>>> substitution, for example:
>>>>
>>>>>>> from string import Template
>>>>>>> cmd = Template('/bin/tar czvf - -C /var/www/images . > ${backupdir}/my_images.tar.gz')
>>>>>>> cmd.safe_substitute(backupdir='/var/spool/holland/custom/XXXX_YYYY/')
>>>> '/bin/tar czvf - -C /var/www/images . >
>>>> /var/spool/holland/custom/XXXX_YYYY//my_images.tar.gz'
>>>>
>>>> Then just run that and check the exit status.  There's probably
>>>> various other substitutions that may be useful.
>>>
>>> Ah good point! I think I may play around with my current solution first just because I want to fiddle with Python's tar but, yes, that's actually a good plugin to have! That would also be quite useful to Holland and maybe help increase adoption rate too.
>>>
>>>> With this, you could use rsync or many other archiving tools as you'd
>>>> like.  Many simple use cases would fit into this plugin as well, like
>>>> just running slapcat and redirecting its output.  You might not need
>>>> an ldap plugin at all.
>>>>
>>>> I would probably support using a specific shell (but default to
>>>> /bin/sh).   You'll want to deal with the command's stdout/stderr -
>>>> probably redirecting that to some location in the backup directory.
>>>
>>> I might need some explanation on that. Would I not be able to use subprocess here and simply grab the output and funnel it into the Holland log?
>>>
>>>> Estimating the size can be done in various ways - I would probably
>>>> just initially punt on this (estimated_size = 1).   There are many
>>>> ways it might be implemented, however.
>>>
>>> Hmm...maybe there could be an option in the config to specify a directory and, if undefined, then simply set it to 0 or 1. Most of the use cases I can think of off-hand would be backing up files from somewhere, so simply scanning the directory housing the contents, while wildly inaccurate, would still be better than a punt. At least that way the hope is that the directory estimate would be larger than the file backup size, adding a buffer (which can be adjusted with the size factor anyway).
>>>
>>>
>>> _______________________________________________
>>> Mailing list: https://launchpad.net/~holland-discuss
>>> Post to     : holland-discuss@xxxxxxxxxxxxxxxxxxx
>>> Unsubscribe : https://launchpad.net/~holland-discuss
>>> More help   : https://help.launchpad.net/ListHelp
>>>
>>
>
>
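
The "script/command" plugin idea quoted above (substitute the backup directory into a command, run it under a chosen shell, send stdout/stderr to a file in the backup directory, and fail on a non-zero exit) could be sketched roughly like this. It is not Holland plugin code: run_backup_command, the 'command.log' file name, and the use of RuntimeError are all assumptions made for illustration.

```python
import os
import subprocess
from string import Template


def run_backup_command(command, backupdir, shell='/bin/sh'):
    """Run a templated shell command and fail if it exits non-zero.

    Hypothetical helper, not part of Holland. ${backupdir} in the command
    is replaced with backupdir (no quoting is applied, so paths with
    spaces would need quoting in the command template itself).
    """
    cmd = Template(command).safe_substitute(backupdir=backupdir)
    # Assumed log location: a file inside the backup directory
    log_path = os.path.join(backupdir, 'command.log')
    with open(log_path, 'wb') as log:
        # Anything the command does not redirect itself ends up in the log
        status = subprocess.call([shell, '-c', cmd],
                                 stdout=log, stderr=subprocess.STDOUT)
    if status != 0:
        raise RuntimeError('command exited with status %d: %s' % (status, cmd))
    return log_path
```

For example, run_backup_command('/bin/tar czf - -C /var/www/images . > ${backupdir}/my_images.tar.gz', backupdir) would write the archive into the backup directory and leave any tar error messages in command.log. Feeding the output into the Holland log instead of a file would be a matter of reading the log (or a pipe) and passing lines to a logger.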


