gibak 0.3.0 (backup tool using Git): OSX support, extended attributes, bugfixes
gibak is a backup tool based on git. Since gibak builds upon the infrastructure offered by Git, it shares its main strengths:
- speed: recovering your data is faster than cp -a...
- full revision history
- space-efficient data store, with file compression and textual/binary deltas
- efficient transport protocol to replicate the backup (faster than rsync)
gibak uses Git's hook system to save and restore the information Git doesn't track itself such as permissions, empty directories and optionally extended attributes and mtime fields.
You can read more about gibak here.
What's new in 0.3.0
- OSX support
- support for extended attributes on both Linux and OSX (it might work on other systems if their getxattr(2) interface matches Linux's or OSX's).
- a few bugfixes
Read README.upgrade if you used earlier versions of gibak and want to use extended attributes.
Getting it
The latest tarball as of 2008-03-31 is gibak-0.3.0.tar.gz.
You can always get the latest code with
git clone http://eigenclass.org/repos/git/gibak/.git/
The repository can be browsed at http://eigenclass.org/repos/gitweb
Thanks
- Lee Marlow solved problems with submodule paths including spaces
- sean provided most of the fixes required for OSX support
Some FAQs
(This will eventually be moved to a new node.)
Will the repository grow monotonically? Can I delete older history? What about large files?
You can use git rebase -i to squash older commits. For instance, you could combine the first year's worth of backups into a single commit. This can also be used to remove large files. I will document this sometime.
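Until that documentation exists, here is a minimal sketch of the squashing step, run on a throwaway repository rather than on a real gibak store. It assumes a reasonably recent git and uses GIT_SEQUENCE_EDITOR to edit the rebase todo list non-interactively (normally you would edit it by hand). It does not account for gibak's hooks or its metadata file, so treat it as an illustration only.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for a backup history (assumption:
# never experiment directly on the repository that holds your backups).
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "demo"
git config user.email "demo@example.invalid"

# Three "backups", each a separate commit.
for n in 1 2 3; do
    echo "state $n" > data.txt
    git add data.txt
    git commit -q -m "backup $n"
done

# Squash everything into the first commit. The sequence editor rewrites the
# todo list that "git rebase -i" would open: every "pick" line except the
# first becomes "squash". GIT_EDITOR=true accepts the combined message.
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/squash/'" \
GIT_EDITOR=true \
    git rebase -q -i --root

commit_count=$(git rev-list --count HEAD)
echo "commits after squash: $commit_count"
echo "latest content: $(cat data.txt)"
```

The latest state of the data is unchanged by the squash; only the intermediate commits are folded together. Disk space is only reclaimed once git garbage-collects the old objects.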
If I clone the repository, will I need 4 times the space taken up by the data?
No:
- git uses object compression and binary deltas, so the worst-case space consumption is 2 * (history-compressed + original)
- you can use a "bare" clone (without tree checkout), using 2 * history-compressed + original
- if losing the history (but not the latest version!) in case of a disk crash is acceptable, you can mount another disk on ~/.git, for a total consumption of (history-compressed + original). If the "live" disk dies, you can recover the latest version from the backup. If the backup drive crashes, the history is lost, but the latest version of the data is still on the "live" disk.
Note that in a typical incremental backup scheme (such as the common rsync-based scripts you'll find around), history is also lost if the disk holding your incremental backups dies.
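To make the three formulas concrete, here is a small worked example. The sizes are hypothetical (40 GiB of live data, 10 GiB of compressed history); only the formulas come from the answer above.

```shell
#!/bin/sh
# Hypothetical sizes in GiB, chosen only for illustration.
original=40   # the live data ("original")
history=10    # the compressed history ("history-compressed")

# Live repository plus a full clone with its own checkout.
full_clone=$(( 2 * (history + original) ))
# Live repository plus a "bare" clone (history only, no checkout).
bare_clone=$(( 2 * history + original ))
# No clone at all: ~/.git lives on a second disk.
second_disk=$(( history + original ))

echo "full clone:         $full_clone GiB"
echo "bare clone:         $bare_clone GiB"
echo "~/.git on 2nd disk: $second_disk GiB"
```

With these numbers none of the options comes close to 4 times the size of the data.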
What does gibak protect me against?
It depends on the safety level you choose:
- with multiple clones and geographical distribution, your data will survive most catastrophes (earthquake, fire, town nuked)
- with N clones on different machines in the same site, there will be no problem if (N-1) machines are physically destroyed
- with a single machine and N clones on different disks, the data and the full history will survive (N-1) disk crashes.
- with a single machine and a second disk mounted on ~/.git, your data will survive a single disk crash. The history might be lost if the drive holding the contents of the .git directory dies.
Two things: commitencoding error and hooks directory - Zeno Davatz (2008-03-31 (Mon) 07:58:04)
Hi
Thank you for this upgrade. I have two questions after upgrading to 0.3.0:
1. In your "README.upgrade" you say to add "-x" to .git/pre-commit and .git/post-checkout. Those files are located in '.git/hooks', not in '.git'. Can you confirm this, or is something wrong with my installation?
2. With the newest release I get the following error when running "gibak commit":
Committing.
Fatal error: exception Failure("lgetxattr")
Warning: commit message does not conform to UTF-8. You may want to amend it after fixing the message, or set the config variable i18n.commitencoding to the encoding your project uses.
Thank you for your Feedback.
Best Zeno
Zeno Davatz 2008-03-31 (Mon) 08:07:41
Ok, following the error message above, I proceeded according to: http://www.kernel.org/pub/software/scm/git/docs/git-commit.html
I put this in my .git/config:
[i18n]
commitencoding = ISO-8859-1
and reran "gibak commit". I still keep getting: Fatal error: exception Failure("lgetxattr")
Zeno
Zeno Davatz 2008-03-31 (Mon) 08:20:20
Ok, I checked in the Kernel via "make menuconfig" and then searched with "/ 'xattr'" and as far as I can tell JFS does not support "xattr" as of the latest Kernel Release of Linus Torvalds. Please correct me if I am wrong.
I am using JFS as my File-System.
Zeno
mfp 2008-03-31 (Mon) 08:29:21
(1) is indeed a mistake, the files to be edited are under .git/hooks.
(2) On which OS are you running it? The exception you're observing means that getxattr(2) is failing for some reason; the possibilities include:
- attribute not accessible by the process
- bug in my code resulting in the wrong attribute name being passed to getxattr
- ...
You can try to change ometastore.ml around line 51 to read as follows:
51 let xattrs =
52 List.map
53 (fun attr -> printf "XATTR(%S, %S)\n%!" path attr;
54 { name = attr; value = lgetxattr path attr; })
55 (List.sort compare (llistxattr path))
And run
ometastore -x -s -i --sort
from $HOME. This will print all the xattrs saved and should at least tell us if the attribute name seems sensible.
Until your problem is solved, ometastore won't generate the .ometastore file and the backups will be incomplete (they will have the metadata from the last time it ran successfully).
For the time being, you can do this: recompile without xattr support (no need to change the hooks, -x will become a no-op); for good measure, you can undo your last commit with
git reset --mixed HEAD^
(not "hard"!) and then gibak commit again with the version without xattr support.
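A minimal sketch of why --mixed is the safe choice here, using a throwaway repository (not a gibak store): the last commit is dropped from the history, but the files in the working tree keep their current contents.

```shell
#!/bin/sh
set -e

# Throwaway repository used only for this demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "demo"
git config user.email "demo@example.invalid"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "backup 1"

echo "second" > notes.txt
git commit -q -a -m "backup 2 (the one to undo)"

# Drop the last commit but leave the working tree untouched.
git reset -q --mixed HEAD^

commits=$(git rev-list --count HEAD)
content=$(cat notes.txt)
echo "commits left: $commits, notes.txt still contains: $content"
```

With --hard instead, notes.txt would have been rewritten to the contents of the earlier commit.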
(As for the warning about the encoding, you can get rid of it by running git config i18n.commitencoding <your encoding> under $HOME.)
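For reference, a short sketch of setting and checking that variable. It runs in a throwaway repository, and ISO-8859-1 is just the encoding mentioned earlier in this thread; substitute your own.

```shell
#!/bin/sh
set -e

# Throwaway repository; in real use you would run the git config line in $HOME.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Note the section name is "i18n".
git config i18n.commitencoding ISO-8859-1

encoding=$(git config --get i18n.commitencoding)
echo "i18n.commitencoding = $encoding"

# The same setting as it appears in the repository's config file.
cat .git/config
```

This only declares the encoding of commit messages; it has no bearing on the lgetxattr failure.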
mfp 2008-03-31 (Mon) 08:38:42
According to Wikipedia:
In Linux, the ext2, ext3, ext4, JFS, ReiserFS and XFS filesystems support extended attributes (abbreviated xattr) if the libattr feature is enabled in the kernel configuration.
Regardless of whether the FS actually supports xattrs, this failure mode surprises me: if xattrs are not enabled, ometastore should choke on llistxattr, not on lgetxattr. Therefore, there seems to be a bug in perform_llistxattr which causes the wrong name to be passed to lgetxattr. The above diff should clarify this.
Zeno Davatz 2008-04-08 (Tue) 02:16:17
Ok, I followed your advice from above, now if I run
~> ometastore -x -s -i --sort
Fatal error: exception Failure("lgetxattr")
from my Home-Directory, I get the above Error. I am on Gentoo-Linux (x86).
Zeno Davatz 2008-04-08 (Tue) 02:33:10
Ok sorry, I forgot to copy
ometastore
to
/usr/local/bin
Now
ometastore -x -s -i --sort
gives me tons of output (after I made the modifications to the file you told me)! I can send it to you by email if you like.
Zeno Davatz 2008-04-08 (Tue) 02:39:21
An example of the Output of:
ometastore -x -s -i --sort
is
XATTR("./.Mail/inbox/SPAM/1", "user.Beagle.AttrTime")
XATTR("./.Mail/inbox/SPAM/1", "user.Beagle.Filter")
XATTR("./.Mail/inbox/SPAM/1", "user.Beagle.Fingerprint")
XATTR("./.Mail/inbox/SPAM/1", "user.Beagle.MTime")
XATTR("./.Mail/inbox/SPAM/1", "user.Beagle.Uid")
Eric Pollmann 2008-04-08 (Tue) 22:13:31
Hi,
I was having similar failures in llistxattr on OSX, the message was:
Fatal error: exception Failure("llistxattr")
This prevented .ometastore from being created.
To debug, I changed ometastore_stub.c::perform_llistxattr(value file) around line 65. This will print out the file that is causing the failure:
if(siz < 0) {
  /* file is an OCaml value; String_val gives the underlying C string */
  printf("Running llistxattr on %s failed!\n", String_val(file));
  caml_failwith("llistxattr");
}
Also added this to the top of the file to get printf:
#include <stdio.h>
After rebuilding and running the command, I discovered that in my case it was a file with the sticky bit set:
XATTR("./Library/Application Support/Firefox/Profiles/eemdt8z1.default/extensions/googleDesktop@google.com/chrome.manifest", "com.apple.FinderInfo")
XATTR("./Library/Application Support/Firefox/Profiles/eemdt8z1.default/chrome", "com.apple.FinderInfo")
Running llistxattr on ./Library/Application Support/BitTorrent/data/controlsocket failed!
Fatal error: exception Failure("llistxattr")
A little more debugging info, based on the information at: http://developer.apple.com/documentation/Darwin/Reference/ManPages/man2/listxattr.2.html
if(siz < 0) {
  printf("Running llistxattr on %s failed, error %i\n", String_val(file), errno);
  if (errno == ENOTSUP) printf("Not supported on file system.\n");
  if (errno == ERANGE) printf("Namebuf too small.\n");
  if (errno == EPERM) printf("Not supported on file.\n");
  if (errno == ENOTDIR) printf("Path not a directory.\n");
  if (errno == ENAMETOOLONG) printf("Name too long.\n");
  if (errno == EACCES) printf("Permission denied.\n");
  if (errno == ELOOP) printf("Too many symbolic links. Loop?\n");
  if (errno == EFAULT) printf("Invalid address.\n");
  if (errno == EIO) printf("I/O error occurred.\n");
  if (errno == EINVAL) printf("Options invalid.\n");
  caml_failwith("llistxattr");
}
and
#include <sys/errno.h>
gives:
Running llistxattr on ./Library/Application Support/BitTorrent/data/controlsocket failed, error 1
Not supported on file.
Fatal error: exception Failure("llistxattr")
So in my ometastore_stub.c, I just skip that error case right above the debug, ~line 63
/* only consult errno when the call actually failed */
if (siz == 0 || (siz < 0 && errno == EPERM))
  CAMLreturn(Val_int(0));
ran again and got another error:
Running llistxattr on ./.Trashes failed, error 13
Permission denied.
Fatal error: exception Failure("llistxattr")
I added the permission-denied files to .gitignore, since they can't be backed up anyway, and, just to be cautious, added one more skipped error type:
/* only consult errno when the call actually failed */
if (siz == 0 || (siz < 0 && (errno == EPERM || errno == EACCES)))
  CAMLreturn(Val_int(0));
Hope that helps!
-Eric
website design 2008-06-19 (Thr) 14:25:05
.. with a website inaccessible for a few weeks at a time.
No Title - Stat (2008-04-03 (Thr) 18:07:57)
Hi! I understand that my question is not directly related to gibak, but perhaps you can still help me. I have a directory called "bin" at the root of my home directory. I want to exclude it, but still keep a nested directory, "bin/test".
I added to .gitignore
/bin
!/bin/test
But git ignores it altogether.
Can you help me with my problem, please?
Thanks.
mfp 2008-04-04 (Fri) 08:42:54
If you add /bin to .gitignore, git will not descend into that directory, so !/bin/test will not be effective.
The way to do this is to place a .gitignore file under bin/:
/*
!/test
This will ignore everything under bin/ but bin/test.
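A quick way to check this behaviour is git check-ignore, which exits with status 0 when a path is ignored. The sketch below uses a scratch directory and made-up file names (tool, keep.txt); it assumes a git version that provides check-ignore.

```shell
#!/bin/sh
set -e

# Scratch directory standing in for $HOME.
home=$(mktemp -d)
cd "$home"
git init -q .

mkdir -p bin/test
echo tool > bin/tool
echo keep > bin/test/keep.txt

# The two-line .gitignore placed under bin/.
printf '/*\n!/test\n' > bin/.gitignore

# git check-ignore -q exits 0 for ignored paths and 1 otherwise.
if git check-ignore -q bin/tool; then tool_ignored=yes; else tool_ignored=no; fi
if git check-ignore -q bin/test/keep.txt; then keep_ignored=yes; else keep_ignored=no; fi

echo "bin/tool ignored:          $tool_ignored"
echo "bin/test/keep.txt ignored: $keep_ignored"
```

Note that /* also matches bin/.gitignore itself, so adding a third line, !/.gitignore, is probably what you want if that file should be part of the backup.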
zsh completion for gibak - Nick (2008-04-04 (Fri) 12:07:20)
This is more out of play than need, as typing "gibak commit" is a) not particularly laborious and b) really wanting to be automated.
I am, however, a self-confessed zsh completion junkie, so if any of your readers fancy it, I hereby present zsh completion for gibak:
http://github.com/nickstenning/zsh/tree/master/completions/_gibak
Installation problem - Chris (2008-04-09 (Wed) 06:49:43)
I'm getting this error on installation:
- omake: finished reading OMakefiles (0.2 sec)
- build . find-git-files.opt
+ ocamlopt.opt -warn-error A -dtypes -inline 10 -S -I . -o find-git-files.opt unix.cmxa str.cmxa util.cmx folddir.cmx find-git-files.cmx ometastore_stub.a
find-git-files.cmx is not a compilation unit description.
- omake: 41/60 targets are up to date
- omake: failed (0.5 sec, 0/4 scans, 1/15 rules, 0/123 digests)
- omake: targets were not rebuilt because of errors:
find-git-files.opt
Can anyone help?
mfp 2008-04-11 (Fri) 16:00:51
What's your
ocamlopt -version
?
You can build find-git-files.opt manually by executing the following line (as reported by omake, where find-git-files.cmx has been replaced with find-git-files.ml, to compile the .ml directly instead of linking against the .cmx):
ocamlopt -warn-error A -dtypes -inline 10 -S -I . -o find-git-files.opt unix.cmxa str.cmxa util.cmx folddir.cmx find-git-files.ml ometastore_stub.a
Ditto for find-git-repos.opt. (You'll have to save them as find-git-repos and find-git-files, without the .opt extension, to some directory in your PATH).
hth
cygwin trouble - Paul (2008-04-28 (Mon) 22:42:17)
I'm trying to build it under Cygwin using the OMake Windows binary from http://omake.metaprl.org/
$ /cygdrive/c//Program\ Files/OMake/bin/omake.exe
*** omake: reading OMakefiles
*** omake: finished reading OMakefiles (0.04 sec)
*** omake: 9/10 targets are up to date
*** omake: failed (0.07 sec, 0/0 scans, 0/0 rules, 0/31 digests)
*** omake error:
File OMakefile: line 31, characters 1-51
Do not know how to build "ometastore" required for "<phony <.DEFAULT>>"
and debian etch with OMake 0.9.6.9 - Paul (2008-04-28 (Mon) 23:08:00)
...
- build . ometastore.cmi
+ ocamlc -warn-error A -dtypes -g -I . -c ometastore.ml
File "ometastore.ml", line 228, characters 17-32:
Unbound value Printf.ifprintf
*** omake: 45/52 targets are up to date
*** omake: failed (11.2 sec, 6/6 scans, 11/23 rules, 36/132 digests)
*** omake: targets were not rebuilt because of errors:
ometastore.o
depends on: ometastore.ml
ometastore.cmx
depends on: ometastore.ml
ometastore.cmo
depends on: ometastore.ml
ometastore.cmi
depends on: ometastore.ml
Paul 2008-05-03 (Sat) 11:48:03
I really want to run gibak, or at least try it out! Is there any recourse for me?
Thanks in advance.
Matt 2008-06-30 (Mon) 10:17:57
ocaml from testing (3.10.2-3) worked for me
Working with submodules... - James Snyder (2008-04-29 (Tue) 13:17:58)
I have a question about treating existing git repositories as submodules of one's home directory. How does this fit in with pushing changes from the main gibak store to a backup server? I assume that the submodules won't go along with the main repository when this is done (I'm relatively new to git). If I want to replicate the backed-up versions of these other local repositories, what is the best way? Should I rsync their contents after the main push of the backup? Is there another way to do this? I'd like all of this to be rolled up and automated, so that regular snapshots are taken every so often and then, less frequently (say, daily), replicated to another server.
Using rsync for the whole ~/.git dir seems like a bit of a waste since the pack files could get rather large and I would figure it would be more wasteful for rsync to figure out the locations of changes in a multi-gigabyte file compared with git just using its indexes on local and server sides to decide what objects need to be copied.
Any suggestions?
submodules - Christoph (2008-05-22 (Thr) 17:07:33)
Same question as James Snyder above: how do you handle the transfer of the included submodules?
submodules not ignored? - sol (2008-05-23 (Fri) 03:46:50)
Hi, thanks for the project, really nice
One thing I noticed, although I have an entry 'projects/' in .gitignore, gibak adds all git repositories in the projects dir to .git/git-repositories
Did I miss something?
Thank you
trying to backup /, getting error "fatal: Out of memory" - patrick (2008-06-30 (Mon) 09:33:17)
Hey! I am trying to do incremental backups of / with gibak. All I have got so far is this error:
fatal: Out of memory? mmap failed: No such device
Could not complete addition of files to history store!
here is what i have done so far:
my system:
uname -a
Linux Debian-40-etch-64-minimal 2.6.18-6-xen-amd64 #1 SMP Fri Jun 6 06:38:05 UTC 2008 x86_64 GNU/Linux
date
Mon Jun 30 14:24:31 UTC 2008
install ocaml:
adduser patrick
cd /home/patrick
wget http://www.backports.org/debian/pool/main/o/ocaml/ocaml_3.10.1-1~bpo40+3_amd64.deb
wget http://www.backports.org/debian/pool/main/o/ocaml/ocaml-base_3.10.1-1~bpo40+3_amd64.deb
wget http://www.backports.org/debian/pool/main/o/ocaml/ocaml-nox_3.10.1-1~bpo40+3_amd64.deb
wget http://www.backports.org/debian/pool/main/o/ocaml/ocaml-base-nox_3.10.1-1~bpo40+3_amd64.de
wget http://www.backports.org/debian/pool/main/o/ocaml/ocaml-interp_3.10.1-1~bpo40+3_amd64.deb
dpkg -i ocaml-base-nox.. ocaml-interp.. ocaml-nox.. ocaml-base.. ocaml..
install git and gibak:
apt-get install git-core
apt-get install curl
wget http://www.backports.org/debian/pool/main/g/git-core/git-core_1.5.5.1-1~bpo40+1_amd64.deb
dpkg -i git-core_1.5.5.1-1~bpo40+1_amd64.deb
mkdir /home/patrick/tools
cd /home/patrick/tools
git clone http://eigenclass.org/repos/git/gibak/.git/
export PATH=$PATH:/home/patrick/gibak
cd gibak
omake
init gibak and try to commit:
gibak init
Initialized empty Git repository in .git/
You might be interested in tweaking the ~/.gitignore file
Please run 'gibak commit' to save a first state in your history
editor /.gitignore
______________________________<editor>
/dev
/proc
______________________________</editor>
editor /home/patrick/tools/gibak/gibak
______________________________<editor>
line 33: # Remember that all actions here are to be made in $HOME dir!
line 34: cd /
______________________________</editor>
gibak commit
add ..
add ..
..
fatal: Out of memory? mmap failed: No such device
Could not complete addition of files to history store!
any suggestions?
-patrick
Fatal error: exception Failure("float_of_string") - sMark (2008-07-17 (Thr) 19:24:18)
Hi, commented on the previous post. (OSX 10.5.4 ; ocaml 3.10.2 ; git 1.5.6)
When I run:
ometastore -v -x -a -i
I receive:
Fatal error: exception Failure("float_of_string")
Thanks much! This is a great idea.
-Mark
restore - Jason McMullan (2008-08-19 (Tue) 08:58:30)
For completeness, how difficult would it be to add the following subcommand:
gibak restore [dirty|pristine] <path> [to <alternate path>] [as of <time_spec>]
Usage examples:
I screwed up everything in my taco project. Start from my latest backup.
$ gibak restore project/taco
gibak: Cannot restore over existing directory without the 'as of' keyword.
gibak: Either remove the 'project/taco' directory, or specify a timestamp.
$ gibak restore project/taco as of 1 second ago
gibak: Restoring project/taco from 2008-08-19 09:00:00 -0400
gibak: 45 files to be restored, but 7 files in project/taco are not in the backup.
gibak: Either specify 'dirty' or 'pristine' to the command.
$ gibak restore pristine project/taco as of 1 second ago
gibak: Restoring pristine project/taco from 2008-08-19 09:00:00 -0400
gibak: 45 files restored, and 7 files that were not in the backup were deleted.
I want to copy my home directory from 2 years ago to a working area, so I can burn a CD to send to my lawyer.
$ gibak restore . to /var/tmp/Dewey-Cheatham-and-Howe as of 2 years ago
gibak: Restoring . to /var/tmp/Dewey-Cheatham-and-Howe from 2006-08-19 10:15:00 -0400
gibak: 27328 files restored.
Oops! Make that *July*!
$ gibak restore pristine . to /var/tmp/Dewey-Cheatham-and-Howe as of 2006-07-19 10:15:00 -0400
gibak: Restoring . to /var/tmp/Dewey-Cheatham-and-Howe from 2006-07-19 9:30:00 -0400
gibak: 625 files restored, 8 files that were not in the backup were deleted.