Ticket #376 (new Improvement)
Hot backup of repository
| Reported by: | idzelis@… | Owned by: | somebody |
|---|---|---|---|
| Priority: | Major | Milestone: | |
| Component: | Backup | Version: | 1.5 |
| Keywords: | | Cc: | |
Description
[jira2trac import : issue created on December 4, 2006 12:47:56 PM CET http://issues.cocoondev.org/browse/DSY-376 ]
We have several distributed teams accessing our Daisy document repository from different time zones. Our repository is large and takes about an hour to back up. During this time, documents can't be created or changed because the repository is locked. Because someone may always want access to the repository, there is no good time to back it up. The best solution would be a way to back up the repository without locking it for writes.
Change History
comment:2 Changed 3 years ago by paul
[jira2trac import : comment created by julio.reis on May 22, 2009 12:40:24 PM CEST]
+1
We have about 5,000 documents in Daisy, and already the backup takes from 6:25 to 8:22 am CET -- 1 hour 57 minutes. Too long to deny write access!
The time doesn't affect me too much, but it does affect my teammates in New Zealand... and if I do the backup earlier I will affect the Canadians ;-) So fiddling with the backup hour won't solve anything. The backup has become a liability, but we cannot simply not do it.
So, please create a backup which won't lock the repository. Please.
comment:3 Changed 3 years ago by paul
[jira2trac import : comment created by idzelis on May 22, 2009 4:42:04 PM CEST]
Our backups were running about 3-5 hours at this point. We've come up with a pretty nice strategy for backups.
This will only work on Linux (or Linux-like) machines. Our Daisy installation lives on an LVM-managed partition. To "back up" the database, we use the Daisy API to lock the database, then we create an LVM snapshot of the disk. This "freezes" the state of the disk at that point in time. Then we back up the database and use the Daisy API to unlock it. Even with our 20+ GB disk image, we only need to lock the Daisy repository for writes for about 30 seconds! Backing up the LVM snapshot still takes 3-5 hours, but at least Daisy is open for business during that time. After the backup is done, the LVM snapshot is destroyed and the partition is "normal" again.
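The sequence described above can be sketched as a small Python helper that only builds the list of commands, without running anything. This is a minimal sketch under assumptions, not the commenter's actual script: the volume path, snapshot name, snapshot size, mount point and backup target are all hypothetical, and the Daisy lock and unlock steps appear only as comment lines because the comment does not name the API calls used. It follows one reading of the comment, in which the lock is held only while the snapshot is created and the long copy runs from the snapshot afterwards; the short "back up the database" step mentioned while locked is not modeled.

```python
import shlex
from typing import List


def snapshot_backup_plan(
    volume: str = "/dev/vg0/daisy",        # hypothetical logical volume holding Daisy
    snapshot_name: str = "daisy-snap",     # hypothetical snapshot name
    snapshot_size: str = "5G",             # copy-on-write space reserved for the snapshot
    mount_point: str = "/mnt/daisy-snap",  # hypothetical mount point
    backup_target: str = "/backup/daisy",  # hypothetical backup destination
) -> List[str]:
    """Return the ordered shell steps; nothing is executed here."""
    q = shlex.quote
    # The snapshot device appears next to the origin volume in the same volume group.
    snapshot_dev = volume.rsplit("/", 1)[0] + "/" + snapshot_name
    return [
        "# lock the Daisy repository for writes (site-specific, not shown)",
        f"lvcreate --snapshot --size {q(snapshot_size)} "
        f"--name {q(snapshot_name)} {q(volume)}",
        "# unlock the Daisy repository (site-specific, not shown)",
        # From here on the repository accepts writes again.
        f"mount -o ro {q(snapshot_dev)} {q(mount_point)}",
        f"rsync -a {q(mount_point)}/ {q(backup_target)}/",
        f"umount {q(mount_point)}",
        f"lvremove -f {q(snapshot_dev)}",
    ]


if __name__ == "__main__":
    print("\n".join(snapshot_backup_plan()))
```

The snapshot size only needs to hold the blocks that change on the origin volume while the copy runs, so it can be much smaller than the volume itself. If it fills up, the snapshot becomes invalid and the backup has to be restarted.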
To perform incremental backups, we've created an rsync script that hard-links the previous 14 days of data. (Since they are hard links, each unchanged file is stored only once but appears in every daily backup directory.) This allows us to roll the database back to any state it was in during the previous 14 days. The backups are stored on a separate NFS (or SMB?) mounted drive.
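One common way to get hard-linked daily directories is rsync's `--link-dest` option, which hard-links files that are unchanged relative to a reference directory instead of copying them. The comment does not say which rsync options the script uses, so the following is only a sketch of that technique. The directory layout (one directory per ISO date under a backup root) and the function names are assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import List


def rsync_command(source: str, backup_root: str, today: date,
                  have_previous: bool = True) -> List[str]:
    """Build the rsync argv for today's backup directory (not executed here)."""
    cmd = ["rsync", "-a"]
    if have_previous:
        yesterday = today - timedelta(days=1)
        # Unchanged files are hard-linked against yesterday's copy.
        # An absolute path avoids rsync resolving it relative to the destination.
        cmd.append(f"--link-dest={backup_root}/{yesterday.isoformat()}")
    cmd += [f"{source}/", f"{backup_root}/{today.isoformat()}/"]
    return cmd


def expired(existing: List[str], today: date, keep_days: int = 14) -> List[str]:
    """Return the daily directory names that fall outside the retention window."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(d for d in existing if date.fromisoformat(d) < cutoff)


if __name__ == "__main__":
    print(" ".join(rsync_command("/mnt/daisy-snap", "/backup/daisy", date.today())))
```

Deleting an expired directory only frees the space of files that no newer directory still links to, which is what makes the 14-day window cheap to keep.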
comment:4 Changed 3 years ago by paul
[jira2trac import : comment created by karel on March 10, 2010 5:12:09 PM CET]
Another company using Daisy reported to us that they were using a backup lock + a proprietary system (HP EVA) for backups.
Another solution (similar to Min Idzelis' solution) would be to have two systems: a 'master' system and a 'backup' system. The backup system would receive data using MySQL replication, plus an rsync cron job or something similar for the blobstore and indexstore. Taking a backup would require these steps:
- take a backup lock on the master
- wait until you are sure MySQL slave replication is not lagging (maatkit tools can help you here)
- wait for a last rsync of the blobstore to complete
- stop replication (and rsync)
- unlock the master
At this point you can continue editing on the master repository, and you can take as much time as needed to do the complete backup on the backup system. When you are done, resume MySQL replication and the rsync of the blobstore and indexstore.
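The steps above can be sketched as a small orchestration function. It is a sketch under assumptions rather than a tested procedure: every operation is passed in as a callable because the comment does not specify how the lock, the lag check, or the rsync are invoked. The lag callable could, for instance, wrap the `Seconds_Behind_Master` column of MySQL's `SHOW SLAVE STATUS`, and the stop callable could wrap `STOP SLAVE`; all function and parameter names here are hypothetical.

```python
import time
from typing import Callable, Optional


def freeze_replica(
    lock_master: Callable[[], None],
    unlock_master: Callable[[], None],
    replica_lag_seconds: Callable[[], Optional[int]],  # None if the lag is unknown
    run_final_rsync: Callable[[], None],
    stop_replication: Callable[[], None],
    poll_interval: float = 1.0,
    max_wait: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Bring the backup system to a consistent state, then release the master."""
    lock_master()
    try:
        waited = 0.0
        # Wait until the replica has applied everything written before the lock.
        while replica_lag_seconds() != 0:
            if waited >= max_wait:
                raise TimeoutError("replica did not catch up; backup aborted")
            sleep(poll_interval)
            waited += poll_interval
        run_final_rsync()    # last sync of the file stores
        stop_replication()   # also the point to disable the rsync cron job
    finally:
        # The master is unlocked even if a step fails, so writers are never stuck.
        unlock_master()
```

After this returns, the long-running backup can be taken from the backup system, and replication and the rsync job are resumed once it finishes.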
[jira2trac import : comment created by bruno on June 18, 2007 9:12:45 AM CEST]
This issue could be solved by a more intelligent lock mode for the blobstore, exploiting the fact that the blobstore never updates blobs: it only adds new blobs and sometimes removes them.
From the point of view of the backup, the important thing is that no data is lost:
The main problem is how to keep track of this queue:
Note about the blobstore-cleanup tool: care should be taken that, if the repository server is running, it doesn't remove blobs for documents that are just being added (= non-committed db transactions). This could be solved, for example, by only considering blobs that are older than, say, one day.
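The age rule suggested above can be illustrated with a short Python sketch. It is not the blobstore-cleanup tool's actual logic: it assumes, purely for illustration, that each blob is a file whose name is its key, that the set of keys still referenced by the database is supplied by the caller, and that file modification time is a good enough proxy for when the blob was added.

```python
import os
import time
from typing import List, Optional, Set


def removable_blobs(
    blob_dir: str,
    referenced: Set[str],                 # blob keys the database still refers to
    min_age_seconds: float = 24 * 3600,   # "older than one day"
    now: Optional[float] = None,
) -> List[str]:
    """List unreferenced blob files that are old enough to be removed safely."""
    now = time.time() if now is None else now
    candidates = []
    for root, _dirs, files in os.walk(blob_dir):
        for name in files:
            if name in referenced:
                continue
            path = os.path.join(root, name)
            # A young unreferenced blob may belong to a transaction that has
            # not committed yet, so it is left alone.
            if now - os.path.getmtime(path) >= min_age_seconds:
                candidates.append(path)
    return sorted(candidates)
```

The function only reports candidates; deleting them is left to the caller, which keeps a dry run trivial.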