Lvm plus rsync #314
soulen3 (24 Sep 2020, 18:22):
This could be worked into the mysql-lvm module. Here's a list of things we would need to consider:
- Holland doesn't really have a concept of incremental backups, so you would need to put the backup somewhere besides /var/spool/holland/ (a new config option, like you mentioned).
- This will mess with purging failed backups.
- Using rsync in this way means the plugin would only have one active copy of the data. By default Holland will complete a backup before purging the old one.
- I'm concerned that a corrupt backup directory would look similar to a successful one.
None of these things are deal breakers, but they should be considered. I'm assuming your goal is to reduce backup time, is that correct? Have you tested how long it takes for rsync to run after the data has been seeded? If it only takes 10 minutes to copy all the data using tar, you're really not going to save that much time, and it will be CPU intensive.
Hi,
Thanks for your fast response; I agree with your points.
Perhaps another way would be to implement a post-tar command, i.e. a command that runs after the tar (and, for the LVM MySQL-dump plugin, after the export) but before the snapshot is shut down. That way all the existing plugin logic would remain.
FYI, I would put the synced copy in /var/spool/holland/var/lib/mysql (sorry, my iPhone keeps capitalising things). That way I have easy access to all the databases and tables within.
I've not looked at the source code yet. What is the best way to have a play with this?
Thanks in advance,
Dave
mikegriffin (24 Sep 2020, 21:37):
Hello,
I am curious if you could clarify, as I am not sure I understood this:
"effectively not tar backs up 122 Gig in 10 minutes with tar compression takes an hour"
Do you mean to say that your backup takes 10 minutes with:
archive-method=dir
But that it takes an hour with both of these defaults:
archive-method=tar
[compression]
method = gzip
options = ""
inline = yes
split = no
level = 1
Using archive-method=tar with default gzip compression can be quite slow, which is why Holland supports pigz or zstd. If the data in your "large blobs" is binary data, it is effectively already "compressed" data and you would not want to use a compression method besides none.
My question is whether you mean, as you said, that the backup is slow due to tar, or whether you meant that the backup is fast when compression is off, with either the tar or dir archive-method.
Respectfully,
Mike
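(Editor's note: as a point of reference, a minimal backupset sketch combining the two settings discussed above might look like the following. It assumes the LVM settings live in a [mysql-lvm] section, and whether uncompressed output is spelled method = none or level = 0 may depend on your Holland version, so treat this as illustrative rather than exact syntax.)
[mysql-lvm]
# copy the datadir instead of archiving it into a single tar file
archive-method = dir
[compression]
# already-compressed blob data gains little from gzip, so skip compression
# (assumption: 'none' is accepted here; otherwise level = 0 with gzip)
method = none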
Hi,
Sorry, I think a typo made it unclear. Both timings are with tar on 122 GB:
- tar with default compression: over an hour
- tar with zero compression: 12 minutes
I was not aware of the dir method; it looks like a directory copy.
Thanks
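(Editor's note: a rough back-of-the-envelope check of those figures, assuming the 122 GB size and times quoted: 122 GB in about 12 minutes is roughly 170 MB/s, plausible for a straight disk-to-disk copy, while 122 GB in 60+ minutes is under about 35 MB/s, which is consistent with single-threaded gzip, rather than the disk, being the bottleneck.)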
soulen3 (24 Sep 2020, 21:55):
Being honest, I forgot archive-method was an option. Does that solve your issue? I'm still not sure I understand what you're trying to accomplish. I'm assuming you're trying to reduce the amount of time you need the snapshot available. Is that correct?
mikegriffin (24 Sep 2020, 22:00):
Please let us know if (and maybe even how) the dir method is useful for you.
Adding a new method like a hypothetical rsync-partial is a heavy hammer, and we try to avoid confusing configuration options or footguns (understanding these implications is a very niche concept for data integrity).
I would not think that dir is much faster than tar when the compression method is none; it is mostly useful in cases where you want some other process (external to Holland) to avoid copying a giant single file, and where the split option doesn't quite solve your dilemma. If I am honest, a ten-minute backup sounds pretty good, and I would not think that an rsync in default mode (without -P --append) would be faster.
Hi Soulen,
I think a ten-minute backup is brilliant, so I'm really happy with that. My additional aim would be to have the whole directory structure available without having to untar. Also, I come from the time when 10 MB disks were a luxury, so I am always looking to minimise the data moving about.
At the moment my backups take ten minutes or so to a separate disk on the same box, copying about 122 GB. This is then copied to a remote server, which takes about 50 minutes. An rsync would reduce the in-box copy to a few GB, and similarly for the remote copy.
The non-blocking LVM is brilliant, thanks everyone. This is now more of a mental exercise for me to understand Holland, and hopefully to provide more options and functionality for all.
Dave
Hi Mike,
I am not aware of what the dir method is. Where is it configured? I have not yet found it in the docs; I'm guessing it's a tar per directory or similar.
For my databases (total size 122 GB) the daily changes are, I would say, less than 1 GB. In real data terms, i.e. in actual updated files, I would need to do some further investigation.
It seemed a good idea to do this from the LVM snapshot before it is dropped. Perhaps, as I mentioned earlier in the thread, a post-tar but pre-snapshot-drop command would be suitable.
Thanks for all your responses,
Dave
soulen3 (24 Sep 2020, 22:28):
https://docs.hollandbackup.org/docs/provider_configs/mysql-lvm.html
Last option under mysql-lvm:
archive-method = tar | dir (default: tar)
Create a tar file of the datadir, or just copy it.
After the snapshot is complete you can run a command using after-backup-command:
https://docs.hollandbackup.org/docs/config.html#backup-set-configs
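(Editor's note: a minimal sketch of that option in a backupset config, assuming it sits in the [holland:backup] section described at the linked config page; the script path is hypothetical.)
[holland:backup]
plugin = mysql-lvm
# hypothetical hook script: check disk space, send mail, copy offsite, etc.
# per the discussion below, the LVM snapshot has already been released by the time this runs
after-backup-command = /usr/local/bin/post-backup-check.sh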
Hi Soulen,
Brilliant, thank you re the tar/dir option.
Re the after-backup-command: I have this in use at the moment and run scripts that check disk space and send out mail. It works brilliantly, as do the before and failure ones. Unfortunately, as I understand it, the snapshot has gone at that point, so my rsync idea would not work nicely; hence I would add an extra command, before-snapshot-drop or similar.
Thanks,
Dave
mikegriffin (25 Sep 2020, 06:39):
The problems with an rsync command that writes outside of the backupdir are many (for example, checking disk space is no longer viable, and options that avoid expensive checksums of those files are risky), and there has been a goal, when you have a snapshot open, to close it as quickly as possible because of the "negative scalability" of read performance as the snapshot size grows. Allowing hooks while the snapshot is open is obviously possible, but perhaps not something that should be encouraged.
Did archive-method=dir generally solve your issue (not wanting to untar the resulting backup during restore) with acceptable performance during the backup?
If you really do think you have some large files in the MySQL data dir that are 100% never written to, I would encourage you to test the rsync performance outside of Holland using two backups, if you have the space. Something like:
- Set the Holland LVM plugin to use dir copy instead of tar.
- Copy a backup out of your Holland backupdir when it is complete, to some other location on the same mount point, or one with similar performance.
- Wait a couple of days and, when Holland is not running, do an rsync from the newest backup dir to the one you preserved and see if it is significantly faster (I think this is unlikely).
If you don't have space for three backups, or whatever such a test requires, make sure that you are reading the freshest backup from the same mount point when you do the rsync test outside of Holland, so that you have the same read pressure.
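(Editor's note: a sketch of that test as shell commands, assuming typical paths such as /var/spool/holland/mysql-lvm and a scratch location on the same mount point; the paths are illustrative, not taken from the thread.)
# step 1: in the backupset config, set archive-method = dir (see above)
# step 2: after a backup completes, preserve a copy on the same mount point
BACKUPDIR=/var/spool/holland/mysql-lvm        # illustrative path
NEWEST=$(ls -1d "$BACKUPDIR"/2* | sort | tail -n 1)
cp --archive "$NEWEST/backup_data" /var/spool/holland/rsync-test
# step 3: a couple of days later, while Holland is not running,
# time an rsync of the newest backup onto the preserved copy
NEWEST=$(ls -1d "$BACKUPDIR"/2* | sort | tail -n 1)
time rsync -a --delete "$NEWEST/backup_data/" /var/spool/holland/rsync-test/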
By the way, nothing is stopping you now from rsyncing the dir-format backup to your remote server yourself. If many or large data files haven't changed, then you should see a significant speed-up there from the current 50 minutes you measure. Whether Holland proper uses rsync or not has no impact on your external copy (unless you meant that you wouldn't store a local copy at all).
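(Editor's note: a sketch of that external copy; the remote host and destination path are hypothetical, and the backup directory name is the one that appears in the log below. rsync's default quick check compares size and modification time, so unchanged data files are not re-transferred.)
rsync -a --delete /var/spool/holland/mysql-lvm/20200925_070602/backup_data/ backuphost:/backups/mysql/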
Hi Mike,
Thanks for your feedback; I understand the issues and your reservations.
I have just tried the dir method, which ran in about the same time as the tar backup:
Backup completed in 11 minutes, 11.28 seconds
(command-line output below). I will investigate the speed-up of the rsync utility too.
One question: the log is stating
Unknown parameter 'defaults-file' in section 'mysql:client'
Pete Caseta (from Rackspace) and I could not find where this is being set. Could you point me in the right direction?
Dave
Holland 1.1.21 started with pid 14045
--- Starting backup run ---
Creating backup path /var/spool/holland/mysql-lvm/20200925_070602
Unknown parameter 'defaults-file' in section 'mysql:client'
No backups purged
Estimated Backup Size: 121.77GB
Starting backup[mysql-lvm/20200925_070602] via plugin mysql-lvm
Backing up /data00/var/lib/mysql via snapshot
Auto-sizing snapshot-size to 15.00GB (3840 extents)
Acquiring read-lock and flushing tables
Recorded binlog = m1-mysql-bin.000049 position = 500876761
Recorded slave replication status: master_binlog = mysql-bin.001327 master_position = 193160447
Created snapshot volume /dev/vglocal01/data00_snapshot
Releasing read-lock
xfs filesystem detected on /dev/vglocal01/data00_snapshot. Using mount -o nouuid
Mounted /dev/vglocal01/data00_snapshot on /tmp/tmpRv8ubA
Starting InnoDB recovery
Bootstrapping with /usr/libexec/mysqld
Starting /usr/libexec/mysqld --defaults-file=/tmp/tmpRv8ubA/var/lib/mysql/my.innodb_recovery.cnf --bootstrap
/usr/libexec/mysqld has stopped
/usr/libexec/mysqld ran successfully
Running: cp --archive /tmp/tmpRv8ubA/var/lib/mysql -t /var/spool/holland/mysql-lvm/20200925_070602/backup_data
Unmounted /dev/vglocal01/data00_snapshot
Final LVM snapshot size for /dev/vglocal01/data00_snapshot is 44.54MB during pre-remove
Removed snapshot /dev/vglocal01/data00_snapshot
Removing temporary mountpoint /tmp/tmpRv8ubA
Final on-disk backup size 121.77GB
100.00% of estimated size 121.77GB
Backup completed in 11 minutes, 11.28 seconds
Released lock /etc/holland/backupsets/mysql-lvm.conf
--- Ending backup run ---
Looks like there's a 'defaults-file' option being defined in the 'mysql:client' section of your backupset configuration file. 'defaults-file' isn't a valid option for that section of the config; it's looking for 'defaults-extra-file' if you're trying to add an extra MySQL options file. That warning shouldn't be causing any issues, though.
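(Editor's note: a sketch of the corrected section in the backupset config, assuming the intent was to point Holland at an additional MySQL options file; the file path is illustrative.)
[mysql:client]
# 'defaults-file' is not recognised in this section; use 'defaults-extra-file' instead
defaults-extra-file = /root/.my.cnf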
Original issue description (Dave):
Hi,
I am using LVM with tar at the moment. A lot of my databases are quite static and contain large blobs. Running tar with compression level 0 (effectively no compression), it backs up 122 GB in about 10 minutes; with tar's default compression it takes an hour.
Ideally I would like to create a new method, lvm+rsync:
- create snapshot
- InnoDB recovery
- rsync to a target directory (set in the config, I guess)
- finish and drop the snapshot
This could also be achieved with a command that runs before the snapshot is dropped, combined with excluding everything from the normal archive step. My first thought is to expand the LVM plugin.
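(Editor's note: a rough sketch of the proposed flow as shell commands run outside of Holland, using the volume group, snapshot name, mount options, and target directory that appear elsewhere in the thread and in the log above; the snapshot size, mount point, and the simplified InnoDB recovery step are illustrative.)
# create a snapshot of the data volume (names taken from the log above)
# note: Holland also takes a read lock and flushes tables around this step; omitted here
lvcreate --snapshot --size 15G --name data00_snapshot /dev/vglocal01/data00
# mount it; xfs snapshots need nouuid
mkdir -p /mnt/snapshot
mount -o nouuid /dev/vglocal01/data00_snapshot /mnt/snapshot
# (InnoDB recovery omitted; Holland does this by bootstrapping mysqld
# against the snapshot copy of the datadir, as shown in the log)
# rsync the datadir from the snapshot to the target directory suggested earlier in the thread
rsync -a --delete /mnt/snapshot/var/lib/mysql/ /var/spool/holland/var/lib/mysql/
# finish and drop the snapshot
umount /mnt/snapshot
lvremove -f /dev/vglocal01/data00_snapshot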