2 Node Cluster: Dual Primary DRBD + CLVM + KVM + Live Migrations
{{header}}

== Info ==
A brain dump...

=== OS Details ===
* ''OS'': CentOS 6.3
* ''Packages'' listed here are the base (others will be installed)
** pacemaker-1.1.7-6.el6.x86_64
** cman-3.0.12.1-32.el6_3.2.x86_64
** corosync-1.4.1-7.el6_3.1.x86_64
** drbd83-utils-8.3.15-1.el6.elrepo.x86_64
** libvirt-0.9.10-21.el6_3.8.x86_64

=== Hardware ===

=== LVM ===
; My naming conventions are not great (VG and LV named the same); this was a work in progress...
* PV -> VG -> LV -> DRBD PV -> VG (CLVM) -> LV [ raw KVM image ]
** PV: /dev/md10 -> VG: raid10 -> LV: drbd_spacewalk -> PV: /dev/drbd9 -> VG: drbd_spacewalk -> LV: spacewalk -- spacewalk-ha kvm
* '''node1'''
** PV: /dev/md10
** VG: raid10
* '''node2'''
** PV: /dev/sdb1
** VG: raid1

== Dual Primary DRBD/KVM Virt Install ==

=== New KVM Virt - Details ===
* '''NewVirt''': spacewalk
* '''SIZE''': 20GB
* '''DRBD res''': 8
* '''NODE1''' : IP: 10.69.1.253 : Name: bigeye : VG: raid1
* '''NODE2''' : IP: 10.69.1.250 : Name: blindpig : VG: raid10
* KVM disk cache setting: none

=== Creating the Dual Primary DRBD KVM Virt ===
;* Run these commands on NODE1

1) create the LVM backing device for DRBD
<source lang="bash">
lvcreate --name drbd_spacewalk --size 21.1GB raid1
ssh 10.69.1.250 -C lvcreate --name drbd_spacewalk --size 21.1GB raid10
</source>

2) copy spacewalk.res to /etc/drbd.d/
<source lang="bash">
cp spacewalk.res /etc/drbd.d/
scp spacewalk.res 10.69.1.250:/etc/drbd.d/
</source>

3) reload drbd
<source lang="bash">
/etc/init.d/drbd reload
ssh 10.69.1.250 -C /etc/init.d/drbd reload
</source>

4) create the DRBD metadata on both nodes
<source lang="bash">
drbdadm -- --force create-md spacewalk
ssh 10.69.1.250 -C drbdadm -- --force create-md spacewalk
</source>

5) reload drbd
<source lang="bash">
/etc/init.d/drbd reload
ssh 10.69.1.250 -C /etc/init.d/drbd reload
</source>

6) bring drbd up on both nodes
<source lang="bash">
drbdadm up spacewalk
ssh 10.69.1.250 -C drbdadm up spacewalk
</source>

7) set bigeye primary and overwrite blindpig
<source lang="bash">
drbdadm -- --overwrite-data-of-peer primary spacewalk
</source>

8) set blindpig secondary (should already be set)
<source lang="bash">
ssh 10.69.1.250 -C drbdadm secondary spacewalk
</source>

9) bigeye: create PV/VG/LV (not setting the VG cluster aware yet, due to an LVM bug when not using --monitor y)
<source lang="bash">
pvcreate /dev/drbd9
vgcreate -c n drbd_spacewalk /dev/drbd9
lvcreate -L20G -nspacewalk drbd_spacewalk
</source>

10) activate VG drbd_spacewalk (should already be active, but just in case)
<source lang="bash">
vgchange -a y drbd_spacewalk
</source>

11) create the pool in virsh
<source lang="bash">
virsh pool-create-as drbd_spacewalk --type=logical --target=/dev/drbd_spacewalk
</source>

12a) If this is a NEW kvm install - continue here - else go to step 12b
::1. Install the new virt on bigeye:/dev/drbd_spacewalk/spacewalk, named spacewalk-ha
::2. After it is installed and rebooted, scp the virt definition and define it
<source lang="bash">
scp /etc/libvirt/qemu/spacewalk-ha.xml 10.69.1.250:/etc/libvirt/qemu/spacewalk-ha.xml
ssh 10.69.1.250 -C virsh define /etc/libvirt/qemu/spacewalk-ha.xml
</source>
::3. Linux? Test virsh shutdown (you may need to install acpid)
<source lang="bash">
virsh shutdown spacewalk-ha
</source>
::4. SKIP step 12b (go to #13)

12b) If this is a migration from an existing KVM virt - continue, else skip this (ONLY if you completed 12a)
::1. restore your KVM/LVM image to the new LV
<source lang="bash">
dd if=<your image file.img> of=/dev/drbd_spacewalk/spacewalk bs=1M
</source>
::2. Edit the existing KVM xml file -- copy the existing file to edit
<source lang="bash">
cp /etc/libvirt/qemu/spacewalk.xml ./spacewalk-ha.xml
</source>
#-modify: <name>spacewalk</name> to <name>spacewalk-ha</name>
#-remove: <uuid>[some long uuid]</uuid>
<source lang="bash">
emacs spacewalk-ha.xml
cp spacewalk-ha.xml /etc/libvirt/qemu/spacewalk-ha.xml
# defining locally will set up a unique UUID, which is needed before you copy to blindpig
virsh define /etc/libvirt/qemu/spacewalk-ha.xml
scp /etc/libvirt/qemu/spacewalk-ha.xml 10.69.1.250:/etc/libvirt/qemu/spacewalk-ha.xml
ssh 10.69.1.250 -C virsh define /etc/libvirt/qemu/spacewalk-ha.xml
</source>

; All install work is done. Deactivate the VG, set it cluster aware, and down drbd for pacemaker provisioning

13) deactivate VG drbd_spacewalk on blindpig
<source lang="bash">
vgchange -a n drbd_spacewalk
</source>

14) set drbd primary on blindpig so the VG can be set cluster aware
<source lang="bash">
vgchange -a n drbd_spacewalk
ssh 10.69.1.250 -C drbdadm primary spacewalk
</source>

15) activate the VG on both nodes
<source lang="bash">
vgchange -a y drbd_spacewalk
ssh 10.69.1.250 -C vgchange -a y drbd_spacewalk
</source>

16) set the VG cluster aware on both nodes (only one command is needed due to drbd)
<source lang="bash">
vgchange -c y drbd_spacewalk
</source>

17) deactivate the VG
<source lang="bash">
vgchange -a n drbd_spacewalk
ssh 10.69.1.250 -C vgchange -a n drbd_spacewalk
</source>

18) down drbd on both nodes - so we can put it in pacemaker
<source lang="bash">
drbdadm down spacewalk
ssh 10.69.1.250 -C drbdadm down spacewalk
</source>

; Now let's provision Pacemaker -- this expects you already have a working pacemaker config with DLM/CLVM

19) Load the dual primary drbd/lvm RA config into the cluster
<source lang="bash">
crm configure < spacewalk.crm
</source>

20) verify all is good with crm_mon: DRBD should look something like below
<source lang="bash">
crm_mon -f
 Master/Slave Set: ms_drbd-spacewalk [p_drbd-spacewalk]
     Masters: [ bigeye blindpig ]
</source>

21) Load the VirtualDomain RA config into the cluster
<source lang="bash">
crm configure < spacewalk-vd.crm
</source>

; Files created
# spacewalk.res # for DRBD
# spacewalk.crm # DRBD/LVM configs to load into crm configure
# spacewalk-vd.crm # KVM VirtualDomain configs to load into crm configure

=== Config Examples ===

==== Pacemaker / crmsh ====

===== DRBD/LVM =====
* Note - we do not monitor LVM. Sometimes LVM commands hang and are not really an issue.
* these are all auto created by the script below
<pre>
primitive p_drbd-spacewalk ocf:linbit:drbd \
	params drbd_resource="spacewalk" \
	operations $id="p_drbd_spacewalk-operations" \
	op monitor interval="20" role="Slave" timeout="20" \
	op monitor interval="10" role="Master" timeout="20" \
	op start interval="0" timeout="240" \
	op stop interval="0" timeout="100" start-delay="0"
primitive p_lvm-spacewalk ocf:heartbeat:LVM \
	operations $id="spacewalk-LVM-operations" \
	op start interval="0" timeout="120" \
	op stop interval="0" timeout="120" \
	params volgrpname="drbd_spacewalk"
ms ms_drbd-spacewalk p_drbd-spacewalk \
	meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-spacewalk p_lvm-spacewalk \
	meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
colocation c_lvm-spacewalk_on_drbd-spacewalk inf: clone_lvm-spacewalk ms_drbd-spacewalk:Master
</pre>

===== KVM Virt - VirtualDomain =====
* these are all auto created by the script below
<pre>
primitive p_vd-spacewalk-ha ocf:heartbeat:VirtualDomain \
	params config="/etc/libvirt/qemu/spacewalk-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
	operations $id="p_vd-spacewalk-operations" \
	op start interval="0" timeout="90" \
	op stop interval="0" timeout="90" \
	op migrate_from interval="0" timeout="240" \
	op migrate_to interval="0" timeout="240" \
	op monitor interval="10" timeout="30" start-delay="0" \
	meta allow-migrate="true" failure-timeout="10min" target-role="Started"
colocation c_vd-spacewalk-on-master inf: p_vd-spacewalk-ha ms_drbd-spacewalk:Master
order o_drbm-lvm-vd-start-spacewalk inf: ms_drbd-spacewalk:promote clone_lvm-spacewalk:start p_vd-spacewalk-ha:start
</pre>

==== DRBD ====
* these are all auto created by the script below
<pre>
resource spacewalk {
  protocol C;
  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  disk {
    on-io-error detach;
    fencing resource-only;
  }
  handlers {
    #split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  syncer {
    rate 50M;
  }
  on bigeye {
    device /dev/drbd9;
    disk /dev/raid1/drbd_spacewalk;
    address 10.69.1.253:7799;
    meta-disk internal;
  }
  on blindpig {
    device /dev/drbd9;
    disk /dev/raid10/drbd_spacewalk;
    address 10.69.1.250:7799;
    meta-disk internal;
  }
}
</pre>

=== Script ===
* this will create the config/install steps above
<pre>
#cat create.new.sh
NAME=spacewalk    ## virt name
SIZE=20           ## virt size GB
LVMETA=lvmeta     ## volume group on VG stated above for metadata
DRBDNUM=8         ## how many drbds do you have right now?
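## (hypothetical helper, not in the original script) the generator below
## derives the drbd minor as DRBDNUM+1 and the listener port as 7790+minor;
## echoing the derived values first makes device/port clashes easy to spot:
drbd_minor_port() {
  local minor=$(( $1 + 1 ))
  echo "/dev/drbd${minor} port $(( 7790 + minor ))"
}
## e.g. drbd_minor_port 8 prints "/dev/drbd9 port 7799", matching spacewalk.res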
NODE1_VG=raid1    ## VolumeGroup for DRBD lvm
NODE2_VG=raid10   ## VolumeGroup for DRBD lvm
NODE1_IP=10.69.1.253
NODE2_IP=10.69.1.250
NODE1_NAME=bigeye
NODE2_NAME=blindpig
#NODE3_NAME=blindpig2

############ DO NOT EDIT BELOW #######################
NODE2=$NODE2_IP
DRBD_SIZE=$SIZE
let DRBD_SIZE+=1
let DRBDNUM+=1
#let DRBDNUM+=1
let PORT=7790+DRBDNUM

echo '
resource '$NAME' {
  protocol C;
  startup {
    become-primary-on both;
  }
  net {
    allow-two-primaries;
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
  }
  disk {
    on-io-error detach;
    fencing resource-only;
  }
  handlers {
    #split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  syncer {
    rate 50M;
  }
  on '$NODE1_NAME' {
    device /dev/drbd'$DRBDNUM';
    disk /dev/'$NODE1_VG'/drbd_'$NAME';
    address '$NODE1_IP':'$PORT';
    meta-disk internal;
  }
  on '$NODE2_NAME' {
    device /dev/drbd'$DRBDNUM';
    disk /dev/'$NODE2_VG'/drbd_'$NAME';
    address '$NODE2_IP':'$PORT';
    meta-disk internal;
  }
}
' > $NAME.res

echo 'primitive p_drbd-'$NAME' ocf:linbit:drbd \
	params drbd_resource="'$NAME'" \
	operations $id="p_drbd_'$NAME'-operations" \
	op monitor interval="20" role="Slave" timeout="20" \
	op monitor interval="10" role="Master" timeout="20" \
	op start interval="0" timeout="240" \
	op stop interval="0" timeout="100" start-delay="0"
primitive p_lvm-'$NAME' ocf:heartbeat:LVM \
	operations $id="'$NAME'-LVM-operations" \
	op start interval="0" timeout="120" \
	op stop interval="0" timeout="120" \
	params volgrpname="drbd_'$NAME'"
ms ms_drbd-'$NAME' p_drbd-'$NAME' \
	meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-'$NAME' p_lvm-'$NAME' \
	meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
colocation c_lvm-'$NAME'_on_drbd-'$NAME' inf: clone_lvm-'$NAME' ms_drbd-'$NAME':Master
' > $NAME'.crm'
#location drbd_'$NAME'_excl ms_drbd-'$NAME' \
#	rule $id="drbd_'$NAME'_excl-rule" -inf: #uname eq '$NODE3_NAME'

echo 'primitive p_vd-'$NAME'-ha ocf:heartbeat:VirtualDomain \
	params config="/etc/libvirt/qemu/'$NAME'-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
	operations $id="p_vd-'$NAME'-operations" \
	op start interval="0" timeout="90" \
	op stop interval="0" timeout="90" \
	op migrate_from interval="0" timeout="240" \
	op migrate_to interval="0" timeout="240" \
	op monitor interval="10" timeout="30" start-delay="0" \
	meta allow-migrate="true" failure-timeout="10min" target-role="Started"
colocation c_vd-'$NAME'-on-master inf: p_vd-'$NAME'-ha ms_drbd-'$NAME':Master
order o_drbm-lvm-vd-start-'$NAME' inf: ms_drbd-'$NAME':promote clone_lvm-'$NAME':start p_vd-'$NAME'-ha:start
' > $NAME'-vd.crm'

## test the DRBD config before printing the install steps
cmd="drbdadm dump -t $NAME.res"
$cmd >/dev/null
rc=$?
if [[ $rc != 0 ]] ; then
  echo -e "\n !!! DRBD config ("$NAME.res") file will not work.. need to fix this first. exiting...\n"
  echo -e "     check command: "$cmd"\n";
  echo -e "\n * HINT: you might just need to remove the file /etc/drbd.d/"$NAME.res" [be careful]";
  echo -e "   mv /etc/drbd.d/"$NAME.res" ./$NAME.res.disabled."$NODE1_NAME
  echo -e "   scp "$NODE2":/etc/drbd.d/"$NAME.res" ./$NAME.res.disabled."$NODE2_NAME
  echo -e "   ssh "$NODE2" -C mv /etc/drbd.d/"$NAME.res" /tmp/$NAME.res.disabled"
  # exit $rc
fi
echo -e " * DRBD config verified (it should work)\n"
echo ' '
echo -e '\n# 1) create LVM for DRBD device'
echo ' 'lvcreate --name drbd_$NAME --size $DRBD_SIZE'.1GB' $NODE1_VG
echo ' 'ssh $NODE2 -C lvcreate --name drbd_$NAME --size $DRBD_SIZE'.1GB' $NODE2_VG
echo -e '\n# 2) copy '$NAME'.res to /etc/drbd.d/'
echo ' 'cp $NAME.res /etc/drbd.d/
echo ' 'scp $NAME.res $NODE2:/etc/drbd.d/
echo -e '\n# 3) reloading drbd'
echo ' '/etc/init.d/drbd reload
echo ' 'ssh $NODE2 -C /etc/init.d/drbd reload
echo -e '\n# 4) create DRBD device on both nodes'
echo ' 'drbdadm -- --force create-md $NAME
echo ' 'ssh $NODE2 -C drbdadm -- --force create-md $NAME
echo -e '\n# 5) reloading drbd'
echo ' '/etc/init.d/drbd reload
echo ' 'ssh $NODE2 -C /etc/init.d/drbd reload
echo -e '\n# 6) bring drbd up on both nodes'
echo ' 'drbdadm up $NAME
echo ' 'ssh $NODE2 -C drbdadm up $NAME
echo -e '\n# 7) set '$NODE1_NAME' primary and overwrite '$NODE2_NAME
echo ' 'drbdadm -- --overwrite-data-of-peer primary $NAME
echo -e '\n# 8) set '$NODE2_NAME' secondary (should already be set)'
echo ' 'ssh $NODE2 -C drbdadm secondary $NAME
echo -e '\n# 9) '$NODE1_NAME' create PV/VG/LV (not setting VG cluster aware yet due to LVM bug when not using --monitor y)'
echo ' 'pvcreate /dev/drbd$DRBDNUM
echo ' 'vgcreate -c n drbd_$NAME /dev/drbd$DRBDNUM
echo ' 'lvcreate -L$SIZE'G' -n$NAME drbd_$NAME
echo -e '\n# 10) Activating VG drbd_'$NAME' -- (should already be active, but just in case)'
echo ' 'vgchange -a y drbd_$NAME
## ubuntu bug -- enable if ubuntu host
#echo ' 'vgchange -a y drbd_$NAME --monitor y
echo -e '\n# 11) create the POOL in virsh'
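## (hypothetical helper, not in the original script) nearly every generated
## step is the same pair: run a command locally, then run it on the peer via
## ssh; the pair of echoed lines for any step could be produced with:
both() { echo " $*"; echo " ssh $NODE2 -C $*"; }
## e.g. both drbdadm up $NAME would print the local line and the ssh line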
echo ' 'virsh pool-create-as drbd_$NAME --type=logical --target=/dev/drbd_$NAME
echo -e '\n# 12a) If this is a NEW kvm install - continue following - else go to step 12b'
echo '   + NOW install new virt from '$NODE1_NAME' on /dev/drbd_'$NAME'/'$NAME named $NAME'-ha'
echo '   # after installed and rebooted'
echo '   ' scp /etc/libvirt/qemu/$NAME'-ha.xml' $NODE2:/etc/libvirt/qemu/$NAME'-ha.xml'
echo '   ' ssh $NODE2 -C virsh define /etc/libvirt/qemu/$NAME'-ha.xml'
echo '   # test virsh shutdown -- install acpid'
echo '   ' virsh shutdown $NAME'-ha'
echo '   * SKIP 12b '
echo ' 12b) If this is a migration from an existing KVM virt - continue, else skip 2, you already completed step 1 right?'
echo '   ## restore your KVM/LVM to the new LV: of=/dev/drbd_'$NAME'/'$NAME' bs=1M'
echo '   command: dd if=<your image file.img> of=/dev/drbd_'$NAME'/'$NAME' bs=1M'
echo '   ## Edit the existing KVM xml file -- copy the existing file to edit'
echo '   ' cp /etc/libvirt/qemu/$NAME'.xml' ./$NAME'-ha.xml'
echo '   -modify: <name>'$NAME'</name> to <name>'$NAME'-ha</name>'
echo '   -remove: <uuid>[some long uuid]</uuid>'
echo '   ' emacs $NAME'-ha.xml'
echo '   ' cp $NAME'-ha.xml' /etc/libvirt/qemu/$NAME'-ha.xml'
echo '   #' this will set up a unique UUID, which is needed before you copy to $NODE2_NAME
echo '   ' virsh define /etc/libvirt/qemu/$NAME'-ha.xml'
echo '   ' scp /etc/libvirt/qemu/$NAME'-ha.xml' $NODE2:/etc/libvirt/qemu/$NAME'-ha.xml'
echo '   ' ssh $NODE2 -C virsh define /etc/libvirt/qemu/$NAME'-ha.xml'
echo -e '\n#'
echo '# All install work is done. deactivate VG / set cluster aware / and down drbd for pacemaker provisioning'
echo -e "#\n"
echo -e '\n# 13) deactivate VG drbd_'$NAME' on '$NODE2_NAME
## ubuntu bug -- enable if ubuntu host
#echo ' 'vgchange -a n drbd_$NAME --monitor y
echo ' 'vgchange -a n drbd_$NAME
echo -e '\n# 14) set drbd primary on '$NODE2_NAME' to set VG cluster aware'
## ubuntu bug -- enable if ubuntu host
#echo ' 'vgchange -a n drbd_$NAME --monitor y
echo ' 'vgchange -a n drbd_$NAME
echo ' 'ssh $NODE2 -C drbdadm primary $NAME
echo -e '\n# 15) activate VG on both nodes'
## ubuntu bug -- enable if ubuntu host
#echo ' 'vgchange -a y drbd_$NAME --monitor y
#echo ' 'ssh $NODE2 -C vgchange -a y drbd_$NAME --monitor y
echo ' 'vgchange -a y drbd_$NAME
echo ' 'ssh $NODE2 -C vgchange -a y drbd_$NAME
echo -e '\n# 16) set VG cluster aware on both nodes (only one command is needed due to drbd)'
echo ' 'vgchange -c y drbd_$NAME
echo -e '\n# 17) deactivate VG'
## ubuntu bug -- enable if ubuntu host
#echo ' 'vgchange -a n drbd_$NAME --monitor y
#echo ' 'ssh $NODE2 -C vgchange -a n drbd_$NAME --monitor y
echo ' 'vgchange -a n drbd_$NAME
echo ' 'ssh $NODE2 -C vgchange -a n drbd_$NAME
echo -e '\n# 18) down drbd on both - so we can put it in pacemaker'
echo ' 'drbdadm down $NAME
echo ' 'ssh $NODE2 -C drbdadm down $NAME
echo -e '\n# MAKE sure the disk cache for the virtio disk is set to NONE - live migrate will fail if not'
echo -e '\n#'
echo '# Now lets provision Pacemaker -- we already expect you have a working pacemaker config with DLM/CLVM'
echo -e "#\n"
echo -e '\n# 19) Load the dual primary drbd/lvm RA config to the cluster'
echo '   crm configure < '$NAME'.crm'
echo -e '\n# 20) verify all is good with crm_mon: DRBD should look like something below'
echo -e "   crm_mon -f\n"
echo '   Master/Slave Set: ms_drbd-'$NAME' [p_drbd-'$NAME']'
echo -e '       Masters: [ '$NODE1_NAME' '$NODE2_NAME" ]\n"
echo -e '\n# 21) Load the VirtualDomain RA config to the cluster'
echo '   crm configure < '$NAME'-vd.crm'
echo '#####################################################################'
echo '# Files Created'
echo '# '$NAME'.res    # for DRBD'
echo '# '$NAME'.crm    # DRBD/LVM configs to load into crm configure'
echo '# '$NAME'-vd.crm # KVM VirtualDomain configs to load into crm configure'
</pre>

==== notes ====
* running the script will test the DRBD resource and at least print a warning
<pre>
 !!! DRBD config (spacewalk.res) file will not work.. need to fix this first. exiting...

     check command: drbdadm dump -t spacewalk.res

 * HINT: you might just need to remove the file /etc/drbd.d/spacewalk.res [be careful]
   mv /etc/drbd.d/spacewalk.res ./spacewalk.res.disabled.bigeye
   scp 10.69.1.250:/etc/drbd.d/spacewalk.res ./spacewalk.res.disabled.blindpig
   ssh 10.69.1.250 -C mv /etc/drbd.d/spacewalk.res /tmp/spacewalk.res.disabled
</pre>

== Backups ==

=== Pacemaker ===
 crm configure save /path/to/file.bak

=== DRBD & CLVM ===
; We have the option to snapshot both the DRBD backing device and the KVM virt LV
; Major issues with backups:
1) LVM snapshots of the DRBD backing device hang (when primary)
:* workaround: set the DRBD device secondary / snapshot+backup the DRBD LV / set DRBD primary again
2) CLVM does not allow snapshots
:* workaround: set the DRBD device secondary / remove the VG cluster bit / snapshot+backup the CLVM LV / set the VG cluster bit / set DRBD primary again

==== DRBD backing Device ====
* you will have to edit some variables for this to work properly
* It will also set DRBD and other resources to unmanaged mode, so pacemaker will not potentially fence on failures
* This is a heavily modified version of ''http://repo.firewall-services.com/misc/virt/virt-backup.pl'' (other options like cleanup do not work)
;Usage: ./virt-backup-drbd_backdevice.pl vm=<virt_name> [--compress]
; virt-backup-drbd_backdevice.pl
<pre>
#!/usr/bin/perl -w

# vm == drbd

use XML::Simple;
use Sys::Virt;
use Getopt::Long;

# Set umask
umask(022);

# Some constants
my $drbd_dir = '/etc/drbd.d/';
our %opts = ();
our @vms = ();
our @excludes = ();
our @disks = ();
our $drbd_dev;
my $migrate_to = 'bigeye';     ## host to migrate machines to if they are running locally
my $migrate_from = 'blindpig'; ## ht

# Set some default values
my $host = `hostname`;
chomp($host);
my $migration = 0; # placeholder

# What to run. The default action is to dump
$opts{action} = 'dump';

# Where backups will be stored. This directory must already exist
$opts{backupdir} = '/NFS/_local_/_backups/DRBD/';

# Size of LVM snapshots (which will be used to back up the VM with minimum downtime
# if the VM stores data directly on a LV)
$opts{snapsize} = '5G';

# Debug
$opts{debug} = 1;
$opts{snapshot} = 1;
$opts{compress} = 'none';
$opts{lvcreate} = '/sbin/lvcreate -c 512';
$opts{lvremove} = '/sbin/lvremove';
$opts{blocksize} = '262144';
$opts{nice} = 'nice -n 19';
$opts{ionice} = 'ionice -c 2 -n 7';
$opts{livebackup} = 1;
$opts{wasrunning} = 1;

# get command line arguments
GetOptions( "debug" => \$opts{debug}, "keep-lock" => \$opts{keeplock}, "state" => \$opts{state},
    "snapsize=s" => \$opts{snapsize}, "backupdir=s" => \$opts{backupdir}, "vm=s" => \@vms,
    "action=s" => \$opts{action}, "cleanup" => \$opts{cleanup}, "dump" => \$opts{dump},
    "unlock" => \$opts{unlock}, "connect=s" => \$opts{connect}, "snapshot!" => \$opts{snapshot},
    "compress:s" => \$opts{compress}, "exclude=s" => \@excludes, "blocksize=s" => \$opts{blocksize},
    "help" => \$opts{help} );

# Set compression settings
if ($opts{compress} eq 'lzop'){ $opts{compext} = ".lzo"; $opts{compcmd} = "lzop -c"; }
elsif ($opts{compress} eq 'bzip2'){ $opts{compext} = ".bz2"; $opts{compcmd} = "bzip2 -c"; }
elsif ($opts{compress} eq 'pbzip2'){ $opts{compext} = ".bz2"; $opts{compcmd} = "pbzip2 -c"; }
elsif ($opts{compress} eq 'xz'){ $opts{compext} = ".xz"; $opts{compcmd} = "xz -c"; }
elsif ($opts{compress} eq 'lzip'){ $opts{compext} = ".lz"; $opts{compcmd} = "lzip -c"; }
elsif ($opts{compress} eq 'plzip'){ $opts{compext} = ".lz"; $opts{compcmd} = "plzip -c"; }
# Default is gzip
elsif (($opts{compress} eq 'gzip') || ($opts{compress} eq '')) {
    $opts{compext} = ".gz"; $opts{compcmd} = "gzip -c";
}
else{ $opts{compext} = ""; $opts{compcmd} = "cat"; }

# Allow comma separated multi-argument
@vms = split(/,/,join(',',@vms));
@excludes = split(/,/,join(',',@excludes));

# Backward compatible with --dump --cleanup --unlock
$opts{action} = 'dump' if ($opts{dump});
$opts{action} = 'cleanup' if ($opts{cleanup});
$opts{action} = 'unlock' if ($opts{unlock});

# Stop here if we have no vm
# Or the help flag is present
if ((!@vms) || ($opts{help})){
    usage();
    exit 1;
}
if (! -d $opts{backupdir} ){
    print "$opts{backupdir} is not a valid directory\n";
    exit 1;
}

print "\n" if ($opts{debug});

foreach our $vm (@vms){
    print "Checking $vm status\n\n" if ($opts{debug});
    our $backupdir = $opts{backupdir}.'/'.$vm;
    if ($opts{action} eq 'cleanup'){
        print "Running cleanup routine for $vm\n\n" if ($opts{debug});
        # run_cleanup();
    }
    elsif ($opts{action} eq 'dump'){
        print "Running dump routine for $vm\n\n" if ($opts{debug});
        run_dump();
    }
    # else {
    #     usage();
    #     exit 1;
    # }
}

############################################################################
##############              FUNCTIONS                   ####################
############################################################################

sub prepare_backup{
    my ($source,$res);
    $res = '';
    my $target = $vm;
    my $match=0;

    ## locate the backing device for this res
    ## (note: accumulate with .= so the whole dump is saved, not just the last line)
    my @drbd_res = &runcmd("drbdadm dump $vm");
    foreach my $line (@drbd_res) {
        $res .= $line;
        if ($match == 1 && $line =~ /disk\s+(.*);/) { $source = $1; $match = 0; }
        if ($line =~ /device\s+.*(drbd\d+)\s+minor/) { $drbd_dev = $1; }
        if ($line =~ /on\s$host\s+{/i) { $match = 1; }
    }
    if (!$source) {
        print "Did not find DRBD backing device for VM\n";
        exit;
    } else {
        ## set target backup file based on device
        $target = $source;
        $target =~ s/\//_-_/g;  ## rename / to _-_
        $target =~ s/^_-_//g;   ## remove leading _-_
    }

    ## Check if the VM is running locally - migrate it off to back up
    ## set migration = 1, to migrate back when done
    my $local_test = join("",&runcmd("virsh list"));
    if ($local_test =~ /$vm.*running/i) {
        print "$vm running locally - migrating to $migrate_to\n";
        my $pvd = &GetPVD($vm);
        &runcmd("crm resource migrate $pvd $migrate_to");
        $migration = 1;
        sleep 1;
        my $remote_test = join("",&runcmd("ssh $migrate_to -C virsh list | grep -i $vm",1));
        while($local_test =~ /(.*$vm.*)/) {
            print "   $migrate_from:\t" . $1 . "\n";
            print "(r)$migrate_to:\t$remote_test\n";
            sleep 5;
            $local_test = join("",&runcmd("virsh list",1));
            $remote_test = join("",&runcmd("ssh $migrate_to -C virsh list | grep -i $vm",1));
        }
        $remote_test = join("",&runcmd("ssh $migrate_to -C virsh list | grep -i $vm",1));
        print "We must have migrated ok... \n";
        print "(r)$migrate_to:\t$remote_test\n";
    }

    &runcmd("crm resource unmanage clone_lvm-" . $vm);
    &runcmd("crm resource unmanage ms_drbd-" . $vm);
    #&runcmd("crm resource unmanage p_drbd-" . $vm);
    sleep 1;
    &runcmd("vgchange -aln drbd_" . $vm,0,5);
    sleep 2;
    &runcmd("drbdadm secondary " . $vm);
    &runcmd("ssh $migrate_to -C touch /tmp/backup.$drbd_dev");
    &runcmd("ssh $migrate_to -C touch /tmp/backup.p_drbd-$vm");
    &runcmd("touch /tmp/backup.$drbd_dev");
    &runcmd("touch /tmp/backup.p_drbd-$vm");

    my $sec_check = join("",&runcmd("drbdadm role $vm"));
    if( $sec_check !~ /Secondary\/Primary/) {
        print "Fail: DRBD res [$vm] is not Secondary! result: $sec_check\n";
        exit;
    } else {
        print "OK: DRBD res [$vm] is Secondary. result: $sec_check\n";
    }

    if (!-d $backupdir) { mkdir $backupdir || die $!; }
    if (!-d $backupdir.'.meta') { mkdir $backupdir . '.meta' || die $!; }
    lock_vm();
    save_drbd_res($res);

    my $time = "_".time();
    # Try to snapshot the source if snapshot is enabled
    if ( ($opts{snapshot}) && (create_snapshot($source,$time)) ){
        print "$source seems to be a valid logical volume (LVM), a snapshot has been taken as " . $source . $time ."\n" if ($opts{debug});
        $source = $source.$time;
        push (@disks, {source => $source, target => $target . '_' . $time, type => 'snapshot'});
    }

    # Summarize the list of disks to be dumped
    if ($opts{debug}){
        if ($opts{action} eq 'dump'){
            print "\n\nThe following disks will be dumped:\n\n";
            foreach $disk (@disks){
                print "Source: $disk->{source}\tDest: $backupdir/$vm" . '_' . $disk->{target} .
".img$opts{compext}\n";
            }
        }
    }

    if ($opts{livebackup}){
        print "\nWe can run a live backup\n" if ($opts{debug});
    }
}

sub run_dump{
    # Pause VM, dump state, take snapshots etc..
    prepare_backup();

    # Now, it's time to actually dump the disks
    foreach $disk (@disks){
        my $source = $disk->{source};
        my $dest = "$backupdir/$vm" . '_' . $disk->{target} . ".img$opts{compext}";
        print "\nStarting dump of $source to $dest\n\n" if ($opts{debug});
        my $ddcmd = "$opts{ionice} dd if=$source bs=$opts{blocksize} | $opts{nice} $opts{compcmd} > $dest 2>/dev/null";
        print $ddcmd . "\n";
        unless( system("$ddcmd") == 0 ){
            die "Couldn't dump the block device/file $source to $dest\n";
        }
        # Remove the snapshot if the currently dumped disk is a snapshot
        destroy_snapshot($source) if ($disk->{type} eq 'snapshot');
    }

    &runcmd("crm resource manage p_drbd-" . $vm);
    &runcmd("crm resource manage ms_drbd-" . $vm);
    &runcmd("crm resource manage clone_lvm-" . $vm);
    &runcmd("drbdadm primary " . $vm);
    sleep 1;
    &runcmd("ssh $migrate_to -C rm /tmp/backup.$drbd_dev");
    &runcmd("ssh $migrate_to -C rm /tmp/backup.p_drbd-$vm");
    &runcmd("rm /tmp/backup.$drbd_dev");
    &runcmd("rm /tmp/backup.p_drbd-$vm");
    &runcmd("vgchange -ay drbd_" . $vm);
    sleep 1;
    &runcmd("crm_resource -r ms_drbd-$vm -C");
    sleep 1;
    &runcmd("crm_resource -r clone_lvm-$vm -C");
    sleep 3;

    my $prim_check = join("",&runcmd("drbdadm role $vm"));
    print "DRBD resource: $prim_check\n";

    ## if this was migrated, move it back
    if ($migration) {
        if ($prim_check =~ /primary\/primary/i) {
            ## migrate back
            my $local_test = join("",&runcmd("virsh list"));
            if ($local_test !~ /$vm.*running/i) {
                print "$vm NOT running locally - migrating to $migrate_from\n";
                my $pvd = &GetPVD($vm);
                &runcmd("crm resource migrate $pvd $migrate_from");
                sleep 1;
                my $remote_test = join("",&runcmd("ssh $migrate_to -C virsh list | grep -i $vm"));
                my $status = 'unknown';
                while($local_test !~ /(.*$vm.*running)/i) {
                    if ($local_test =~ /(.*$vm.*)/i) { $status = $1; }
                    print "   $migrate_from:\t" . $status . "\n";
                    print "(r)$migrate_to:\t$remote_test\n";
                    sleep 5;
                    $local_test = join("",&runcmd("virsh list",1));
                    $remote_test = join("",&runcmd("ssh $migrate_to -C virsh list | grep -i $vm",1));
                }
                print "Migration is Done!\n";
                print "(r)$migrate_from:\t$local_test\n";
            }
        }
    }

    ## done
    # And remove the lock file, unless the --keep-lock flag is present
    unlock_vm() unless ($opts{keeplock});
}

sub usage{
    print "usage:\n$0 --action=[dump|cleanup|chunkmount|unlock] --vm=vm1[,vm2,vm3] [--debug] [--exclude=hda,hdb] [--compress] ".
        "[--state] [--no-snapshot] [--snapsize=<size>] [--backupdir=/path/to/dir] [--connect=<URI>] ".
        "[--keep-lock] [--bs=<block size>]\n" .
        "\n\n" .
        "\t--action: What action the script will run. Valid actions are\n\n" .
        "\t\t- dump: Run the dump routine (dump disk image to temp dir, pausing the VM if needed). It's the default action\n" .
        "\t\t- unlock: just remove the lock file, but don't clean up the backup dir\n\n" .
        "\t--vm=name: The VM you want to work on (as known by libvirt). You can back up several VMs in one shot " .
        "if you separate them with commas, or with multiple --vm arguments. You have to use the name of the domain; ".
        "ID and UUID are not supported at the moment\n\n" .
        "\n\nOther options:\n\n" .
        "\t--snapsize=<snapsize>: The amount of space to use for snapshots. Use the same format as the -L option of lvcreate. " .
        "eg: --snapsize=15G. Default is 5G\n\n" .
        "\t--compress[=[gzip|bzip2|pbzip2|lzop|xz|lzip|plzip]]: Compress the disk images on the fly during the dump. If you " .
        "don't specify a compression algo, gzip will be used.\n\n" .
        "\t--backupdir=/path/to/backup: Use an alternate backup dir. The directory must exist and be writable. " .
        "The default is /var/lib/libvirt/backup\n\n" .
        "\t--keep-lock: Leave the lock file present. This prevents another " .
        "dump from running while a third-party backup software (BackupPC for example) saves the dumped files.\n\n";
}

# Dump the domain description as XML
sub save_drbd_res{
    my $res = shift;
    print "\nSaving XML description for $vm to $backupdir/$vm.res\n" if ($opts{debug});
    open(XML, ">$backupdir/$vm" . ".res") || die $!;
    print XML $res;
    close XML;
}

# Create an LVM snapshot
# Pass the original logical volume and the suffix
# to be added to the snapshot name as arguments
sub create_snapshot{
    my ($blk,$suffix) = @_;
    my $ret = 0;
    print "Running: $opts{lvcreate} -s -n " . $blk . $suffix . " -L $opts{snapsize} $blk > /dev/null 2>&1\n" if $opts{debug};
    if ( system("$opts{lvcreate} -s -n " . $blk . $suffix . " -L $opts{snapsize} $blk > /dev/null 2>&1") == 0 ) {
        $ret = 1;
        open SNAPLIST, ">>$backupdir.meta/snapshots" or die "Error, couldn't open snapshot list file\n";
        print SNAPLIST $blk.$suffix ."\n";
        close SNAPLIST;
    }
    return $ret;
}

# Remove an LVM snapshot
sub destroy_snapshot{
    my $ret = 0;
    my ($snap) = @_;
    print "Removing snapshot $snap\n" if $opts{debug};
    if (system ("$opts{lvremove} -f $snap > /dev/null 2>&1") == 0 ){
        $ret = 1;
    }
    return $ret;
}

# Lock a VM backup dir
# Just creates an empty lock file
sub lock_vm{
    print "Locking $vm\n" if $opts{debug};
    open ( LOCK, ">$backupdir.meta/$vm.lock" ) || die $!;
    print LOCK "";
    close LOCK;
}

# Unlock the VM backup dir
# Just removes the lock file
sub unlock_vm{
    print "Removing lock file for $vm\n\n" if $opts{debug};
    unlink <$backupdir.meta/$vm.lock>;
}

sub runcmd() {
    my $cmd = shift;
    my $quiet = shift;
    my $ignore = shift;
    ## ignore exit code 1 with greps -- not found is OK..
    if ($cmd =~ /grep/) { $ignore = 1; }
    if (!$quiet) { print "exec: $cmd ... "; }
    my @output = `$cmd`;
    if ($?) {
        print $ignore . "\n";
        my $e = sprintf("%d", $? >> 8);
        if ($ignore && $ignore == $e) {
            print "exit code = $e -- ignoring exit code $e\n";
        } else {
            printf "\n******** command $cmd exited with value %d\n", $? >> 8;
            print @output;
            exit $?
>> 8; } } if (!$quiet) { print "success\n"; } return @output; } ## get primative VirtualDomain sub GetPVD() { my $vm = shift; my $out = join("",&runcmd("crm resource show | grep $vm | grep VirtualDomain")); if ($out =~ /([\d\w\-\_]+)/) { return $1; } else { print "Could not locate Primative VirtualDomain for $vm\n"; } } </pre> ==== CLVM - KVM virt Snapshot ==== * you will have to edit some variables for this to work properly * It will also set DRBD and others in unmanaged mode, so pacemaker will not potential fence on failures * This is a heavily modified version of ''http://repo.firewall-services.com/misc/virt/virt-backup.pl'' (other options like cleanup do not work) ;Usage: ./virt-backup-drbd_clvm.pl vm=<virt_name> [--compress] ;virt-backup-drbd_clvm.pl <pre> #!/usr/bin/perl -w ## lots of hacks due to bugs.. in lvm/clustered vg use XML::Simple; use Sys::Virt; use Getopt::Long; use Data::Dumper; # Set umask umask(022); # Some constant my $drbd_dir = '/etc/drbd.d/'; our %opts = (); our @vms = (); our @excludes = (); our @disks = (); our $drbd_dev; my $migrate_to = 'blindpig'; ## host to migrate machines to if they are running locally my $migrate_from = 'bigeye'; ## ht # Sets some defaults values my $host =`hostname`; chomp($host); my $migration = 0; #placeholder # What to run. 
The default action is to dump $opts{action} = 'dump'; $opts{backupdir} = '/NFS/_local_/_backups/KVM/'; $opts{snapsize} = '1G'; # Debug $opts{debug} = 1; $opts{snapshot} = 1; $opts{compress} = 'none'; $opts{lvcreate} = '/sbin/lvcreate -c 512'; $opts{lvremove} = '/sbin/lvremove'; $opts{blocksize} = '262144'; $opts{nice} = 'nice -n 19'; $opts{ionice} = 'ionice -c 2 -n 7'; $opts{livebackup} = 1; $opts{wasrunning} = 1; # get command line arguments GetOptions( "debug" => \$opts{debug}, "keep-lock" => \$opts{keeplock}, "state" => \$opts{state}, "snapsize=s" => \$opts{snapsize}, "backupdir=s" => \$opts{backupdir}, "vm=s" => \@vms, "action=s" => \$opts{action}, "cleanup" => \$opts{cleanup}, "dump" => \$opts{dump}, "unlock" => \$opts{unlock}, "connect=s" => \$opts{connect}, "snapshot!" => \$opts{snapshot}, "compress:s" => \$opts{compress}, "exclude=s" => \@excludes, "blocksize=s" => \$opts{blocksize}, "help" => \$opts{help} ); # Set compression settings if ($opts{compress} eq 'lzop'){ $opts{compext} = ".lzo"; $opts{compcmd} = "lzop -c"; } elsif ($opts{compress} eq 'bzip2'){ $opts{compext} = ".bz2"; $opts{compcmd} = "bzip2 -c"; } elsif ($opts{compress} eq 'pbzip2'){ $opts{compext} = ".bz2"; $opts{compcmd} = "pbzip2 -c"; } elsif ($opts{compress} eq 'xz'){ $opts{compext} = ".xz"; $opts{compcmd} = "xz -c"; } elsif ($opts{compress} eq 'lzip'){ $opts{compext} = ".lz"; $opts{compcmd} = "lzip -c"; } elsif ($opts{compress} eq 'plzip'){ $opts{compext} = ".lz"; $opts{compcmd} = "plzip -c"; } # Default is gzip elsif (($opts{compress} eq 'gzip') || ($opts{compress} eq '')) { $opts{compext} = ".gz"; $opts{compcmd} = "gzip -c"; # $opts{compcmd} = "pigz -c -p 2"; } else{ $opts{compext} = ""; $opts{compcmd} = "cat"; } # Allow comma separated multi-argument @vms = split(/,/,join(',',@vms)); @excludes = split(/,/,join(',',@excludes)); # Backward compatible with --dump --cleanup --unlock $opts{action} = 'dump' if ($opts{dump}); $opts{action} = 'cleanup' if ($opts{cleanup}); $opts{action} = 
'unlock' if ($opts{unlock}); # Libvirt URI to connect to $opts{connect} = "qemu:///system"; # Stop here if we have no vm # Or the help flag is present if ((!@vms) || ($opts{help})){ usage(); exit 1; } if (! -d $opts{backupdir} ){ print "$opts{backupdir} is not a valid directory\n"; exit 1; } print "\n" if ($opts{debug}); # Connect to libvirt print "\n\nConnecting to libvirt daemon using $opts{connect} as URI\n" if ($opts{debug}); our $libvirt = Sys::Virt->new( uri => $opts{connect} ) || die "Error connecting to libvirt on URI: $opts{connect}"; foreach our $vm (@vms){ print "Checking $vm status\n\n" if ($opts{debug}); our $backupdir = $opts{backupdir}.'/'.$vm; my $vdom = $vm . '-ha'; $vdom =~ s/-ha-ha/-ha/; our $dom = $libvirt->get_domain_by_name($vdom) || die "Error opening $vm object"; if ($opts{action} eq 'cleanup'){ print "Running cleanup routine for $vm\n\n" if ($opts{debug}); # run_cleanup(); } elsif ($opts{action} eq 'dump'){ print "Running dump routine for $vm\n\n" if ($opts{debug}); run_dump(); } # else { # usage(); # exit 1; # } } ############################################################################ ############## FUNCTIONS #################### ############################################################################ sub prepare_backup{ my ($source,$res); my $target = $vm; my $match=0; my $xml = new XML::Simple (); my $data = $xml->XMLin( $dom->get_xml_description(), forcearray => ['disk'] ); my @drbd_res = &runcmd("drbdadm dump $vm"); foreach my $line (@drbd_res) { $res = $line; if ($line =~ /device\s+.*(drbd\d+)\s+minor/) { $drbd_dev = $1; last; } } # Create a list of disks used by the VM foreach $disk (@{$data->{devices}->{disk}}){ if ($disk->{type} eq 'block'){ $source = $disk->{source}->{dev}; } elsif ($disk->{type} eq 'file'){ $source = $disk->{source}->{file}; } else{ print "\nSkipping $source for vm $vm as its type is $disk->{type}: " . 
" and only block is supported\n" if ($opts{debug}); next; } ## we only support the first block device for now. if ($target && $source) { last; } } ## locate the backing device for this res #my @drbd_res = &runcmd("drbdadm dump $vm"); #foreach my $line (@drbd_res) { # $res = $line; # if ($match == 1 && $line =~ /disk\s+(.*);/) { # $source = $1; # $match = 0; # } # # if ($line =~ /on\s$host\s+{/i) { $match = 1; } # } # if (!$source) { # print "Did not find DRBD backing deviced for VM\n"; # exit; # } else { # ## set target backup file based on device # $target = $source; # $target =~ s/\//_-_/g; ## rename / to _-_ # $target =~ s/^_-_//g; ## remove leading _-_ # } ## check if running on node2 - migrate here if so my $local_test = join("",&runcmd("virsh list")); if ($local_test !~ /$vm.*running/i) { my $status = 'not running'; print "$vm running remotely - migration to $migrate_to\n"; my $pvd = &GetPVD($vm); &runcmd("crm resource migrate $pvd $migrate_to"); $migration = 1; sleep 1; my $remote_test = join("",&runcmd("ssh $migrate_from -C virsh list | grep -i $vm",1)); while($remote_test =~ /(.*$vm.*)/) { print " $migrate_to:\t" . $status . "\n"; print "(r)$migrate_from:\t$remote_test\n"; sleep 5; $local_test = join("",&runcmd("virsh list",1)); if ($local_test =~ /(.*$vm.*)/i) { $status = $1; } $remote_test = join("",&runcmd("ssh $migrate_from -C virsh list | grep -i $vm",1)); } $remote_test = join("",&runcmd("ssh $migrate_from -C virsh list | grep -i $vm",1)); print "We must of migrated ok... \n"; print "(r)$migrate_to:\t$remote_test\n"; } &runcmd("crm resource unmanage clone_lvm-" . $vm); &runcmd("crm resource unmanage ms_drbd-" . $vm); sleep 1; &runcmd("ssh $migrate_from -C vgchange -aln drbd_" . $vm); # sleep 2; &runcmd("ssh $migrate_from -C drbdadm secondary " . 
$vm); &runcmd("ssh $migrate_from -C touch /tmp/backup.$drbd_dev"); &runcmd("ssh $migrate_from -C touch /tmp/backup.p_drbd-$vm"); &runcmd("touch /tmp/backup.$drbd_dev"); &runcmd("touch /tmp/backup.p_drbd-$vm"); my $sec_check = join("",&runcmd("drbdadm role $vm")); if( $sec_check !~ /Primary\/Secondary/) { print "Fail: DRBD res [$vm] is not the ONLY primary! result: $sec_check\n"; exit; } else { print "OK: DRBD res [$vm] is the ONLY Primary. result: $sec_check\n"; } if (!-d $backupdir) { mkdir $backupdir || die $!; } if (!-d $backupdir.'.meta') { mkdir $backupdir . '.meta' || die $!; } lock_vm(); &runcmd("vgchange -c n drbd_" . $vm); sleep 1; &runcmd("vgchange -aey drbd_" . $vm); #save_drbd_res($res); save_xml($res); my $time = "_".time(); # Try to snapshot the source if snapshot is enabled if ( ($opts{snapshot}) && (create_snapshot($source,$time)) ){ print "$source seems to be a valid logical volume (LVM), a snapshot has been taken as " . $source . $time ."\n" if ($opts{debug}); $source = $source.$time; push (@disks, {source => $source, target => $target . '_' . $time, type => 'snapshot'}); } # Summarize the list of disk to be dumped if ($opts{debug}){ if ($opts{action} eq 'dump'){ print "\n\nThe following disks will be dumped:\n\n"; foreach $disk (@disks){ print "Source: $disk->{source}\tDest: $backupdir/$vm" . '_' . $disk->{target} . ".img$opts{compext}\n"; } } } if ($opts{livebackup}){ print "\nWe can run a live backup\n" if ($opts{debug}); } } sub run_dump{ # Pause VM, dump state, take snapshots etc.. prepare_backup(); # Now, it's time to actually dump the disks foreach $disk (@disks){ my $source = $disk->{source}; my $dest = "$backupdir/$vm" . '_' . $disk->{target} . 
".img$opts{compext}"; print "\nStarting dump of $source to $dest\n\n" if ($opts{debug}); my $ddcmd = "$opts{ionice} dd if=$source bs=$opts{blocksize} | $opts{nice} $opts{compcmd} > $dest 2>/dev/null"; unless( system("$ddcmd") == 0 ){ die "Couldn't dump the block device/file $source to $dest\n"; } # Remove the snapshot if the current dumped disk is a snapshot destroy_snapshot($source) if ($disk->{type} eq 'snapshot'); } $meta = unlink <$backupdir.meta/*>; rmdir "$backupdir.meta"; print "$meta metadata files removed\n\n" if $opts{debug}; &runcmd("ssh $migrate_from -C drbdadm primary " . $vm); &runcmd("ssh $migrate_from -C rm /tmp/backup.$drbd_dev"); &runcmd("ssh $migrate_from -C rm /tmp/backup.p_drbd-$vm"); &runcmd("rm /tmp/backup.$drbd_dev"); &runcmd("rm /tmp/backup.p_drbd-$vm"); sleep 1; &runcmd("vgchange -c y drbd_" . $vm); sleep 1; &runcmd("vgchange -ay drbd_" . $vm); sleep 1; &runcmd("crm resource manage p_drbd-" . $vm); &runcmd("crm resource manage ms_drbd-" . $vm); &runcmd("crm resource manage clone_lvm-" . $vm); sleep 1; &runcmd("crm_resource -r ms_drbd-$vm -C"); sleep 1; &runcmd("crm_resource -r clone_lvm-$vm -C"); sleep 3; my $prim_check = join("",&runcmd("drbdadm role $vm")); print "DRBD resource: $prim_check\n"; ## if this was migrations, move it back if ($migration) { if ($prim_check =~ /primary\/primary/i) { ## migrate back my $local_test = join("",&runcmd("virsh list")); if ($local_test =~ /$vm.*running/i) { print "$vm running locally - migration to $migrate_from from $migrate_to\n"; my $pvd = &GetPVD($vm); &runcmd("crm resource migrate $pvd $migrate_from"); sleep 1; my$remote_test = join("",&runcmd("ssh $migrate_to -C virsh list | grep -i $vm")); my $status = 'unknown'; while($local_test =~ /(.*$vm.*running)/i) { if ($local_test =~ /(.*$vm.*)/i) { $status = $1; } print " $migrate_to:\t" . $status . 
"\n"; print "(r)$migrate_from:\t$remote_test\n"; sleep 5; $local_test = join("",&runcmd("virsh list",1)); $remote_test = join("",&runcmd("ssh $migrate_from -C virsh list | grep -i $vm",1)); } print "Migration is Done!\n"; print "(r)$migrate_from:\t$local_test\n"; } } } ## done # And remove the lock file, unless the --keep-lock flag is present unlock_vm() unless ($opts{keeplock}); } sub usage{ print "usage:\n$0 --action=[dump|cleanup|chunkmount|unlock] --vm=vm1[,vm2,vm3] [--debug] [--exclude=hda,hdb] [--compress] ". "[--state] [--no-snapshot] [--snapsize=<size>] [--backupdir=/path/to/dir] [--connect=<URI>] ". "[--keep-lock] [--bs=<block size>]\n" . "\n\n" . "\t--action: What action the script will run. Valid actions are\n\n" . "\t\t- dump: Run the dump routine (dump disk image to temp dir, pausing the VM if needed). It's the default action\n" . "\t\t- unlock: just remove the lock file, but don't cleanup the backup dir\n\n" . "\t--vm=name: The VM you want to work on (as known by libvirt). You can backup several VMs in one shot " . "if you separate them with comma, or with multiple --vm argument. You have to use the name of the domain, ". "ID and UUID are not supported at the moment\n\n" . "\n\nOther options:\n\n" . "\t--snapsize=<snapsize>: The amount of space to use for snapshots. Use the same format as -L option of lvcreate. " . "eg: --snapsize=15G. Default is 5G\n\n" . "\t--compress[=[gzip|bzip2|pbzip2|lzop|xz|lzip|plzip]]: On the fly compress the disks images during the dump. If you " . "don't specify a compression algo, gzip will be used.\n\n" . "\t--backupdir=/path/to/backup: Use an alternate backup dir. The directory must exists and be writable. " . "The default is /var/lib/libvirt/backup\n\n" . "\t--keep-lock: Let the lock file present. This prevent another " . 
"dump to run while an third party backup software (BackupPC for example) saves the dumped files.\n\n"; } # Dump the domain description as XML sub save_drbd_res{ my $res = shift; print "\nSaving XML description for $vm to $backupdir/$vm.res\n" if ($opts{debug}); open(XML, ">$backupdir/$vm" . ".res") || die $!; print XML $res; close XML; } # Create an LVM snapshot # Pass the original logical volume and the suffix # to be added to the snapshot name as arguments sub create_snapshot{ my ($blk,$suffix) = @_; my $ret = 0; print "Running: $opts{lvcreate} -p r -s -n " . $blk . $suffix . " -L $opts{snapsize} $blk > /dev/null 2>&1\n" if $opts{debug}; if ( system("$opts{lvcreate} -s -n " . $blk . $suffix . " -L $opts{snapsize} $blk > /dev/null 2>&1") == 0 ) { $ret = 1; open SNAPLIST, ">>$backupdir.meta/snapshots" or die "Error, couldn't open snapshot list file\n"; print SNAPLIST $blk.$suffix ."\n"; close SNAPLIST; } return $ret; } # Remove an LVM snapshot sub destroy_snapshot{ my $ret = 0; my ($snap) = @_; print `lvs drbd_$vm`; print "Removing snapshot $snap\n" if $opts{debug}; if (system ("$opts{lvremove} -f $snap > /dev/null 2>&1") == 0 ){ $ret = 1; } return $ret; } # Lock a VM backup dir # Just creates an empty lock file sub lock_vm{ print "Locking $vm\n" if $opts{debug}; open ( LOCK, ">$backupdir.meta/$vm.lock" ) || die $!; print LOCK ""; close LOCK; } # Unlock the VM backup dir # Just removes the lock file sub unlock_vm{ print "Removing lock file for $vm\n\n" if $opts{debug}; unlink <$backupdir.meta/$vm.lock>; } sub runcmd() { my $cmd = shift; my $quiet = shift; my $ignore; ## ignore exit code 1 with greps -- not found is OK.. if ($cmd =~ /grep/) { $ignore = 1; } if (!$quiet) { print "exec: $cmd ... ";} my @output = `$cmd`; if ($?) { my $e = sprintf("%d", $? >> 8); if ($ignore && $ignore == $e) { print "grep - ignore exit code $e\n"; } else { printf "\n******** command $cmd exited with value %d\n", $? >> 8; print @output; exit $? 
>> 8; } } if (!$quiet) { print "success\n"; } return @output; } ## get primitive VirtualDomain sub GetPVD() { my $vm = shift; my $out = join("",&runcmd("crm resource show | grep $vm | grep VirtualDomain")); if ($out =~ /([\d\w\-\_]+)/) { return $1; } else { print "Could not locate Primitive VirtualDomain for $vm\n"; } } # Dump the domain description as XML sub save_xml{ print "\nSaving XML description for $vm to $backupdir/$vm.xml\n" if ($opts{debug}); open(XML, ">$backupdir/$vm" . ".xml") || die $!; print XML $dom->get_xml_description(); close XML; } </pre> == Cman/Pacemaker Notes == === Firewall === * Allow UDP 5405 for message layer === cluster.conf === * just a basic cluster.conf config that works <pre> <cluster name="ipa" config_version="31"> <cman two_node="1" expected_votes="1" cluster_id="1208"> <multicast addr="239.192.2.232"/> </cman> <clusternodes> <clusternode name="blindpig" nodeid="1"> <fence> <method name="pcmk-redirect"> <device name="pcmk" port="blindpig"/> </method> </fence> </clusternode> <clusternode name="bigeye" nodeid="3"> <fence> <method name="pcmk-redirect"> <device name="pcmk" port="bigeye"/> </method> </fence> </clusternode> </clusternodes> <fencedevices> <fencedevice name="pcmk" agent="fence_pcmk"/> </fencedevices> <fence_daemon clean_start="1" post_fail_delay="10" post_join_delay="30"> </fence_daemon> <logging to_syslog="yes" syslog_facility="local6" debug="off"> </logging> </cluster> </pre> == Monitoring == ; crm_mon -Qrf1 ; check_crm <pre> #!/usr/bin/perl # # check_crm_v0_5 # # Copyright © 2011 Philip Garner, Sysnix Consultants Limited # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. 
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see <http://www.gnu.org/licenses/>. # # Authors: Phil Garner - phil@sysnix.com & Peter Mottram - peter@sysnix.com # # Acknowledgements: Vadym Chepkov, Sönke Martens # # v0.1 09/01/2011 # v0.2 11/01/2011 # v0.3 22/08/2011 - bug fix and changes suggested by Vadym Chepkov # v0.4 23/08/2011 - update for spelling and anchor regex capture (Vadym Chepkov) # v0.5 29/09/2011 - Add standby warn/crit suggested by Sönke Martens & removal # of 'our' to 'my' to completely avoid problems with ePN # # NOTES: Requires Perl 5.8 or higher & the Perl Module Nagios::Plugin # Nagios user will need sudo access - suggest adding line below to # sudoers # nagios ALL=(ALL) NOPASSWD: /usr/sbin/crm_mon -1 -r -f # # In sudoers if requiretty is on (off state is default) # you will also need to add the line below # Defaults:nagios !requiretty # use warnings; use strict; use Nagios::Plugin; # Lines below may need changing if crm_mon or sudo installed in a # different location. 
my $sudo = '/usr/bin/sudo'; my $crm_mon = '/usr/sbin/crm_mon'; my $np = Nagios::Plugin->new( shortname => 'check_crm', version => '0.5', usage => "Usage: %s <ARGS>\n\t\t--help for help\n", ); $np->add_arg( spec => 'warning|w', help => 'If failed Nodes, stopped Resources detected or Standby Nodes sends Warning instead of Critical (default) as long as there are no other errors and there is Quorum', required => 0, ); $np->add_arg( spec => 'standbyignore|s', help => 'Ignore any node(s) in standby, by default sends Critical', required => 0, ); $np->getopts; my @standby; # Check for -w option set warn if this is case instead of crit my $warn_or_crit = 'CRITICAL'; $warn_or_crit = 'WARNING' if $np->opts->warning; my $fh; open( $fh, "$sudo $crm_mon -1 -r -f|" ) or $np->nagios_exit( CRITICAL, "Running sudo has failed" ); foreach my $line (<$fh>) { if ( $line =~ m/Connection to cluster failed\:(.*)/i ) { # Check Cluster connected $np->nagios_exit( CRITICAL, "Connection to cluster FAILED: $1" ); } elsif ( $line =~ m/Current DC:/ ) { # Check for Quorum if ( $line =~ m/partition with quorum$/ ) { # Assume cluster is OK - we only add warn/crit after here $np->add_message( OK, "Cluster OK" ); } else { $np->add_message( CRITICAL, "No Quorum" ); } } elsif ( $line =~ m/^offline:\s*\[\s*(\S.*?)\s*\]/i ) { next if $line =~ /\/dev\/block\//i; # Count offline nodes my @offline = split( /\s+/, $1 ); my $numoffline = scalar @offline; $np->add_message( $warn_or_crit, ": $numoffline Nodes Offline" ); } elsif ( $line =~ m/^node\s+(\S.*):\s*standby/i ) { # Check for standby nodes (suggested by Sönke Martens) # See later in code for message created from this push @standby, $1; } elsif ( $line =~ m/\s*([\w-]+)\s+\(\S+\)\:\s+Stopped/ ) { #next if $line =~ /hopvpn/i; # Check Resources Stopped $np->add_message( $warn_or_crit, ": $1 Stopped" ); } elsif ( $line =~ m/\s*stopped\:\s*\[(.*)\]/i ) { next if $line =~ /openvz/i; # Check Master/Slave stopped $np->add_message( $warn_or_crit, ": $1 Stopped" 
); } elsif ( $line =~ m/^Failed actions\:/ ) { # Check Failed Actions ### rob fix this next; $np->add_message( CRITICAL, ": FAILED actions detected or not cleaned up" ); } elsif ( $line =~ m/\s*(\S+?)\s+ \(.*\)\:\s+\w+\s+\w+\s+\(unmanaged\)\s+FAILED/ ) { # Check Unmanaged $np->add_message( CRITICAL, ": $1 unmanaged FAILED" ); } elsif ( $line =~ m/\s*(\S+?)\s+ \(.*\)\:\s+not installed/i ) { # Check for errors $np->add_message( CRITICAL, ": $1 not installed" ); } elsif ( $line =~ m/\s*(\S+?):.*(fail-count=\d+)/i ) { my $one = $1; my $two = $2; if (-f "/tmp/backup.$1") { last; } $np->add_message( WARNING, ": $1 failure detected, $2" ); } } # If found any Nodes in standby & no -s option used send warn/crit if ( scalar @standby > 0 && !$np->opts->standbyignore ) { $np->add_message( $warn_or_crit, ": " . join( ', ', @standby ) . " in Standby" ); } close($fh) or $np->nagios_exit( CRITICAL, "Running crm_mon FAILED" ); $np->nagios_exit( $np->check_messages() ); </pre> == Troubleshooting Tips == ==== ocf:heartbeat:LVM patch ==== https://github.com/ljunkie/resource-agents/commit/4ed858bf4184c1b186cdae6efa428814ce4a02f0 * When deactivating multiple clustered volume groups (maybe non-clustered too) at once, vgchange -a ln sometimes fails. 
: '''Instead''', we should keep retrying until Pacemaker times the operation out (logic borrowed from LINBIT's DRBD RA) ===== failure ===== * vgchange -a ln fails, then the operation completes with a failure <source lang=text> lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:845 [ 2013/04/17_23:48:03 INFO: Deactivating volume group drbd_vhosts ] lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:845 [ 2013/04/17_23:48:03 ERROR: Can't deactivate volume group "drbd_vhosts" with 1 open logical volume(s) ] crmd: notice: process_lrm_event: LRM operation p_lvm-vhosts_stop_0 (call=518, rc=1, cib-update=113, confirmed=true) unknown error </source> ===== success: with patch ===== * with the patch - it will try again and succeed <source lang=text> lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:24355 [ 2013/04/18_11:05:31 INFO: Deactivating volume group drbd_vhosts ] lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:24355 [ 2013/04/18_11:05:31 ERROR: Can't deactivate volume group "drbd_vhosts" with 1 open logical volume(s) ] lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:24355 [ 2013/04/18_11:05:31 WARNING: drbd_vhosts still Active, Deactivating volume group drbd_vhosts. 
] lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:24355 [ 2013/04/18_11:05:31 INFO: Deactivating volume group drbd_vhosts ] lrmd: notice: operation_finished: p_lvm-vhosts_stop_0:24355 [ 2013/04/18_11:05:31 INFO: 0 logical volume(s) in volume group "drbd_vhosts" now active ] crmd: notice: process_lrm_event: LRM operation p_lvm-vhosts_stop_0 (call=963, rc=0, cib-update=200, confirmed=true) ok </source> ===== patch ===== <source> --- /usr/lib/ocf/resource.d/heartbeat/LVM.orig 2013-04-18 10:08:57.333596804 -0700 +++ /usr/lib/ocf/resource.d/heartbeat/LVM 2013-04-18 10:36:39.741388039 -0700 @@ -229,24 +229,34 @@ # Disable the LVM volume # LVM_stop() { + local first_try=true + rc=$OCF_ERR_GENERIC vgdisplay "$1" 2>&1 | grep 'Volume group .* not found' >/dev/null && { ocf_log info "Volume group $1 not found" return 0 } - ocf_log info "Deactivating volume group $1" - ocf_run vgchange -a ln $1 || return 1 - if - LVM_status $1 - then - ocf_log err "LVM: $1 did not stop correctly" - return $OCF_ERR_GENERIC - fi + # try to deactivate first time + ocf_log info "Deactivating volume group $1" + ocf_run vgchange -a ln $1 - # TODO: This MUST run vgexport as well + # Keep trying to bring down the resource; + # wait for the CRM to time us out if this fails + while :; do + if LVM_status $1; then + ocf_log warn "$1 still Active, Deactivating volume group $1." + ocf_log info "Deactivating volume group $1" + ocf_run vgchange -a ln $1 + else + rc=$OCF_SUCCESS + break; + fi + $first_try || sleep 1 + first_try=false + done - return $OCF_SUCCESS + return $rc } </source> ==== CMAN+CLVMD+Pacemaker 1.1.8+ ==== * Pacemaker now starts and stops CMAN. 
The issue is that it doesn't account for '''CLVMD''' * Fix INIT script to also start/stop '''CLVMD''' ; /etc/init.d/pacemaker <source> --- pacemaker.orig 2013-04-15 12:40:53.085307309 -0700 +++ pacemaker 2013-04-16 10:17:10.359833467 -0700 @@ -119,6 +119,9 @@ success echo + ## stop clvmd before leaving fence domain + [ -f /etc/rc.d/init.d/clvmd ] && service clvmd stop + echo -n "Leaving fence domain" fence_tool leave -w 10 checkrc @@ -163,6 +166,7 @@ start) # For consistency with stop [ -f /etc/rc.d/init.d/cman ] && service cman start + [ -f /etc/rc.d/init.d/clvmd ] && service clvmd start start ;; restart|reload|force-reload) </source> === Live Migrations Fail === error: Unsafe migration: Migration may lead to data corruption if disks use cache != none * Make sure you set your KVM disk cache to none ==== Verify you can with virsh first ==== <source> virsh migrate --live <hostname> qemu+ssh://<other_server_name>/system </source> <source> virsh migrate --live vhosts-ha qemu+ssh://blindpig/system </source> ==== Pacemaker 1.1.8 / libvirt-0.10.2-18 ==== * '''Live migration fails going standby / online''' :* it seems that with pacemaker 1.1.8, all live migrations run concurrently (update - use migration-limit) http://clusterlabs.org/doc/en-US/Pacemaker/1.1-plugin/html/Pacemaker_Explained/_available_cluster_options.html :* libvirt used to cap live migrations at ~30MB/s, but now it's infinite (8796093022207 MB/s) http://www.mail-archive.com/libvir-list@redhat.com/msg60259.html ===== Libvirt Fix (migrate-setspeed) ===== * libvirt - migrate-setspeed (though this is not persistent across service restarts) * persistent fix: edit '''/etc/init.d/libvirtd''' <source> --- libvirtd 2013-04-16 09:28:53.257824206 -0700 +++ libvirtd.orig 2013-04-16 10:25:31.358915941 -0700 @@ -85,22 +85,6 @@ RETVAL=$? 
echo [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$SERVICE - - ## hook to set bandwidth for live migration - BW=50 - VIRSH=`which virsh` - LIST_VM=`virsh list --all | grep -v Name | awk '{print $2}' | egrep "\w"` - DATE=`date -R` - LOGFILE="/var/log/kvm_setspeed.log" - for vm in $LIST_VM - do - BWprev=`/usr/bin/virsh migrate-getspeed $vm` - /usr/bin/virsh migrate-setspeed --bandwidth $BW $vm > /dev/null - BWcur=`/usr/bin/virsh migrate-getspeed $vm` - echo "$DATE : $VIRSH migrate-setspeed --bandwidth $BW $vm [cur: $BWcur -- prev: $BWprev]" >> $LOGFILE - - done - # end BW hook } stop() { </source> ===== Pacemaker Fix (migration-limit) ===== * Pacemaker - set the migration-limit (default -1 unlimited) <source> crm_attribute --attr-name migration-limit --attr-value 2 crm_attribute --attr-name migration-limit --get-value scope=crm_config name=migration-limit value=2 </source> ===== virsh commands used ===== * setting this to 30mb/s per Virt <source> virsh migrate-setspeed --bandwidth 30 <VIRTNAME> virsh migrate-getspeed <VIRTNAME> </source> <source> # Default is infinite virsh migrate-getspeed vhosts-ha 8796093022207 # set the speed to 30MB/s virsh migrate-setspeed --bandwidth 30 vhosts-ha # now it's limited virsh migrate-getspeed vhosts-ha 30 </source> === pacemaker config - dump === * This has other examples - not just drbd/kvm <pre> node bigeye \ attributes standby="off" node blindpig \ attributes standby="off" primitive p_cluster_mon ocf:pacemaker:ClusterMon \ params pidfile="/var/run/crm_mon.pid" htmlfile="/var/www/html/index.html" \ op start interval="0" timeout="20s" \ op stop interval="0" timeout="20s" \ op monitor interval="10s" timeout="20s" primitive p_drbd-backuppc ocf:linbit:drbd \ params drbd_resource="backuppc" \ operations $id="p_drbd_backuppc-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" 
primitive p_drbd-dogfish-ha ocf:linbit:drbd \ params drbd_resource="dogfish-ha" \ operations $id="p_drbd_dogfish-ha-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-hopmon ocf:linbit:drbd \ params drbd_resource="hopmon" \ operations $id="p_drbd_hopmon-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-hoptical ocf:linbit:drbd \ params drbd_resource="hoptical" \ operations $id="p_drbd_hoptical-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-hopvpn ocf:linbit:drbd \ params drbd_resource="hopvpn" \ operations $id="p_drbd_hopvpn-operations" \ op monitor interval="20" role="Slave" timeout="30" \ op monitor interval="10" role="Master" timeout="30" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-musicbrainz ocf:linbit:drbd \ params drbd_resource="musicbrainz" \ operations $id="p_drbd_musicbrainz-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-spacewalk ocf:linbit:drbd \ params drbd_resource="spacewalk" \ operations $id="p_drbd_spacewalk-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="240" start-delay="0" primitive p_drbd-vhosts ocf:linbit:drbd \ params drbd_resource="vhosts" \ 
operations $id="p_drbd_vhosts-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-vz ocf:linbit:drbd \ params drbd_resource="vz" \ operations $id="p_drbd_vz-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_drbd-win7 ocf:linbit:drbd \ params drbd_resource="win7" \ operations $id="p_drbd_win7-operations" \ op monitor interval="20" role="Slave" timeout="20" \ op monitor interval="10" role="Master" timeout="20" \ op start interval="0" timeout="240" \ op stop interval="0" timeout="100" start-delay="0" primitive p_gfs2-vz-config ocf:heartbeat:Filesystem \ params device="/dev/mapper/vg_drbd_vz-gfs_vz_config" directory="/etc/vz" fstype="gfs2" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ op monitor interval="120s" primitive p_gfs2-vz-storage ocf:heartbeat:Filesystem \ params device="/dev/mapper/vg_drbd_vz-gfs_vz_storage" directory="/vz" fstype="gfs2" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ op monitor interval="120s" primitive p_ip-10.69.1.1 ocf:heartbeat:IPaddr2 \ params ip="10.69.1.1" cidr_netmask="32" nic="lo" \ meta target-role="Started" primitive p_ip-10.69.1.2 ocf:heartbeat:IPaddr2 \ params ip="10.69.1.2" cidr_netmask="32" nic="lo" \ meta target-role="Started" primitive p_lvm-backuppc ocf:heartbeat:LVM \ operations $id="backuppc-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_backuppc" primitive p_lvm-dogfish-ha ocf:heartbeat:LVM \ operations $id="dogfish-ha-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_dogfish-ha" 
primitive p_lvm-hopmon ocf:heartbeat:LVM \ operations $id="hopmon-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_hopmon" primitive p_lvm-hoptical ocf:heartbeat:LVM \ operations $id="hoptical-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_hoptical" primitive p_lvm-hopvpn ocf:heartbeat:LVM \ operations $id="hopvpn-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_hopvpn" primitive p_lvm-musicbrainz ocf:heartbeat:LVM \ operations $id="musicbrainz-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_musicbrainz" primitive p_lvm-spacewalk ocf:heartbeat:LVM \ operations $id="spacewalk-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_spacewalk" primitive p_lvm-vhosts ocf:heartbeat:LVM \ operations $id="vhosts-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_vhosts" primitive p_lvm-vz ocf:heartbeat:LVM \ operations $id="vz-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="vg_drbd_vz" primitive p_lvm-win7 ocf:heartbeat:LVM \ operations $id="win7-LVM-operations" \ op start interval="0" timeout="120" \ op stop interval="0" timeout="120" \ params volgrpname="drbd_win7" primitive p_vd-backuppc-ha ocf:heartbeat:VirtualDomain \ params config="/etc/libvirt/qemu/backuppc-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \ operations $id="p_vd-backuppc-operations" \ op start interval="0" timeout="90" \ op stop interval="0" timeout="90" \ op migrate_from interval="0" timeout="240" \ op migrate_to interval="0" timeout="240" \ op monitor interval="10" timeout="30" start-delay="0" \ meta 
    allow-migrate="true" failure-timeout="3min" target-role="Started" is-managed="true" resource-stickiness="100"
primitive p_vd-dogfish-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/dogfish-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-dogfish-ha-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="600" \
    op migrate_to interval="0" timeout="600" \
    op monitor interval="10" timeout="30" start-delay="10" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started" resource-stickiness="100" is-managed="true"
primitive p_vd-hopmon-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/hopmon-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-hopmon-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="240" \
    op migrate_to interval="0" timeout="240" \
    op monitor interval="10" timeout="30" start-delay="10" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started"
primitive p_vd-hoptical-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/hoptical-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-hoptical-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="240" \
    op migrate_to interval="0" timeout="240" \
    op monitor interval="10" timeout="30" start-delay="10" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started" resource-stickiness="100"
primitive p_vd-hopvpn-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/hopvpn-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-hopvpn-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="240" \
    op migrate_to interval="0" timeout="240" \
    op monitor interval="10" timeout="30" start-delay="10" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started" resource-stickiness="100" is-managed="true"
primitive p_vd-musicbrainz-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/musicbrainz-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-musicbrainz-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="240" \
    op migrate_to interval="0" timeout="240" \
    op monitor interval="10" timeout="30" start-delay="10" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started" resource-stickiness="100"
primitive p_vd-spacewalk-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/spacewalk-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-spacewalk-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="240" \
    op migrate_to interval="0" timeout="240" \
    op monitor interval="10" timeout="30" start-delay="0" \
    meta allow-migrate="true" failure-timeout="10min" target-role="Started" is-managed="true"
primitive p_vd-vhosts-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/vhosts-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-vhosts-ha-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="240" \
    op migrate_to interval="0" timeout="240" \
    op monitor interval="10" timeout="30" start-delay="10" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started" resource-stickiness="100"
primitive p_vd-win7-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/win7-ha.xml" migration_transport="ssh" force_stop="0" hypervisor="qemu:///system" \
    operations $id="p_vd-win7-operations" \
    op start interval="0" timeout="90" \
    op stop interval="0" timeout="90" \
    op migrate_from interval="0" timeout="600" \
    op migrate_to interval="0" timeout="600" \
    op monitor interval="10" timeout="30" start-delay="0" \
    meta allow-migrate="true" failure-timeout="3min" target-role="Started"
primitive st_bigeye stonith:fence_drac5 \
    params ipaddr="<hidden>" login="cman" passwd="<hidden>" action="reboot" secure="true" pcmk_host_list="bigeye" pcmk_host_check="static-list"
primitive st_blindpig stonith:fence_apc_snmp \
    params inet4_only="1" community="<hidden>" port="blindpig" action="reboot" ipaddr="<hidden>" snmp_version="1" pcmk_host_check="static-list" pcmk_host_list="blindpig" pcmk_host_map="blindpig:6"
ms ms_drbd-backuppc p_drbd-backuppc \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-dogfish-ha p_drbd-dogfish-ha \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-hopmon p_drbd-hopmon \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-hoptical p_drbd-hoptical \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-hopvpn p_drbd-hopvpn \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-musicbrainz p_drbd-musicbrainz \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-spacewalk p_drbd-spacewalk \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-vhosts p_drbd-vhosts \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-vz p_drbd-vz \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
ms ms_drbd-win7 p_drbd-win7 \
    meta master-max="2" clone-max="2" notify="true" migration-threshold="1" allow-migrate="true" target-role="Started" interleave="true" is-managed="true"
clone c_cluster_mon p_cluster_mon \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone c_st_bigeye st_bigeye \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone c_st_blindpig st_blindpig \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_gfs2-vz-config p_gfs2-vz-config \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_gfs2-vz-storage p_gfs2-vz-storage \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-backuppc p_lvm-backuppc \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-dogfish-ha p_lvm-dogfish-ha \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-hopmon p_lvm-hopmon \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-hoptical p_lvm-hoptical \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-hopvpn p_lvm-hopvpn \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-musicbrainz p_lvm-musicbrainz \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-spacewalk p_lvm-spacewalk \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-vhosts p_lvm-vhosts \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-vz p_lvm-vz \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
clone clone_lvm-win7 p_lvm-win7 \
    meta clone-max="2" notify="true" target-role="Started" interleave="true" is-managed="true"
location cli-prefer-p_cluster_mon c_cluster_mon \
    rule $id="cli-prefer-rule-p_cluster_mon" inf: #uname eq bigeye
location cli-prefer-p_ip-10.69.1.1 p_ip-10.69.1.1 \
    rule $id="cli-prefer-rule-p_ip-10.69.1.1" inf: #uname eq blindpig
location cli-prefer-p_ip-10.69.1.2 p_ip-10.69.1.2 \
    rule $id="cli-prefer-rule-p_ip-10.69.1.2" inf: #uname eq bigeye
location cli-prefer-p_vd-backuppc-ha p_vd-backuppc-ha \
    rule $id="cli-prefer-rule-p_vd-backuppc-ha" inf: #uname eq blindpig
location cli-prefer-p_vd-dogfish-ha p_vd-dogfish-ha \
    rule $id="cli-prefer-rule-p_vd-dogfish-ha" inf: #uname eq blindpig
location cli-prefer-p_vd-hopmon-ha p_vd-hopmon-ha \
    rule $id="cli-prefer-rule-p_vd-hopmon-ha" inf: #uname eq bigeye
location cli-prefer-p_vd-hoptical-ha p_vd-hoptical-ha \
    rule $id="cli-prefer-rule-p_vd-hoptical-ha" inf: #uname eq bigeye
location cli-prefer-p_vd-hopvpn-ha p_vd-hopvpn-ha \
    rule $id="cli-prefer-rule-p_vd-hopvpn-ha" inf: #uname eq bigeye
location cli-prefer-p_vd-musicbrainz-ha p_vd-musicbrainz-ha \
    rule $id="cli-prefer-rule-p_vd-musicbrainz-ha" inf: #uname eq blindpig
location cli-prefer-p_vd-spacewalk-ha p_vd-spacewalk-ha \
    rule $id="cli-prefer-rule-p_vd-spacewalk-ha" inf: #uname eq blindpig
location cli-prefer-p_vd-vhosts-ha p_vd-vhosts-ha \
    rule $id="cli-prefer-rule-p_vd-vhosts-ha" inf: #uname eq bigeye
location cli-prefer-p_vd-win7-ha p_vd-win7-ha \
    rule $id="cli-prefer-rule-p_vd-win7-ha" inf: #uname eq blindpig
location drbd_backuppc_excl ms_drbd-backuppc \
    rule $id="drbd_backuppc_excl-rule" -inf: #uname eq blindpig2
location drbd_dogfish-ha_excl ms_drbd-dogfish-ha \
    rule $id="drbd_dogfish-ha_excl-rule" -inf: #uname eq blindpig2
location drbd_hopmon_excl ms_drbd-hopmon \
    rule $id="drbd_hopmon_excl-rule" -inf: #uname eq blindpig2
location drbd_hoptical_excl ms_drbd-hoptical \
    rule $id="drbd_hoptical_excl-rule" -inf: #uname eq blindpig2
location drbd_hopvpn_excl ms_drbd-hopvpn \
    rule $id="drbd_hopvpn_excl-rule" -inf: #uname eq blindpig2
location drbd_musicbrainz_excl ms_drbd-musicbrainz \
    rule $id="drbd_musicbrainz_excl-rule" -inf: #uname eq blindpig2
location drbd_vhost_excl ms_drbd-vhosts \
    rule $id="drbd_vhosts_excl-rule" -inf: #uname eq blindpig2
colocation c_gfs-vz-config_on_master inf: clone_gfs2-vz-config ms_drbd-vz:Master
colocation c_gfs-vz-storage_on_master inf: clone_gfs2-vz-storage ms_drbd-vz:Master
colocation c_lvm-backuppc_on_drbd-backuppc inf: clone_lvm-backuppc ms_drbd-backuppc:Master
colocation c_lvm-dogfish-ha_on_drbd-dogfish-ha inf: clone_lvm-dogfish-ha ms_drbd-dogfish-ha:Master
colocation c_lvm-hopmon_on_drbd-hopmon inf: clone_lvm-hopmon ms_drbd-hopmon:Master
colocation c_lvm-hoptical_on_drbd-hoptical inf: clone_lvm-hoptical ms_drbd-hoptical:Master
colocation c_lvm-hopvpn_on_drbd-hopvpn inf: clone_lvm-hopvpn ms_drbd-hopvpn:Master
colocation c_lvm-musicbrainz_on_drbd-musicbrainz inf: clone_lvm-musicbrainz ms_drbd-musicbrainz:Master
colocation c_lvm-spacewalk_on_drbd-spacewalk inf: clone_lvm-spacewalk ms_drbd-spacewalk:Master
colocation c_lvm-vhosts_on_drbd-vhosts inf: clone_lvm-vhosts ms_drbd-vhosts:Master
colocation c_lvm-vz_on_drbd-vz inf: clone_lvm-vz ms_drbd-vz:Master
colocation c_lvm-win7_on_drbd-win7 inf: clone_lvm-win7 ms_drbd-win7:Master
colocation c_vd-backuppc-on-master inf: p_vd-backuppc-ha ms_drbd-backuppc:Master
colocation c_vd-dogfish-ha-on-master inf: p_vd-dogfish-ha ms_drbd-dogfish-ha:Master
colocation c_vd-hopmon-on-master inf: p_vd-hopmon-ha ms_drbd-hopmon:Master
colocation c_vd-hoptical-on-master inf: p_vd-hoptical-ha ms_drbd-hoptical:Master
colocation c_vd-hopvpn-on-master inf: p_vd-hopvpn-ha ms_drbd-hopvpn:Master
colocation c_vd-musicbrainz-on-master inf: p_vd-musicbrainz-ha ms_drbd-musicbrainz:Master
colocation c_vd-spacewalk-on-master inf: p_vd-spacewalk-ha ms_drbd-spacewalk:Master
colocation c_vd-vhosts-on-master inf: p_vd-vhosts-ha ms_drbd-vhosts:Master
colocation c_vd-win7-on-master inf: p_vd-win7-ha ms_drbd-win7:Master
order o_drbm-lvm-gfs2-vz-config-storage inf: ms_drbd-vz:promote clone_lvm-vz:start clone_gfs2-vz-config:start clone_gfs2-vz-storage:start
order o_drbm-lvm-vd-start-backuppc inf: ms_drbd-backuppc:promote clone_lvm-backuppc:start p_vd-backuppc-ha:start
order o_drbm-lvm-vd-start-dogfish-ha inf: ms_drbd-dogfish-ha:promote clone_lvm-dogfish-ha:start p_vd-dogfish-ha:start
order o_drbm-lvm-vd-start-hopmon inf: ms_drbd-hopmon:promote clone_lvm-hopmon:start p_vd-hopmon-ha:start
order o_drbm-lvm-vd-start-hoptical inf: ms_drbd-hoptical:promote clone_lvm-hoptical:start p_vd-hoptical-ha:start
order o_drbm-lvm-vd-start-hopvpn inf: ms_drbd-hopvpn:promote clone_lvm-hopvpn:start p_vd-hopvpn-ha:start
order o_drbm-lvm-vd-start-musicbrainz inf: ms_drbd-musicbrainz:promote clone_lvm-musicbrainz:start p_vd-musicbrainz-ha:start
order o_drbm-lvm-vd-start-spacewalk inf: ms_drbd-spacewalk:promote clone_lvm-spacewalk:start p_vd-spacewalk-ha:start
order o_drbm-lvm-vd-start-vhosts inf: ms_drbd-vhosts:promote clone_lvm-vhosts:start p_vd-vhosts-ha:start
order o_drbm-lvm-vd-start-win7 inf: ms_drbd-win7:promote clone_lvm-win7:start p_vd-win7-ha:start
order o_gfs_before_openvz inf: _rsc_set_ clone_gfs2-vz-config clone_gfs2-vz-storage
property $id="cib-bootstrap-options" \
    dc-version="1.1.7-6.el6-148fccfd5985c5590cc601123c6c16e966b85d14" \
    cluster-infrastructure="cman" \
    expected-quorum-votes="2" \
    stonith-enabled="true" \
    no-quorum-policy="ignore" \
    default-resource-stickiness="1" \
    last-lrm-refresh="1362432862" \
    maintenance-mode="off"
rsc_defaults $id="rsc-options" \
    resource-stickiness="1" \
    failure-timeout="60s"
</pre>

== Other References ==
* https://alteeve.ca/w
* https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial
* http://www.drbd.org/
* http://clusterlabs.org/

[[Category:Clustering]]
[[Category:How-to]]
[[Category:Linux]]
[[Category:Virtualization]]
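== Per-Guest Resource Pattern ==
The config above repeats one pattern per guest: a dual-primary DRBD master/slave set, an LVM clone colocated with it, a VirtualDomain primitive with <code>allow-migrate="true"</code>, and an order constraint of promote -> LVM -> VM. The sketch below applies that pattern to a hypothetical new guest named ''newvm''. All names here (<code>newvm</code>, the VG <code>drbd_newvm</code>, the XML path) are illustrative, and the <code>p_drbd-newvm</code> / <code>p_lvm-newvm</code> params are standard <code>ocf:linbit:drbd</code> / <code>ocf:heartbeat:LVM</code> usage, not copied from this cluster's (unshown) drbd/lvm primitives:
<source lang="bash">
# Hedged sketch: per-guest resource set for a hypothetical guest "newvm".
crm configure <<'EOF'
primitive p_drbd-newvm ocf:linbit:drbd \
    params drbd_resource="newvm"
primitive p_lvm-newvm ocf:heartbeat:LVM \
    params volgrpname="drbd_newvm"
primitive p_vd-newvm-ha ocf:heartbeat:VirtualDomain \
    params config="/etc/libvirt/qemu/newvm-ha.xml" migration_transport="ssh" hypervisor="qemu:///system" \
    meta allow-migrate="true" target-role="Started"
ms ms_drbd-newvm p_drbd-newvm \
    meta master-max="2" clone-max="2" notify="true" interleave="true"
clone clone_lvm-newvm p_lvm-newvm \
    meta clone-max="2" notify="true" interleave="true"
colocation c_lvm-newvm_on_drbd-newvm inf: clone_lvm-newvm ms_drbd-newvm:Master
colocation c_vd-newvm-on-master inf: p_vd-newvm-ha ms_drbd-newvm:Master
order o_drbd-lvm-vd-start-newvm inf: ms_drbd-newvm:promote clone_lvm-newvm:start p_vd-newvm-ha:start
commit
EOF

# A live migration is then just a resource move; clear the location
# constraint it creates once the guest has landed:
crm resource migrate p_vd-spacewalk-ha bigeye
crm resource unmigrate p_vd-spacewalk-ha
</source>
Because both nodes hold the DRBD resource Primary and the CLVM LV is active cluster-wide, the move triggers a libvirt live migration over ssh rather than a stop/start.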