I have an interesting task on my hands!
Large volumes of data have to be copied over long distances (in my case, data from an American company is to be copied to Austria). We are talking about roughly 500-800 TB (terabytes). The data are measurement data and accrue daily, with between 10 and 20 files created each day. The daily volume is about 600-1000 GB.
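To get a feeling for what this daily volume means for the link, the required sustained rate can be estimated. The sketch below assumes decimal units (1 GB = 8000 Mbit) and a transfer spread evenly over 24 hours; the function name required_mbit is only for illustration and is not part of my script.

```shell
#!/bin/bash
# required_mbit GB_PER_DAY
# Prints the sustained rate in Mbit/s needed to move GB_PER_DAY decimal
# gigabytes within 24 hours (86400 seconds). Protocol overhead is ignored.
required_mbit() {
    awk -v gb="$1" 'BEGIN { printf "%.1f\n", gb * 8000 / 86400 }'
}

# Example calls (uncomment to run):
# required_mbit 600
# required_mbit 1000
```

For the figures above this gives about 55.6 Mbit/s for 600 GB per day and about 92.6 Mbit/s for 1000 GB per day, before any overhead.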
The following has been agreed:
My job is now to copy the data to a storage system and then free up the space on the server again. I have to be careful, of course, not to delete data that is currently being copied and not to copy data that has not yet been fully transferred (that is what the YYYY-MM-DD.done file is for).
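The marker check can be sketched on its own as follows. It assumes that the marker sits next to the daily directory and is named after it (for example 2024-01-01 and 2024-01-01.done); the function names ready_to_copy and list_ready are hypothetical.

```shell
#!/bin/bash
# ready_to_copy SRC_ROOT NAME
# Succeeds only if SRC_ROOT/NAME is a directory AND SRC_ROOT/NAME.done exists,
# i.e. the sender has signalled that the upload of this directory is complete.
ready_to_copy() {
    local src_root="$1" name="$2"
    [ -d "$src_root/$name" ] && [ -e "$src_root/$name.done" ]
}

# list_ready SRC_ROOT
# Prints the names of all directories below SRC_ROOT that are ready to copy.
list_ready() {
    local src_root="$1" entry name
    for entry in "$src_root"/*/; do
        [ -d "$entry" ] || continue   # the glob matched nothing
        name=$(basename "$entry")
        if ready_to_copy "$src_root" "$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

Directories without a marker are simply skipped and picked up by a later run once the marker has appeared.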
I connect the server to our storage by mounting the server on our storage via NFS, which also prevents anyone who hacks the server from gaining access to our network. What remains is that the server must never run out of space, and that I have to check whether the server is online and, of course, whether the NFS share is available.
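These three checks can be sketched as small helper functions. This is only an illustration under a few assumptions: the function names are hypothetical, ping -W is taken to be a timeout in seconds (as in Linux iputils), and mountpoint comes from util-linux. My script further down checks reachability with a plain ping and detects a missing mount by testing for the data directory instead.

```shell
#!/bin/bash
# usage_percent PATH
# Prints the used-space percentage of the filesystem that holds PATH.
usage_percent() {
    # df -P prints exactly one line per filesystem; column 5 is e.g. "42%"
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# has_space PATH LIMIT
# Succeeds while the usage of PATH is below LIMIT percent.
has_space() {
    local used
    used=$(usage_percent "$1")
    [ -n "$used" ] && [ "$used" -lt "$2" ]
}

# server_online HOST
# Succeeds if HOST answers a single ping within 5 seconds.
server_online() {
    ping -c 1 -W 5 "$1" > /dev/null 2>&1
}

# nfs_available DIR
# Succeeds if DIR is currently an active mount point.
nfs_available() {
    mountpoint -q "$1"
}

# Example (the 90 percent limit is an arbitrary illustration):
# has_space /companydata 90 || echo "server is running out of space"
```

Because df also works on NFS mounts, has_space can be pointed at the mounted server directory to watch the space on the remote side.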
I wrote a script for this, which I would like to share. The script is by now also used to handle other copy jobs in my network. A short description follows after the script:
#!/bin/bash
#
# cron starts this script every hour
#
# starts a backup with rsync if the nfs mount is active
# if mount-dir is not available, the script mounts dir using nfs for the next call
#
# EVENT R = get PID if script is running
# EVENT 0 = cron started
# EVENT 1 = copy started
# EVENT 2 = no startfile from company found
# EVENT 3 = ls input not a directory (maybe a file)
# EVENT 4 = directory not mounted, mounting for next cron run
# EVENT 5 = copy stopped, local startfile not found
# EVENT 6 = company machine not online
# EVENT 7 = backup exists, renaming .done file to .done_s and continue loop
# EVENT 8 = break running process
# EVENT 9 = stopping storage backup while copying company data
#
#
# MOUNT and START SETTINGS
#
# ip and dir for mounting
IP="xxx.xxx.xxx.xxx"
DIRONSERVER="/company"
# dir where company server is mounted to
DIR_TO_MOUNT="/companydata"
# directory where files are stored in (found in mounted directory)
DIR_TO_LOOKUP="/companyhome/data"
# company sets a startfile, if copy is finished...
CP_STARTFILE="START_COPY"
#
# LOCAL SETTINGS
#
# local dir to store data and where startfile "LOCAL_STARTFILE" is found (if not present copy will be stopped)
LOCAL_DIR="/tank/open/company"
LOCAL_STARTFILE="START_CP_SYNC"
# startfile for backup of storage
BACKUP_STARTFILE="/tank/start_backup"
# LOG-FILE SETTINGS
LOG_DIR="/var/log/company"
BACKUPLOG="companysync.log"
PIDLOG="companypid.log"
#
# PATH SETTINGS
#
# Path of executables
R_SYNC="/usr/bin/rsync"
# mount path
M_OUNT="/bin/mount"
# umount path
U_MOUNT="/bin/umount"
#
#
# SCRIPT STARTS HERE. DO NOT CHANGE SETTINGS BELOW
#
#
# MOUNT COMMAND
MOUNT_COMMAND="$M_OUNT -t nfs $IP:$DIRONSERVER $DIR_TO_MOUNT"
UMOUNT_COMMAND="$U_MOUNT $DIR_TO_MOUNT"
COPY_DIR=$DIR_TO_MOUNT$DIR_TO_LOOKUP
# create logdir if not present
if ! [ -d "$LOG_DIR" ]; then
    mkdir -p "$LOG_DIR"
fi
# Self running test
# PID - pid of the current script
#
PID=$$
# SCRIPTNAME - current name of the script without directory prefix
SCRIPTNAME=`basename "$0"`
# PIDFILE - where to write the current pid
PIDFILE="/tmp/$SCRIPTNAME.pid"
# ENDEXECUTION - if 1 then stop script, if 0 everything is ok and continue
ENDEXECUTION=0
if [ -f "$PIDFILE" ]; then
    RUNNINGPID=`cat "$PIDFILE"`
    echo "[EVENT R `date`] got pid $RUNNINGPID from pidfile '$PIDFILE'" >> $LOG_DIR/$PIDLOG
    # the pid from the pidfile only counts if a process with this script name still owns it
    PROGRAMPID=`ps xa | grep "$SCRIPTNAME" | grep -v grep | awk '{print $1;}'`
    for PIDEL in $PROGRAMPID
    do
        if [ "$PIDEL" == "$RUNNINGPID" ]; then
            ENDEXECUTION=1
            break
        fi
    done
fi
if [ "$ENDEXECUTION" == "1" ]
then
    echo "[EVENT R `date`] Current script '$SCRIPTNAME' is already running (pid $RUNNINGPID) - end execution" >> $LOG_DIR/$PIDLOG
    exit 1
fi
#
#
# writing PID to pidfile and start script
#
#
echo $PID > "$PIDFILE"
echo "[EVENT 0 `date`]--------------------- CRON COPY CP STARTED ---------------------------" >> $LOG_DIR/$BACKUPLOG
#
# Start online-check
#
# Ping server returns TRUE or exit
ping -c 1 $IP >> /dev/null
if [ "$?" == "0" ]; then
    # is there a startfile START_CP_SYNC? If not, copy is stopped manually
    if [ -e "$LOCAL_DIR/$LOCAL_STARTFILE" ]; then
        # is the directory found, where the files are stored? If not, dir is not mounted
        if [ -d "$COPY_DIR" ]; then
            for u in `ls $COPY_DIR`; do
                # company makes dir's to copy
                if [ -d "$COPY_DIR/$u" ]; then
                    # company sets a file called dirname.done if copy is finished
                    # if [ -e $COPY_DIR/$u/$CP_STARTFILE ]; then
                    if [ -e "$COPY_DIR/$u.done" ]; then
                        # dir is in backup, stop copy and continue loop
                        if [ -d "$LOCAL_DIR/$u" ]; then
                            echo "[EVENT 7 `date`] backup $LOCAL_DIR/$u exists! moving .done-file to .done_s and continue loop" >> $LOG_DIR/$BACKUPLOG
                            mv "$COPY_DIR/$u.done" "$COPY_DIR/$u.done_s" >> $LOG_DIR/$BACKUPLOG
                            continue
                        fi
                        # breaking running process
                        if [ -e "$LOCAL_DIR/BREAK" ]; then
                            echo "[EVENT 8 `date`] Break for running process initiated.." >> $LOG_DIR/$BACKUPLOG
                            # rm $LOCAL_DIR/"BREAK"
                            exit 1
                        fi
                        # stop backup storage while copying company-data
                        if [ -e "$BACKUP_STARTFILE" ]; then
                            echo "[EVENT 9 `date`] stopping storage backup while copying data.." >> $LOG_DIR/$BACKUPLOG
                            rm -f "$BACKUP_STARTFILE"
                        fi
                        echo "[EVENT 1 `date`] backup $COPY_DIR/$u started" >> $LOG_DIR/$BACKUPLOG
                        # never delete the source if rsync reported an error
                        if ! $R_SYNC -av "$COPY_DIR/$u" "$LOCAL_DIR" >> $LOG_DIR/$BACKUPLOG; then
                            echo "[EVENT 1 `date`] rsync failed, keeping $COPY_DIR/$u on the server" >> $LOG_DIR/$BACKUPLOG
                            continue
                        fi
                        echo "[EVENT 1 `date`] backup finished " >> $LOG_DIR/$BACKUPLOG
                        echo "[EVENT 1 `date`] removing dir $COPY_DIR/$u..." >> $LOG_DIR/$BACKUPLOG
                        rm -R -f "$COPY_DIR/$u/" >> $LOG_DIR/$BACKUPLOG
                        cp "$COPY_DIR/$u.done" "$LOCAL_DIR"
                        rm -f "$COPY_DIR/$u.done" >> $LOG_DIR/$BACKUPLOG
                        echo "-----------------------------------------------------------------------------------------------" >> $LOG_DIR/$BACKUPLOG
                    else
                        echo "[EVENT 2 `date`] Startfile $COPY_DIR/$u.done not found! Company maybe is still copying files.." >> $LOG_DIR/$BACKUPLOG
                    fi
                else
                    echo "[EVENT 3 `date`] $COPY_DIR/$u is not a valid directory, maybe a file?" >> $LOG_DIR/$BACKUPLOG
                fi
            done;
        # if dir is not mounted, mount it for next run
        else
            $UMOUNT_COMMAND
            $MOUNT_COMMAND
            echo "[EVENT 4 `date`] $IP:$DIRONSERVER not mounted, mounting for next run..." >> $LOG_DIR/$BACKUPLOG
        fi
    # no local startfile found, copy is stopped manually
    else
        echo "[EVENT 5 `date`] Copy is manually stopped! no startfile $LOCAL_STARTFILE... "`date` >> $LOG_DIR/$BACKUPLOG
    fi
# server does not answer ping, maybe server is offline
else
    echo "[EVENT 6 `date`] Company-Server not reachable" >> $LOG_DIR/$BACKUPLOG
fi
rm "$PIDFILE"
# start backup of storage
touch "$BACKUP_STARTFILE"
exit 0
Server
Storage (zfs)
The script above is started every hour, at minute 5, by a cron job running as root (5 * * * * /script/company_sync.sh).
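Since copying a full day of data can take longer than one hour, two cron runs must never overlap. My script handles this with a PID file; an alternative technique is a lock taken with flock from util-linux, sketched here. The helper name run_locked, the exit code 99 and the lock file path in the example are assumptions for illustration only.

```shell
#!/bin/bash
# run_locked LOCKFILE COMMAND [ARGS...]
# Runs COMMAND only if no other process currently holds the lock on LOCKFILE.
# Returns 99 without running COMMAND when the lock is already taken.
run_locked() {
    local lockfile="$1"
    shift
    (
        # flock -n does not wait: it fails at once if the lock is held elsewhere
        flock -n 9 || exit 99
        "$@"
    ) 9> "$lockfile"
}

# Example (hypothetical lock file path):
# run_locked /tmp/company_sync.lock /script/company_sync.sh
```

The kernel releases the lock as soon as the holding process ends, so a crashed run cannot leave a stale lock behind.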
All operations are recorded in a log file. The EVENT tags are intended to make it possible to search the log file for specific events. For example, this lists all EVENT 4 entries (directory not mounted):
grep "EVENT 4" /var/log/company/companysync.log
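Going one step further, the tags can also be counted to get a quick overview of a log file. The following is a small sketch that relies on every tagged line starting with "[EVENT <tag> ", as the script writes it; the function name count_events is hypothetical.

```shell
#!/bin/bash
# count_events LOGFILE
# Prints one line "<count> EVENT <tag>" for every tag that occurs in LOGFILE.
# Lines without a leading [EVENT ...] tag (e.g. rsync output) are ignored.
count_events() {
    sed -n 's/^\[EVENT \([0-9R]\) .*/EVENT \1/p' "$1" \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2, $3 }'
}

# Example:
# count_events /var/log/company/companysync.log
```

A rising count for EVENT 4 or EVENT 6, for instance, points to mount or network problems rather than to missing data.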