blue-ox.nl From coffee-fueled fruity tech to fast runs—think different, let’s run them.

Raspberry Pi System Backup: A Complete Guide

Introduction

This guide details a robust backup solution for Raspberry Pi systems, featuring Docker container management, service handling, and smart retention policies. The solution uses rsync over SSH to create efficient, incremental backups to a Synology NAS.

Core Features & Technical Analysis

Service and Container Management

# Smart handling of critical services
CRITICAL_SERVICES=(
    "mariadb"
    "postgres"
)

# Docker container handling
CRITICAL_CONTAINERS=(
    "immich_postgres"
    "immich_redis"
    "immich_server"
)

Technical Details:

  • Proactive service state detection
  • Docker container pause/unpause instead of stop/start for faster recovery
  • State restoration regardless of backup outcome
  • Graceful failure handling with detailed logging
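
The "state restoration regardless of backup outcome" point can be sketched with an EXIT trap. This is a minimal illustration, not the script's own code: `pause_unit`, `resume_unit`, and `run_with_paused_units` are hypothetical names, and the two helpers only print what they would do, where a real script would call `docker pause` and `docker unpause`.

```shell
#!/bin/sh
# Sketch: pause a set of units, run a job, and always resume them afterwards.
# pause_unit/resume_unit are hypothetical placeholders that only print; a real
# script would wrap `docker pause` / `docker unpause` here.

pause_unit()  { echo "pause $1"; }
resume_unit() { echo "resume $1"; }

# run_with_paused_units "<space-separated units>" command [args...]
# The function body is a subshell, so the trap stays local to this call.
run_with_paused_units() (
    units=$1
    shift
    paused=""

    # Resume only what was actually paused.
    restore() {
        for u in $paused; do
            resume_unit "$u"
        done
    }
    # The EXIT trap fires whether the job succeeds or fails.
    trap restore EXIT

    for u in $units; do
        pause_unit "$u" && paused="$paused $u"
    done

    "$@"
    status=$?
    exit "$status"
)
```

Calling `run_with_paused_units "immich_server immich_redis" false` still resumes both units even though the job (`false`) fails, and the failure status is passed back to the caller.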

Backup Integrity

# Critical files monitored
CRITICAL_FILES=(
    "/etc/passwd"
    "/etc/shadow"
    "/etc/fstab"
    "/boot/config.txt"
)

Verification Process:

  1. SHA256 checksum generation for critical files
  2. Remote verification after transfer
  3. Automatic cleanup of temporary verification files
  4. Intelligent handling of empty critical files list
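
The verification steps above can be sketched as a checksum round trip. The example below is an assumption-laden illustration that works on two local directories rather than over SSH; `write_manifest` and `verify_copy` are hypothetical helper names, and the manifest path must be absolute because both helpers change directory.

```shell
#!/bin/sh
# Sketch of the checksum round trip using local directories only.
# write_manifest and verify_copy are hypothetical helper names.

# write_manifest ROOT MANIFEST [FILE...]
# Records SHA256 sums of the given files (paths relative to ROOT).
# MANIFEST must be an absolute path.
write_manifest() {
    wm_root=$1
    wm_manifest=$2
    shift 2
    : > "$wm_manifest"
    # An empty file list is valid: the manifest simply stays empty.
    if [ "$#" -eq 0 ]; then
        return 0
    fi
    (cd "$wm_root" && sha256sum "$@") > "$wm_manifest"
}

# verify_copy COPY_ROOT MANIFEST
# Recomputes the sums inside COPY_ROOT and compares them with the manifest.
verify_copy() {
    vc_root=$1
    vc_manifest=$2
    # sha256sum -c rejects an empty manifest, so treat it as "nothing to check".
    if [ ! -s "$vc_manifest" ]; then
        return 0
    fi
    (cd "$vc_root" && sha256sum -c "$vc_manifest" >/dev/null)
}
```

In a real run, the second step would execute on the NAS against the transferred files, and the manifest would be deleted afterwards.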

Resource Management

# Performance optimization
nice -n 19 ionice -c2 -n7 sudo rsync

Benefits:

  • System remains responsive during backup
  • I/O prioritization prevents resource starvation
  • Safe for production environments
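
A small wrapper makes the same priorities reusable for any command. This is a sketch under the assumption that `ionice` may be missing or not permitted on some systems, in which case it falls back to `nice` alone; `run_low_priority` is a hypothetical name.

```shell
#!/bin/sh
# run_low_priority command [args...]
# Hypothetical wrapper: lowest CPU priority, plus lowest best-effort I/O
# priority when ionice is available and usable.
run_low_priority() {
    if command -v ionice >/dev/null 2>&1 && ionice -c2 -n7 true 2>/dev/null; then
        nice -n 19 ionice -c2 -n7 "$@"
    else
        nice -n 19 "$@"
    fi
}
```

The wrapped command's exit status is passed through unchanged, so the wrapper can sit in front of `rsync` without affecting error handling.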

Smart Retention

# Retention configuration
WEEKLY_RETENTION=4    # Last 4 weeks
MONTHLY_RETENTION=6   # Last 6 months
YEARLY_RETENTION=2    # Last 2 years

Implementation Details:

  • Hard-link based retention for space efficiency
  • Automatic cleanup of expired backups
  • Calendar-aware rotation (weekly on Sundays, monthly on the 1st, yearly on January 1st); the script at the end of this guide implements only the weekly tier
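
The rotation rules can be sketched as two helpers: one decides which tiers a given date belongs to, the other prunes a tier down to its retention count. Both names (`snapshot_tiers`, `prune_snapshots`) are hypothetical, GNU `date -d` is assumed, and pruning here sorts by snapshot name rather than by modification time as the full script does.

```shell
#!/bin/sh
# Sketch of calendar-aware tier selection and count-based pruning.
# snapshot_tiers and prune_snapshots are hypothetical helper names.
# Assumes GNU date (for -d), as shipped with Raspberry Pi OS.

# snapshot_tiers YYYY-MM-DD
# Prints the tiers that should receive a snapshot on that date.
snapshot_tiers() {
    st_day=$1
    st_tiers=""
    if [ "$(date -d "$st_day" +%u)" = "7" ]; then        # Sunday
        st_tiers="$st_tiers weekly"
    fi
    if [ "$(date -d "$st_day" +%d)" = "01" ]; then       # 1st of the month
        st_tiers="$st_tiers monthly"
    fi
    if [ "$(date -d "$st_day" +%m-%d)" = "01-01" ]; then # January 1st
        st_tiers="$st_tiers yearly"
    fi
    # Trim the leading space before printing.
    echo "${st_tiers# }"
}

# prune_snapshots DIR KEEP
# Removes all but the KEEP newest snapshot directories in DIR. Names such as
# 2024-W03 or 2024-01 sort chronologically, so a reverse name sort is enough.
prune_snapshots() {
    ps_dir=$1
    ps_keep=$2
    ls -1 "$ps_dir" | sort -r | tail -n +"$((ps_keep + 1))" |
        while IFS= read -r ps_name; do
            rm -rf -- "${ps_dir:?}/$ps_name"
        done
}
```

For example, `snapshot_tiers 2024-01-07` prints `weekly`, and `prune_snapshots /path/to/weekly 4` would leave only the four newest weekly directories.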

Error Handling & Logging

# Comprehensive error states
case $exit_code in
    0)  return 0 ;;  # Success
    23) log_message "WARNING: Some files were not transferred" ;;
    12) log_message "ERROR: rsync connection failed" ;;
    *)  log_message "ERROR: Backup failed with code $exit_code" ;;
esac
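
One way to extend this is to separate the message from the decision about whether a code is fatal. The sketch below uses hypothetical function names; the code meanings follow the EXIT VALUES section of the rsync man page, and treating 23 and 24 as non-fatal is a policy assumption that mirrors the WARNING level used above.

```shell
#!/bin/sh
# describe_rsync_exit and rsync_exit_is_fatal are hypothetical helper names.

# describe_rsync_exit CODE
# Prints a log-ready description of an rsync exit code.
describe_rsync_exit() {
    case $1 in
        0)  echo "OK: transfer completed" ;;
        23) echo "WARNING: partial transfer due to error" ;;
        24) echo "WARNING: some source files vanished during transfer" ;;
        12) echo "ERROR: error in rsync protocol data stream" ;;
        30) echo "ERROR: timeout in data send/receive" ;;
        *)  echo "ERROR: rsync exited with code $1" ;;
    esac
}

# rsync_exit_is_fatal CODE
# Returns 0 (true) when the backup should be considered failed.
rsync_exit_is_fatal() {
    case $1 in
        0|23|24) return 1 ;;
        *)       return 0 ;;
    esac
}
```

Code 24 is common when backing up a live system, because files such as temporary or spool entries can disappear between the file-list scan and the transfer.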

Implementation Guide

Initial Setup

  1. Create SSH Infrastructure:
# Generate backup-specific key
sudo ssh-keygen -t ed25519 -f /root/.ssh/id_ed25519_backup -N ""

# Configure Synology NAS (choose one):
# Option 1: Append to existing keys
echo "PASTE_YOUR_PUBLIC_KEY" >> /var/services/homes/[username]/.ssh/authorized_keys

# Option 2: Create new authorized_keys (caution!)
echo "PASTE_YOUR_PUBLIC_KEY" > /var/services/homes/[username]/.ssh/authorized_keys
  2. Prepare Backup Location:
# On Synology NAS
mkdir -p /volume1/RPi-archive/[hostname]/current
chmod 700 /volume1/RPi-archive/[hostname]

Script Installation

  1. Deploy Script:
sudo cp backup-script.sh /usr/local/sbin/rpi-system-backup.sh
sudo chmod 700 /usr/local/sbin/rpi-system-backup.sh
  2. Configure Automation:
# Add to root's crontab
sudo crontab -e

# Example: Run at 2 AM every Sunday
0 2 * * 0 /usr/local/sbin/rpi-system-backup.sh

Recovery Procedures

Current Backup Restore

# Mount target SD card
sudo mount /dev/sdX2 /mnt/restore
sudo mount /dev/sdX1 /mnt/restore/boot

# Restore using rsync
sudo rsync -aAXv --delete \
    user@nas:/volume1/RPi-archive/hostname/current/ /mnt/restore/

Historical Backup Restore

# From weekly backup
sudo rsync -aAXv --delete \
    user@nas:/volume1/RPi-archive/hostname/weekly/2024-W03/ /mnt/restore/

# From monthly backup
sudo rsync -aAXv --delete \
    user@nas:/volume1/RPi-archive/hostname/monthly/2024-01/ /mnt/restore/

Known Limitations & Solutions

ACL Handling

Testing revealed permission errors when ACLs are transferred to the NAS. There are two workarounds:

  1. Disable ACL transfer (recommended):
RSYNC_OPTS="-aX --delete --timeout=120 --no-specials --copy-unsafe-links --partial --quiet --no-acls"
  2. Configure NAS ACL support:
sudo synoacltool -enable /volume1/RPi-archive
sudo chmod -R 770 /volume1/RPi-archive

Impact of disabling ACLs:

  • Basic permissions (rwx) preserved
  • Special access controls lost but can be manually restored
  • Suitable for most Pi deployments

Other Considerations

  1. Network Dependency
  • Backup fails if network unreachable
  • Implement timeout handling
  • Consider local staging
  2. Storage Requirements
  • Space verification before backup
  • Hard links minimize space usage
  • Automatic cleanup of old backups
  3. Security
  • Dedicated SSH key
  • Limited NAS user permissions
  • Secure key storage
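
For the network dependency, a bounded retry around the connectivity check is one option. The sketch below is not part of the script; `retry` is a hypothetical helper that reruns a command a fixed number of times with a pause between attempts.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS command [args...]
# Hypothetical helper: runs the command until it succeeds or the attempt
# budget is used up. Returns 0 on success, 1 if every attempt failed.
retry() {
    rt_attempts=$1
    rt_delay=$2
    shift 2
    rt_n=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$rt_n" -ge "$rt_attempts" ]; then
            return 1
        fi
        rt_n=$((rt_n + 1))
        sleep "$rt_delay"
    done
}
```

A pre-flight check could then read `retry 3 30 ssh -o ConnectTimeout=10 "$REMOTE_HOST" true`, which gives the NAS three chances to answer, 30 seconds apart, before the backup is abandoned.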

Best Practices

Monitoring

# Check backup logs
sudo tail -f /var/log/rpi-system-backup.log

# Verify backup integrity
sudo sha256sum -c /tmp/backup_checksums
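
Rather than reading the log by eye, the outcome of the most recent run can be extracted automatically. The sketch below assumes the `timestamp - message` line format and the message texts produced by the script in this guide; `last_run_status` is a hypothetical helper name.

```shell
#!/bin/sh
# last_run_status LOGFILE
# Hypothetical helper: prints OK, FAILED or UNKNOWN for the most recent run
# recorded in the log. Assumes lines of the form "<timestamp> - <message>".
last_run_status() {
    awk '
        # A new run begins: forget the outcome of earlier runs.
        / - Starting backup process$/ { status = "UNKNOWN" }
        # Any error inside the run marks it as failed.
        / - ERROR: / { status = "FAILED" }
        / - Backup process completed successfully$/ {
            if (status != "FAILED") status = "OK"
        }
        END { print (status == "" ? "UNKNOWN" : status) }
    ' "$1"
}
```

A run that was interrupted before logging either outcome is reported as UNKNOWN, which is worth alerting on as well.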

Testing

  1. Regular dry runs:
sudo /usr/local/sbin/rpi-system-backup.sh --dry-run
  2. Periodic test restores
  3. Service state verification

The Script

#!/bin/bash
set -e
set -u

# Configuration
HOSTNAME=$(hostname)
REMOTE_HOST="root"
LOG="/var/log/rpi-system-backup.log"
LOG_MAX_SIZE=10M
LOG_BACKUP_COUNT=7
LOCK_TIMEOUT=3600
REMOTE_PATH="/volume1/RPi-archive/${HOSTNAME}"

# Backup exclusions
EXCLUDE_LIST=(
    "/proc"
    "/sys"
    "/dev"
    "/tmp"
    "/run"
    "/lost+found"
    "/var/log"
    "/var/cache/apt/archives"
    "/home/*/.cache"
    "/media"
    "/mnt"
    # Docker-specific excludes
    "/var/lib/docker/overlay2"
    "/var/lib/docker/containers"
    "/var/lib/docker/volumes"
    "/var/lib/docker/tmp"
    "/var/lib/docker/**/*.sock"
    # Database- and application-specific excludes
    "/home/*/unifi/data/data/db"  # UniFi database directory
    "/home/*/unifi/data/db"       # Alternative UniFi database location
    # System-specific excludes
    "/var/run/systemd/inaccessible"
    "/var/run/user/*/systemd/inaccessible"
    "/var/run/**/*.sock"
    "/opt/pivpn/scripts"
    "/usr/local/src/pivpn/scripts"
    # Swapfile exclude
    "/swapfile"
    "/var/swap"
)

# Services to pause during backup
# Add your critical services here, for example:
# CRITICAL_SERVICES=(
#     "mariadb"
#     "postgres"
# )
CRITICAL_SERVICES=()

# Docker containers to pause
# Add your critical containers here, for example:
#CRITICAL_CONTAINERS=(
# media containers
#    "immich_postgres"
#    "immich_redis"
#    "immich_server"
# network containers
#    "netmngr-app-1"
#    "unifi-controller"
#    "adguardhome"
#    "grafana"
#    "prometheus"
#    "cadvisor"
#    "uptime-kuma"
#    "ddclient"
#    "portainer"
# )
CRITICAL_CONTAINERS=()

# Retention configuration (only weekly is needed)
WEEKLY_RETENTION=10

# Performance settings
DRY_RUN=false
NICE_LEVEL=19
IO_CLASS="idle"

# Lock file
LOCK_FILE="/tmp/rpi_system_backup.lock"

# Logging function with rotation
log_message() {
    local timestamp=$(date '+%Y-%m-%d %H:%M:%S')
    echo "$timestamp - $1" | tee -a "$LOG"
}

log_debug() {
    if [ "$DRY_RUN" = true ]; then
        log_message "DEBUG: $1"
    fi
}

# Log rotation check
rotate_logs() {
    if [ -f "$LOG" ]; then
        local size
        size=$(stat -f%z "$LOG" 2>/dev/null || stat -c%s "$LOG" 2>/dev/null)
        local max_size
        max_size=$(numfmt --from=iec "$LOG_MAX_SIZE")

        if [ "${size:-0}" -gt "$max_size" ]; then
            for i in $(seq $((LOG_BACKUP_COUNT-1)) -1 1); do
                [ -f "${LOG}.$i" ] && mv "${LOG}.$i" "${LOG}.$((i+1))"
            done
            mv "$LOG" "${LOG}.1"
            touch "$LOG"
        fi
    fi
}

# Lock management
check_lock() {
    if [ -f "$LOCK_FILE" ]; then
        local pid
        pid=$(cat "$LOCK_FILE" 2>/dev/null || echo "")
        if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
            local lock_age
            lock_age=$(($(date +%s) - $(stat -c %Y "$LOCK_FILE")))
            if [ "$lock_age" -gt "$LOCK_TIMEOUT" ]; then
                log_message "Lock file is stale (age: ${lock_age}s). Removing."
                rm -f "$LOCK_FILE"
            else
                log_message "Another backup process is running (PID: $pid)"
                exit 1
            fi
        fi
    fi
    echo $$ > "$LOCK_FILE"
}

# Cleanup function
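cleanup() {
    # Under `set -e`, a failing restart would abort the trap before the lock
    # file is removed, so log the failure and continue instead.
    handle_services start || log_message "WARNING: Failed to restore some services or containers"
    rm -f "$LOCK_FILE"
    log_message "Cleanup completed"
}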

# Handle services
handle_services() {
    local action=$1
    local failed=0
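    log_message "Running '${action}' for critical services and containers"

    # System services. On "stop", remember which services were running so that
    # "start" restores exactly those (a stopped service is never "active").
    for service in "${CRITICAL_SERVICES[@]}"; do
        if [ "$action" = "stop" ]; then
            systemctl is-active --quiet "$service" || continue
            STOPPED_SERVICES="${STOPPED_SERVICES:-} $service"
        elif [[ " ${STOPPED_SERVICES:-} " != *" $service "* ]]; then
            continue
        fi
        if ! systemctl "$action" "$service"; then
            log_message "Failed to $action $service"
            failed=1
        fi
    done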

    # Docker containers
    if command -v docker >/dev/null 2>&1; then
        for container in "${CRITICAL_CONTAINERS[@]}"; do
            # docker ps exits 0 even without a match, so test its output instead
            if [ -n "$(docker ps -q -f name="$container")" ]; then
                if [ "$action" = "stop" ]; then
                    if ! docker inspect --format '{{.State.Paused}}' "$container" | grep -q "true"; then
                        docker pause "$container" || failed=1
                    fi
                else
                    if docker inspect --format '{{.State.Paused}}' "$container" | grep -q "true"; then
                        docker unpause "$container" || failed=1
                    fi
                fi
            fi
        done
    fi

    return $failed
}

# Check available space
check_space() {
    log_message "Checking free space..."

    local required_space
    required_space=$(du -sx --exclude=/proc --exclude=/sys --exclude=/dev / 2>/dev/null | awk '{print $1}')
    local available_space

    if ! available_space=$(ssh "$REMOTE_HOST" "df -k '${REMOTE_PATH}' | tail -1 | awk '{print \$4}'"); then
        log_message "ERROR: Failed to check remote space"
        return 1
    fi

    if [ -z "$available_space" ] || [ -z "$required_space" ]; then
        log_message "ERROR: Failed to determine space requirements"
        return 1
    fi

    if [ "$available_space" -lt "$required_space" ]; then
        log_message "ERROR: Insufficient space. Required: ${required_space}KB, Available: ${available_space}KB"
        return 1
    fi

    log_message "Space check passed (${available_space}KB available)"
    return 0
}

# Initialize backup structure
initialize_backup_structure() {
    log_message "Initializing backup directory structure..."

    local init_command="mkdir -p '${REMOTE_PATH}/current' '${REMOTE_PATH}/weekly' && chown Erik:users '${REMOTE_PATH}/current' '${REMOTE_PATH}/weekly' && chmod 755 '${REMOTE_PATH}/current' '${REMOTE_PATH}/weekly'"
    log_debug "Executing: $init_command"

    if ! ssh "$REMOTE_HOST" "$init_command" 2>&1; then
        log_message "ERROR: Failed to initialize backup structure"
        return 1
    fi

    return 0
}

# Generate exclude parameters
generate_exclude_params() {
    local exclude_params=""
    for item in "${EXCLUDE_LIST[@]}"; do
        exclude_params+="--exclude=${item} "
    done
    echo "$exclude_params"
}

# Perform backup
perform_backup() {
    log_message "Starting backup process..."

    if ! initialize_backup_structure; then
        return 1
    fi

    local exclude_params
    exclude_params=$(generate_exclude_params)

    local rsync_opts=(
        -aAX            # Archive mode plus ACLs and extended attributes for a system backup
        --delete        # Delete extraneous files from the destination
        --numeric-ids   # Don't map uid/gid values by name
        --partial       # Keep partially transferred files
        --timeout=300   # I/O timeout of 5 minutes
        --rsync-path="/usr/bin/rsync"  # Explicit rsync path on remote
        --ignore-errors # Delete even if there are I/O errors
        --force         # Force deletion of dirs even if not empty
        --delete-missing-args  # Delete destination files for missing source args
    )

    # Add link-dest if previous backup exists
    if ssh "$REMOTE_HOST" "[ -d '${REMOTE_PATH}/weekly' ] && [ ! -z \"\$(ls -A '${REMOTE_PATH}/weekly')\" ]"; then
        local latest_weekly
        latest_weekly=$(ssh "$REMOTE_HOST" "ls -1t '${REMOTE_PATH}/weekly' | head -n1")
        if [ -n "$latest_weekly" ]; then
            log_debug "Found previous backup: weekly/${latest_weekly}"
            rsync_opts+=("--link-dest=${REMOTE_PATH}/weekly/${latest_weekly}")
        fi
    fi

    # Add verbose options for dry run
    if [ "$DRY_RUN" = true ]; then
        rsync_opts+=(
            "--dry-run"
            "--verbose"
            "--itemize-changes"
        )
    fi

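    # Execute rsync. The options are passed as an array, so no eval is needed.
    # $exclude_params is left unquoted on purpose so that it splits into one
    # --exclude argument per pattern.
    log_message "Executing rsync command: nice -n ${NICE_LEVEL} ionice -c2 -n7 rsync ${rsync_opts[*]} $exclude_params / ${REMOTE_HOST}:${REMOTE_PATH}/current/"

    if ! nice -n "$NICE_LEVEL" ionice -c2 -n7 rsync "${rsync_opts[@]}" $exclude_params / "${REMOTE_HOST}:${REMOTE_PATH}/current/" 2> >(tee -a "$LOG" >&2); then
        log_message "ERROR: rsync failed"
        return 1
    fi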

    log_message "rsync completed successfully"
    return 0
}


# Create snapshot of the current backup
create_snapshot() {
    local target_dir="$1"
    local retention="$2"
    local snapshot_type="$3"

    log_message "Creating ${snapshot_type} snapshot in ${target_dir}"

    # Run rsync on the remote host itself, with sudo to obtain root privileges
    if ! ssh "$REMOTE_HOST" "sudo rsync -a \
              -H \
              -x \
              --delete \
              --numeric-ids \
              --inplace \
              --partial \
              --modify-window=1 \
              --link-dest='${REMOTE_PATH}/current' \
              '${REMOTE_PATH}/current/' \
              '${target_dir}/'"
    then
        log_message "ERROR: Failed to create ${snapshot_type} snapshot"
        return 1
    fi

    # Clean up old snapshots (also with sudo)
    local cleanup_command="cd \"\$(dirname '${target_dir}')\" && ls -1t | tail -n +$((retention + 1)) | xargs -r sudo rm -rf"
    log_debug "Executing cleanup command: $cleanup_command"

    if ! ssh "$REMOTE_HOST" "$cleanup_command" 2>&1; then
        log_message "ERROR: Failed to cleanup old ${snapshot_type} snapshots"
        return 1
    fi
    return 0
}

# Main execution
main() {
    log_message "Starting backup process"
    rotate_logs
    check_lock
    trap cleanup EXIT INT TERM

    if ! check_space; then
        exit 1
    fi

    if ! handle_services stop; then
        log_message "Failed to stop services"
        handle_services start
        exit 1
    fi

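    if perform_backup; then
        if [ "$DRY_RUN" = true ]; then
            # A dry run must not create or prune real snapshots
            log_message "Dry run finished; skipping snapshot creation"
        else
            # %G is the ISO week-numbering year that belongs to week %V
            local week
            week=$(date +%G-W%V)
            local weekly_target="${REMOTE_PATH}/weekly/${week}"

            if ! create_snapshot "$weekly_target" "$WEEKLY_RETENTION" "weekly"; then
                log_message "ERROR: Snapshot creation failed"
                exit 1
            fi

            log_message "Backup process completed successfully"
        fi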
    else
        log_message "ERROR: Backup process failed"
        exit 1
    fi

}

# Parse command line arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        --dry-run)
            DRY_RUN=true
            shift
            ;;
        *)
            log_message "Unknown parameter: $1"
            exit 1
            ;;
    esac
done

# Execute main
if [ "$DRY_RUN" = true ]; then
    log_message "Performing dry run..."
fi

main
exit 0
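
The lock handling in the script checks for the lock file and then writes it in a separate step, which leaves a short window in which two runs could both proceed. A possible alternative, sketched here and not part of the original script, is a directory lock: `mkdir` either creates the directory or fails, so only one process can win. The function names and the `LOCK_DIR` path are assumptions, and stale-lock detection is deliberately left out.

```shell
#!/bin/sh
# Directory-based lock sketch. acquire_lock/release_lock are hypothetical
# names and LOCK_DIR is an assumed location, not used by the script above.
LOCK_DIR="${LOCK_DIR:-/tmp/rpi_system_backup.lock.d}"

# acquire_lock
# Returns 0 if this process now holds the lock, 1 if someone else does.
acquire_lock() {
    # mkdir is atomic: it creates the directory or fails, never both.
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        # Record the owner so a human can see who holds the lock.
        echo "$$" > "$LOCK_DIR/pid"
        return 0
    fi
    return 1
}

# release_lock
# Removes the lock directory; intended to be called from an EXIT trap.
release_lock() {
    rm -rf -- "$LOCK_DIR"
}
```

Typical use would be `acquire_lock || exit 1` followed by `trap release_lock EXIT` at the start of the run.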

About the author

By Erik