# Restic Data Backup and Restore Usage Guide

## Overview

Restic is a fast, secure, and efficient backup tool that is now fully integrated into the HuggingFace environment, supporting automatic backup and restore functionality.

## Quick Start

### 1. Basic Configuration

Edit the configuration file as needed:

```bash
# Configuration file is already available in container
# Edit configuration file directly
nano /home/user/config/restic.conf
```

Basic configuration:

```bash
# Enable Restic service
RESTIC_ENABLED="true"

# Set backup paths
RESTIC_BACKUP_PATHS="/home/user/data:/home/user/config"

# Auto-restore on HuggingFace restart
RESTIC_AUTO_RESTORE="true"
RESTIC_RESTORE_MODE="replace"
RESTIC_FORCE_RESTORE="true"
```

### 2. Start Service

```bash
# Start Restic service
./scripts/start/restic-start.sh

# Or auto-start in Docker container
# Add Restic startup script in docker-entrypoint.sh
```

## Core Features

### Automatic Backup

- **Scheduled Backup**: Automatic backup every hour by default
- **Startup Backup**: Automatic backup when the service starts
- **Incremental Backup**: Only back up changed data to save space
- **Deduplication and Compression**: Automatic deduplication and compression for optimized storage

### Automatic Restore

- **Startup Restore**: Automatic data restore when HuggingFace restarts
- **Precise Restore**: Only restore specified directories without affecting other files
- **Multiple Modes**:
  - `safe`: Safe mode, restore to a temporary directory
  - `direct`: Direct mode, restore directly to the target location
  - `replace`: Replace mode, clear the target and then restore (recommended for HF restart)

### Data Management

- **Snapshot Management**: Keep multiple historical versions
- **Auto Cleanup**: Automatically clean up old snapshots according to the retention policy
- **Integrity Check**: Regularly check backup integrity

## Usage Scenarios

### Scenario 1: Basic HuggingFace Environment Backup

```bash
# Configuration file example
RESTIC_ENABLED="true"
RESTIC_BACKUP_PATHS="/home/user/data:/home/user/config"
RESTIC_AUTO_RESTORE="true"
RESTIC_RESTORE_MODE="replace"
RESTIC_BACKUP_INTERVAL="3600"    # 1 hour
RESTIC_KEEP_DAILY="7"            # Keep 7 days
```

**Suitable for**: Most HuggingFace deployment scenarios

### Scenario 2: Production Environment High Availability

```bash
# Use remote S3 storage
RESTIC_REPOSITORY_PATH="s3:s3.amazonaws.com/my-hf-backup"
RESTIC_BACKUP_INTERVAL="1800"    # 30 minutes
RESTIC_KEEP_DAILY="30"           # Keep 30 days
RESTIC_VERIFY_BACKUP="true"      # Verify backup
```

**Suitable for**: Critical production environments

### Scenario 3: Development and Testing Environment

```bash
# Lower-frequency backup
RESTIC_BACKUP_INTERVAL="7200"    # 2 hours
RESTIC_AUTO_RESTORE="false"      # Manual restore
RESTIC_KEEP_DAILY="3"            # Keep 3 days
```

**Suitable for**: Development and testing environments

## Command Line Tools

### Restore Script

Use the dedicated restore script:

```bash
# Show help
./scripts/utils/restic-restore.sh --help

# List available snapshots
./scripts/utils/restic-restore.sh --list

# Safe mode restore (recommended)
./scripts/utils/restic-restore.sh --mode safe

# Direct restore (fast)
./scripts/utils/restic-restore.sh --mode direct --force

# Replace mode restore (recommended for HF restart)
./scripts/utils/restic-restore.sh --mode replace --force

# Only restore config files
./scripts/utils/restic-restore.sh --include "/home/user/config"

# Restore specific snapshot
./scripts/utils/restic-restore.sh a1b2c3d4
```

### Direct Restic Commands

```bash
# Set environment variables
export RESTIC_REPOSITORY="/home/user/data/restic-repo"
export RESTIC_PASSWORD="$(cat /home/user/config/restic-password)"

# View snapshots
restic snapshots

# Manual backup (exclude the repository itself, which lives under the backup path)
restic backup /home/user/data --tag manual --exclude "$RESTIC_REPOSITORY"

# Manual restore
restic restore latest --target /tmp/restore

# Check repository
restic check

# Clean old snapshots
restic forget --keep-daily 7 --prune
```

## Configuration Options Details

### Required Configuration

```bash
RESTIC_ENABLED="true"                      # Enable service
RESTIC_REPOSITORY_PATH="/path/to/repo"     # Repository path
RESTIC_BACKUP_PATHS="/path1:/path2"        # Backup paths (colon-separated)
```

### Backup Configuration

```bash
RESTIC_BACKUP_INTERVAL="3600"              # Backup interval (seconds)
RESTIC_BACKUP_TAG="automated,hf"           # Backup tags
RESTIC_BACKUP_ON_STARTUP="true"            # Backup on startup
RESTIC_EXCLUDE_PATTERNS="*.tmp,*.log"      # Exclude patterns
```

### Restore Configuration

```bash
RESTIC_AUTO_RESTORE="true"                 # Auto restore
RESTIC_RESTORE_MODE="replace"              # Restore mode
RESTIC_FORCE_RESTORE="true"                # Force restore
RESTIC_RESTORE_INCLUDE_PATHS="/home/user"  # Only restore specified paths
```

### Retention Policy

```bash
RESTIC_KEEP_HOURLY="24"                    # Keep hourly backups for 24 hours
RESTIC_KEEP_DAILY="7"                      # Keep daily backups for 7 days
RESTIC_KEEP_WEEKLY="4"                     # Keep weekly backups for 4 weeks
RESTIC_KEEP_MONTHLY="12"                   # Keep monthly backups for 12 months
```

### Performance Optimization

```bash
RESTIC_COMPRESSION="auto"                  # Compression mode: off, auto, max
RESTIC_BACKEND_CONNECTIONS="5"             # Backend connections
RESTIC_PACK_SIZE="32"                      # Pack size in MiB
RESTIC_READ_CONCURRENCY="4"                # Read concurrency
RESTIC_NO_SCAN="false"                     # Disable backup progress estimation
```

### Advanced Options

```bash
RESTIC_EXTRA_VERIFY="true"                 # Enable extra data verification
RESTIC_CHECK_INTERVAL="259200"             # Repository check interval in seconds (3 days)
RESTIC_AUTO_REPAIR="true"                  # Auto-repair index if needed
RESTIC_CACHE_SIZE="1000"                   # Cache size in MB
RESTIC_LOG_LEVEL="INFO"                    # Log level: DEBUG, INFO, WARN, ERROR
```

## Best Practices

### For HuggingFace Environment

1. **Use Replace Mode for Auto-restore**: Set `RESTIC_RESTORE_MODE="replace"` for HF restart scenarios
2. **Enable Force Restore**: Set `RESTIC_FORCE_RESTORE="true"` to avoid prompts
3. **Optimize Backup Frequency**: Use a 1-2 hour interval for active development
4. **Monitor Storage Usage**: Adjust the retention policy based on available space

### Security Considerations

1. **Password Management**: Use strong auto-generated passwords
2. **Repository Access**: Restrict access to the backup repository
3. **Data Verification**: Enable extra verification for critical data
4. **Regular Testing**: Test restore procedures regularly

### Performance Tips

1. **Use Compression**: Enable auto compression for better storage efficiency
2. **Adjust Concurrency**: Increase read concurrency for fast storage
3. **Cache Optimization**: Set an appropriate cache size based on available memory
4. **Network Optimization**: Increase backend connections for remote repositories

## Troubleshooting

### Common Issues

| Issue | Solution |
|-------|----------|
| Service not starting | Check `RESTIC_ENABLED="true"` and the configuration file |
| Backup fails | Verify repository path and permissions |
| Restore fails | Check snapshot availability and target path permissions |
| High memory usage | Reduce pack size and backend connections |
| Slow performance | Enable no-scan mode and adjust concurrency |

### Debug Commands

```bash
# Check service status
ps aux | grep restic

# View logs
tail -f /home/user/log/restic.log

# Test repository
restic check --read-data

# View backup statistics
restic stats

# Check cache
restic cache --help
```

### Performance Monitoring

```bash
# Monitor backup progress
restic backup --verbose /path/to/data

# Check repository size
restic stats --mode raw-data

# Analyze snapshot differences
restic diff snapshot1 snapshot2
```

## Migration Guide

### From Persistence to Restic

1. Disable the persistence service: `PERSISTENCE_ENABLED="false"`
2. Enable the restic service: `RESTIC_ENABLED="true"`
3. Configure backup paths to include persistence data
4. Test backup and restore functionality
5. Update startup scripts if needed

### Backup Format Compatibility

Restic repositories are not compatible with other backup formats. For migration:

1. Create a full backup with the old system
2. Initialize a new restic repository
3. Perform an initial backup with restic
4. Verify data integrity
5. Switch to the restic service

## FAQ

**Q: Can I use both persistence and restic services?**
A: No, they are mutually exclusive. Choose one based on your needs.

**Q: How much space do backups consume?**
A: Restic uses deduplication and compression, typically reducing space by 50-80%, though the actual savings depend on how repetitive and compressible the data is.

**Q: Can I back up to cloud storage?**
A: Yes, restic supports S3, Azure, Google Cloud, and other backends.

**Q: How do I recover from a corrupted repository?**
A: Run `restic check --read-data` to locate the damage, then `restic repair index` to rebuild the index (`restic rebuild-index` on restic versions before 0.16).

**Q: Is encryption enabled by default?**
A: Yes, restic encrypts all data with AES-256.
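
Because the persistence and restic services are mutually exclusive, a startup script could guard against enabling both by mistake. The following is only a minimal sketch, not part of the shipped scripts: the function name `check_backup_services` is hypothetical, and it assumes that `PERSISTENCE_ENABLED` and `RESTIC_ENABLED` have already been loaded into the shell environment.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (illustrative only, not a shipped script).
# Assumes PERSISTENCE_ENABLED and RESTIC_ENABLED are already set in the
# environment; an unset variable is treated as "false".
# Returns 0 if at most one service is enabled, 1 if both are enabled.
check_backup_services() {
    local persistence="${PERSISTENCE_ENABLED:-false}"
    local restic="${RESTIC_ENABLED:-false}"

    if [ "$persistence" = "true" ] && [ "$restic" = "true" ]; then
        echo "error: PERSISTENCE_ENABLED and RESTIC_ENABLED are both true; enable only one" >&2
        return 1
    fi
    return 0
}
```

A script might call `check_backup_services || exit 1` before starting either service, so that a conflicting configuration stops the startup early instead of running two backup mechanisms against the same data.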