
PicoClaw Dataset Sync Daemon

Archon v2.1 - Branch Support Enabled

Synchronizes the local workspace with a remote dataset repository (on a selectable branch) for the NullClaw ecosystem.

🎯 Features

  • Branch Selection: Syncs against a custom branch (default: main)
  • Auto-Retry: Up to 3 attempts with exponential backoff
  • Health Checks: Disk space monitoring (minimum 1 GB free)
  • Graceful Shutdown: SIGTERM/SIGINT handling
  • Concurrent Protection: State file locking
  • Security Hardening: Prevents accidental pushes to upstream
  • Structured Logging: Timestamped, leveled entries written to a log file
  • Backup Management: Automatic cleanup of backups older than 7 days
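
The Auto-Retry feature can be pictured with a small helper. This is a minimal sketch, not the code in sync_dataset.sh: the function name retry_with_backoff and the RETRY_BASE_DELAY variable are assumptions made for illustration, and only MAX_RETRIES (default 3) comes from this README.

```shell
# Sketch only: retry a command, doubling the wait after each failure.
# MAX_RETRIES is documented in this README; RETRY_BASE_DELAY is a hypothetical knob.
MAX_RETRIES="${MAX_RETRIES:-3}"
RETRY_BASE_DELAY="${RETRY_BASE_DELAY:-2}"

retry_with_backoff() {
    local attempt=1
    local delay="$RETRY_BASE_DELAY"
    while true; do
        # Run the command given as arguments; stop at the first success.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$MAX_RETRIES" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))   # exponential backoff: 2s, 4s, 8s, ...
    done
}
```

With the values above, a failing command is tried three times with waits of 2 and 4 seconds in between. A call such as `retry_with_backoff git pull origin "$DATASET_BRANCH"` shows the intended usage.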

🚀 Quick Deploy to Hugging Face Space

Environment Variables (HF Space Settings → Variables)

# Required
DATASET_REPO=https://github.com/personalbotai/picoclaw-memory.git
DATASET_BRANCH=acron-memory  # or main, develop, etc.
GITHUB_TOKEN=ghp_xxxxxxxxxxxx  # For private repos or to avoid rate limits

# Optional
SYNC_INTERVAL=300  # Seconds (default: 300 = 5 minutes)
PICOCLAW_HOME=/data  # Path in the HF Space (default: ~/.picoclaw)

File Structure in the HF Space

/
├── sync_dataset.sh    # Main daemon (executable)
├── app.py             # Flask monitoring UI (optional)
├── requirements.txt   # Python dependencies
├── README.md          # Documentation
└── .gitignore         # Git ignore rules

Starting the Daemon

# Make executable
chmod +x sync_dataset.sh

# Run in background (HF Space startup)
nohup ./sync_dataset.sh > /dev/null 2>&1 &
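
Because the daemon runs in the background, it is stopped by sending it a signal. The sketch below shows one common way a shell loop can honor SIGTERM/SIGINT and exit between cycles. All function and variable names here are illustrative assumptions, not taken from sync_dataset.sh.

```shell
# Sketch only: a signal sets a flag, and the loop exits after the current cycle.
SHUTDOWN_REQUESTED=0

request_shutdown() {
    # Do no work inside the handler; just record the request.
    SHUTDOWN_REQUESTED=1
}

# Sleep in 1-second slices so a pending shutdown is noticed quickly.
interruptible_sleep() {
    local remaining="$1"
    while [ "$remaining" -gt 0 ] && [ "$SHUTDOWN_REQUESTED" -eq 0 ]; do
        sleep 1
        remaining=$((remaining - 1))
    done
}

# run_daemon CYCLE_COMMAND INTERVAL_SECONDS
run_daemon() {
    local cycle_cmd="$1"
    local interval="$2"
    trap request_shutdown TERM INT
    while [ "$SHUTDOWN_REQUESTED" -eq 0 ]; do
        "$cycle_cmd"                      # one full sync cycle
        interruptible_sleep "$interval"
    done
    echo "shutdown requested, exiting cleanly"
}
```

Saving `$!` right after the nohup line gives the PID to pass to `kill -TERM` later.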

Monitoring

Log file: ~/.picoclaw/sync.log
State file: ~/.picoclaw/sync.state

🔧 Configuration

| Variable | Default | Description |
|---|---|---|
| DATASET_REPO | https://github.com/personalbotai/picoclaw-memory.git | Git repository URL |
| DATASET_BRANCH | main | Branch to sync |
| SYNC_INTERVAL | 300 | Sync interval (seconds) |
| MAX_RETRIES | 3 | Max retry attempts |
| BACKUP_RETENTION_DAYS | 7 | Days to keep backups before cleanup |
| MIN_DISK_FREE_MB | 1024 | Minimum free disk space (MB) |
| PICOCLAW_HOME | ~/.picoclaw | Base directory |
| GITHUB_TOKEN | (empty) | GitHub token for authentication |
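
As a rough illustration of how these settings might be consumed, the sketch below applies the documented defaults with shell parameter expansion and checks free disk space against MIN_DISK_FREE_MB. The helper name has_free_disk_mb is hypothetical, and the real script may implement both steps differently.

```shell
# Sketch only: defaults mirror the configuration list above; exported values win.
DATASET_REPO="${DATASET_REPO:-https://github.com/personalbotai/picoclaw-memory.git}"
DATASET_BRANCH="${DATASET_BRANCH:-main}"
SYNC_INTERVAL="${SYNC_INTERVAL:-300}"
MAX_RETRIES="${MAX_RETRIES:-3}"
BACKUP_RETENTION_DAYS="${BACKUP_RETENTION_DAYS:-7}"
MIN_DISK_FREE_MB="${MIN_DISK_FREE_MB:-1024}"
PICOCLAW_HOME="${PICOCLAW_HOME:-$HOME/.picoclaw}"
GITHUB_TOKEN="${GITHUB_TOKEN:-}"

# Hypothetical helper: succeed only if the filesystem holding DIR has at least MIN_MB free.
# has_free_disk_mb DIR MIN_MB
has_free_disk_mb() {
    local dir="$1"
    local min_mb="$2"
    local free_kb
    # df -Pk prints POSIX-format output in 1 KB blocks; column 4 is "Available".
    free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$free_kb" ] || return 1
    [ "$((free_kb / 1024))" -ge "$min_mb" ]
}
```

A health check could then be written as `has_free_disk_mb "$PICOCLAW_HOME" "$MIN_DISK_FREE_MB" || echo "low disk space" >&2`.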

🧪 Testing

# Syntax check only (bash -n parses the script without running it)
bash -n sync_dataset.sh

# Test execution (1 cycle only)
DATASET_BRANCH=acron-memory \
PICOCLAW_HOME=/tmp/picoclaw-test \
./sync_dataset.sh

📊 Log Format

[2025-12-28 05:46:00] [INFO] === PicoClaw Dataset Sync Daemon v2.1 ===
[2025-12-28 05:46:00] [INFO] Branch: acron-memory
[2025-12-28 05:46:02] [INFO] Initial sync completed
[2025-12-28 05:51:00] [INFO] Sync cycle completed
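
Lines in this format can be produced by a very small logging helper. The following is a sketch under assumptions: the function name log_msg and the LOG_FILE variable are invented for illustration, and levels other than INFO do not appear in the sample above.

```shell
# Sketch only: append "[timestamp] [LEVEL] message" lines to the log file.
# The default path follows the log location documented under Monitoring.
LOG_FILE="${LOG_FILE:-${PICOCLAW_HOME:-$HOME/.picoclaw}/sync.log}"

# log_msg LEVEL MESSAGE...
log_msg() {
    local level="$1"
    shift
    # Create the log directory on first use.
    mkdir -p "$(dirname "$LOG_FILE")"
    printf '[%s] [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$*" >> "$LOG_FILE"
}
```

For example, `log_msg INFO "Sync cycle completed"` appends one line shaped like the last sample entry.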

⚠️ Known Limitations

  1. Single-threaded: Git operations run sequentially
  2. No metrics endpoint: Nothing is exposed for Prometheus (optional extra if needed)
  3. No email alerts: No notifications are sent (optional extra if needed)

🛠️ Development

Build & Test

# Lint
shellcheck sync_dataset.sh

# Test with branch switching
DATASET_BRANCH=main ./sync_dataset.sh

Branch Support

The script switches branches automatically:

  • Clones with --branch $DATASET_BRANCH
  • Checks out the target branch if the current one differs
  • Pushes to the same branch
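
The three steps above can be sketched with plain git commands. This is an illustration only: the function names, the argument order, and the use of a remote called origin are assumptions, and the real sync_dataset.sh may differ.

```shell
# Sketch only: make sure WORKDIR holds a checkout of BRANCH.
# ensure_branch_checkout REPO_URL BRANCH WORKDIR
ensure_branch_checkout() {
    local repo_url="$1"
    local branch="$2"
    local workdir="$3"
    local current
    if [ ! -d "$workdir/.git" ]; then
        # Fresh workspace: clone the requested branch directly.
        git clone --branch "$branch" "$repo_url" "$workdir"
        return
    fi
    # Existing workspace: switch only when the current branch differs.
    current=$(git -C "$workdir" rev-parse --abbrev-ref HEAD)
    if [ "$current" != "$branch" ]; then
        git -C "$workdir" fetch origin "$branch"
        git -C "$workdir" checkout "$branch"
    fi
}

# Push local commits back to the same branch they were synced from.
# push_branch BRANCH WORKDIR
push_branch() {
    git -C "$2" push origin "$1"
}
```

A sync cycle might call `ensure_branch_checkout "$DATASET_REPO" "$DATASET_BRANCH" "$PICOCLAW_HOME/workspace"`, where the workspace path is a placeholder.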

📄 License

Archon Standard - Build for Eternity


Archon v2.1 | NullClaw Runtime