zeroclaw / PLAN.md

Fix this — don't let it happen again:

[WARN] agent: Model failed, checking fallbacks {model=stepfun/step-3.5-flash:free, error=API request failed:
  Status: 400
  Body:   {"error":{"message":"This endpoint's maximum context length is 256000 tokens. However, you requested about 256379 tokens (246103 of text input, 2084 of tool input, 8192 in the output). Please reduce the length of either one, or use the \"middle-out\" transform to compress your prompt automatically.","code":400,"metadata":{"provider_name":null}}}, has_more=true}

[ERROR] telegram: HTML parse failed, falling back to plain text {error=telego: sendMessage: api: 400 "Bad Request: message is too long"}

IMPLEMENT THIS: Auto-Chunking for Telegram

  • Add a protocol to AGENT.md that automatically splits long messages (>4000 chars) into multiple parts with a [Part X/Y] format.
  • Delay 1-2 seconds between parts to avoid rate limits.
  • Fall back to plain text if HTML parsing fails.
  • Implement unit tests to verify the splitting logic.
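The splitting protocol above can be sketched as follows. This is a minimal illustration, not the actual AGENT.md implementation: `maxPartLen`, `splitMessage`, and `sendChunked` are hypothetical names, and the 4000-char budget leaves headroom under Telegram's 4096-char hard limit for the `[Part X/Y]` prefix.

```go
package main

import (
	"fmt"
	"time"
)

// maxPartLen stays under Telegram's 4096-char limit to leave room
// for the "[Part X/Y] " prefix.
const maxPartLen = 4000

// splitMessage breaks text into chunks of at most maxPartLen runes and
// prefixes each with "[Part X/Y]".
func splitMessage(text string) []string {
	runes := []rune(text)
	if len(runes) <= maxPartLen {
		return []string{text}
	}
	var chunks [][]rune
	for len(runes) > 0 {
		n := maxPartLen
		if n > len(runes) {
			n = len(runes)
		}
		chunks = append(chunks, runes[:n])
		runes = runes[n:]
	}
	parts := make([]string, len(chunks))
	for i, c := range chunks {
		parts[i] = fmt.Sprintf("[Part %d/%d] %s", i+1, len(chunks), string(c))
	}
	return parts
}

// sendChunked sends each part with a delay between parts to avoid
// Telegram rate limits.
func sendChunked(text string, send func(string) error) error {
	for i, part := range splitMessage(text) {
		if i > 0 {
			time.Sleep(1500 * time.Millisecond)
		}
		if err := send(part); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	long := make([]byte, 9001)
	for i := range long {
		long[i] = 'a'
	}
	fmt.Println(len(splitMessage(string(long)))) // 3
}
```

Splitting on rune boundaries (not bytes) avoids corrupting multi-byte UTF-8 characters at chunk edges.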

Make sure dataset sync works — it is currently broken.

  • /Users/syamsulbahri/Documents/PROJECT/picoclaw/scripts/sync_dataset.sh (Refactored to be more robust, added rsync support, immediate sync on start)

IMPLEMENT THIS: Adaptive Iteration Limit

  • Currently fixed at 50 → make it dynamic via config (MaxToolIterations)
  • Log a warning when the limit is reached (a simple adaptive check)
  • Impact: prevents wasted iterations, improves success rate
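The config-driven limit with a warning on exhaustion can be sketched like this. The `Config` struct and `runLoop` are illustrative stand-ins for the agent loop, assuming only the `MaxToolIterations` field named above.

```go
package main

import (
	"fmt"
	"log"
)

// Config carries the dynamic limit instead of a hard-coded 50.
type Config struct {
	MaxToolIterations int
}

// runLoop calls step() until it reports completion or the configured
// limit is hit, logging a warning at the limit so the cap can be tuned.
func runLoop(cfg Config, step func(i int) (done bool)) (iterations int, hitLimit bool) {
	limit := cfg.MaxToolIterations
	if limit <= 0 {
		limit = 50 // fall back to the old fixed default
	}
	for i := 0; i < limit; i++ {
		iterations++
		if step(i) {
			return iterations, false
		}
	}
	log.Printf("[WARN] agent: reached MaxToolIterations=%d without finishing", limit)
	return iterations, true
}

func main() {
	n, hit := runLoop(Config{MaxToolIterations: 10}, func(i int) bool { return i == 3 })
	fmt.Println(n, hit) // 4 false
}
```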

Tool Caching

  • Temporarily cache the results of idempotent tools (read_file, list_dir)
  • TTL-based cache (60s)
  • Impact: saves LLM tokens + faster responses for repeated queries

Memory Management

  • Go has no manual allocator → optimize allocation patterns instead (Implemented BufferPool in utils/pool.go and used in sandbox.go)
  • Reduce GC pressure (the GC cannot be turned off, but heap churn can still be cut)
  • Impact: Lower memory footprint, better cache locality
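The usual Go pattern for this is a `sync.Pool` of reusable buffers. This is a sketch of the BufferPool idea; the actual code in utils/pool.go may differ.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool recycles bytes.Buffers so repeated tool-output handling does
// not allocate a fresh buffer (and GC work) per call.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func getBuffer() *bytes.Buffer { return bufPool.Get().(*bytes.Buffer) }

func putBuffer(b *bytes.Buffer) {
	b.Reset() // clear contents but keep the backing array for reuse
	bufPool.Put(b)
}

// render shows the borrow/return pattern around a formatting job.
func render(name string) string {
	b := getBuffer()
	defer putBuffer(b)
	fmt.Fprintf(b, "tool=%s done", name)
	return b.String()
}

func main() {
	fmt.Println(render("exec")) // tool=exec done
}
```

The `Reset` before `Put` is the important detail: it keeps the grown backing array (lower churn) while preventing stale data from leaking between borrowers.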

Concurrency & Parallelism

  • Tool execution is currently serial → independent tools could run in parallel (e.g. websearch + db-query + file-read)
  • Batch execution for safe tools
  • Impact: 50-70% lower latency for multi-tool tasks
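The fan-out can be sketched with a `sync.WaitGroup`; `runParallel` is an illustrative name, and deciding which tools are actually independent (the hard part) is out of scope here.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runParallel executes independent tool calls concurrently and returns
// their results in input order.
func runParallel(tools []func() string) []string {
	results := make([]string, len(tools))
	var wg sync.WaitGroup
	for i, tool := range tools {
		wg.Add(1)
		go func(i int, tool func() string) {
			defer wg.Done()
			// Each goroutine writes only its own slot, so no lock is needed.
			results[i] = tool()
		}(i, tool)
	}
	wg.Wait()
	return results
}

func main() {
	start := time.Now()
	out := runParallel([]func() string{
		func() string { time.Sleep(50 * time.Millisecond); return "websearch" },
		func() string { time.Sleep(50 * time.Millisecond); return "db-query" },
		func() string { time.Sleep(50 * time.Millisecond); return "file-read" },
	})
	// Total wall time is ~50ms instead of ~150ms serial.
	fmt.Println(out, time.Since(start) < 140*time.Millisecond)
}
```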

Streaming Tool Output

  • Tools that produce large output (webfetch, exec) could stream to the LLM chunk-by-chunk (Implemented truncation via LimitedWriter and Telegram chunking)
  • Avoid filling the buffer before processing
  • Impact: Faster perceived response, lower peak memory

Tool Pre-loading & Warmup

  • Pre-load frequently used tools (memory, db-manager) at startup (Implicit in NewExecTool and NewCronTool)
  • Pool persistent connections (DB, Docker daemon)
  • Impact: Lower first-call latency

Better Error Recovery

  • Retry logic with backoff for transient failures (Implemented in agent/loop.go)
  • Circuit breaker for tools that fail frequently
  • Fallback strategies (e.g. websearch fails → try webfetch directly)
  • Impact: Higher reliability, graceful degradation

Observability

  • Add structured logging (JSON) with context (tool name, duration, memory delta)
  • Metrics endpoint (Prometheus) untuk: tool latency, success rate, iteration count, cache stats, sandbox execs
  • Impact: Easier debugging, performance tuning

Configuration Hot-reload

  • Reload agent config without a restart (max iterations, tool timeouts, etc.)
  • Impact: Operational flexibility

Sandboxing & Security

  • Run tools in a restricted namespace (seccomp, namespaces) for isolation (implemented via Sandbox struct & ResourceLimits)
  • Resource limits per tool (CPU time, memory)
  • Impact: Prevent runaway tools, security hardening