Add nexus-edge-deployer agent skill v2.1
Browse files
README.md
ADDED
|
@@ -0,0 +1,43 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
tags:
|
| 4 |
+
- agent-skill
|
| 5 |
+
- infrastructure
|
| 6 |
+
- enterprise
|
| 7 |
+
- nexus-ai
|
| 8 |
+
language:
|
| 9 |
+
- en
|
| 10 |
+
- es
|
| 11 |
+
library_name: agent-skills
|
| 12 |
+
pipeline_tag: text-generation
|
| 13 |
+
---
|
| 14 |
+
|
| 15 |
+
# Nexus Edge Deployer
|
| 16 |
+
|
| 17 |
+
Deploy 1-bit AI models on VPS — AaaS with 98% margins
|
| 18 |
+
|
| 19 |
+
## Agent Skills Standard
|
| 20 |
+
|
| 21 |
+
This skill follows the [Agent Skills Standard](https://agentskills.io) and is compatible
|
| 22 |
+
with any LLM agent framework (Claude, ChatGPT, LangChain, AutoGen, CrewAI).
|
| 23 |
+
|
| 24 |
+
## Usage
|
| 25 |
+
|
| 26 |
+
Place the SKILL.md file in `.claude/skills/nexus-edge-deployer/SKILL.md` and the agent
|
| 27 |
+
will automatically discover and activate the skill.
|
| 28 |
+
|
| 29 |
+
## Pricing
|
| 30 |
+
|
| 31 |
+
$2.00 per execution (outcome-based).
|
| 32 |
+
|
| 33 |
+
## Publisher
|
| 34 |
+
|
| 35 |
+
**NEXUS AI Corp** — 68 AI agents, 23 departments, enterprise-grade.
|
| 36 |
+
|
| 37 |
+
## Category
|
| 38 |
+
|
| 39 |
+
infrastructure
|
| 40 |
+
|
| 41 |
+
## Certification
|
| 42 |
+
|
| 43 |
+
Gold certified (agentskills.io certification program).
|
SKILL.md
ADDED
|
@@ -0,0 +1,44 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
name: nexus-edge-deployer
|
| 3 |
+
description: "Deploy 1-bit quantized AI models on cheap VPS for Agent-as-a-Service. Calculate unit economics, provision Hetzner servers, configure Ollama/llama.cpp inference, and manage multi-tenant agent fleets with 98% margins."
|
| 4 |
+
license: proprietary
|
| 5 |
+
compatibility: "NEXUS Ecosystem 1.0"
|
| 6 |
+
metadata:
|
| 7 |
+
department: devops
|
| 8 |
+
agents: [edge-deploy, devops-infra]
|
| 9 |
+
price_per_execution: "$2.00"
|
| 10 |
+
nexus_version: "1.0"
|
| 11 |
+
version: "1.0.0"
|
| 12 |
+
author: "NEXUS AI Corp"
|
| 13 |
+
allowed-tools: web-search web-fetch filesystem
|
| 14 |
+
---
|
| 15 |
+
|
| 16 |
+
# Edge AI Deployer
|
| 17 |
+
|
| 18 |
+
Enterprise-grade edge deployment for 1-bit quantized models (PrismML Bonsai, Microsoft BitNet) on minimal infrastructure.
|
| 19 |
+
|
| 20 |
+
## Capabilities
|
| 21 |
+
- Deploy Bonsai 8B (1.15GB), 4B (0.57GB), and 1.7B (0.24GB) models on VPS
|
| 22 |
+
- Calculate AaaS unit economics: cost per agent, margin per VPS, break-even analysis
|
| 23 |
+
- Configure Ollama or llama.cpp for multi-tenant inference serving
|
| 24 |
+
- Auto-provision Hetzner CX22 (EUR 3.79/mo) via Cloud API
|
| 25 |
+
- Monitor fleet resource usage: RAM, CPU, tokens/sec per agent
|
| 26 |
+
- GDPR/HIPAA compliance via local inference (no data leaves server)
|
| 27 |
+
- Scale from 1 to 100+ agents across VPS fleet
|
| 28 |
+
|
| 29 |
+
## Workflow
|
| 30 |
+
1. Assess client requirements: model quality, latency, privacy, platform
|
| 31 |
+
2. Select optimal model tier (8B for quality, 4B for balance, 1.7B for mobile)
|
| 32 |
+
3. Provision VPS via Hetzner API with cloud-init (Ollama + model pre-loaded)
|
| 33 |
+
4. Deploy agent with client-specific persona and capabilities
|
| 34 |
+
5. Benchmark inference quality against full-precision baseline
|
| 35 |
+
6. Configure monitoring, alerting, and auto-scaling rules
|
| 36 |
+
7. Generate unit economics report: revenue, cost, margin, projections
|
| 37 |
+
|
| 38 |
+
## Guidelines
|
| 39 |
+
- Always benchmark 1-bit model quality before deploying to production
|
| 40 |
+
- Maximum 3 Bonsai 8B agents per 4GB VPS (reserve 0.5GB for OS)
|
| 41 |
+
- Maintain cloud API fallback for quality-critical tasks
|
| 42 |
+
- Report cost savings to finance department monthly
|
| 43 |
+
- Authenticate all inference endpoints — never expose publicly
|
| 44 |
+
- Use GGUF format for Ollama compatibility
|