Instructions to use my-ai-stack/Stack-2-9-finetuned with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use my-ai-stack/Stack-2-9-finetuned with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use my-ai-stack/Stack-2-9-finetuned with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "my-ai-stack/Stack-2-9-finetuned"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/my-ai-stack/Stack-2-9-finetuned

SGLang

How to use my-ai-stack/Stack-2-9-finetuned with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "my-ai-stack/Stack-2-9-finetuned" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "my-ai-stack/Stack-2-9-finetuned" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use my-ai-stack/Stack-2-9-finetuned with Docker Model Runner:
```
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
```

Stack-2-9-finetuned

File size: 7,667 Bytes

7f7972d

# OpenRouter Submission Checklist

**Project:** OpenClaw + Voice Components  
**Date:** 2025-04-01 (assessment date)  
**Status:** NOT READY FOR SUBMISSION  
**Reviewer:** Subagent Checklist Agent

---

## Executive Summary

**Recommendation: NO-GO**

The workspace contains:
- OpenClaw: A TypeScript-based AI assistant CLI (not a model)
- Voice cloning Python prototypes (not production-ready)
- Strategic plans for integration

**Critical Issue**: There is no standalone model file or inference endpoint ready for OpenRouter submission. OpenRouter expects an OpenAI-compatible API serving a specific model, not a full application codebase.

---

## Technical Requirements

| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 1 | Model uploaded to Hugging Face (or accessible) | ❌ **BLOCKER** | No model file exists. OpenClaw is an application, not a model. Voice cloning code exists but no trained model artifact uploaded to HF. |
| 2 | API endpoint OpenAI-compatible and tested | ❌ **BLOCKER** | No API endpoint. Need to create a REST API that accepts `/v1/chat/completions` format. Current components are CLI tools and Python scripts. |
| 3 | Rate limits documented and enforced | ❌ **BLOCKER** | No rate limiting implemented. Must add token-based rate limiting (e.g., 100 requests/minute). |
| 4 | Error handling proper | ❌ **BLOCKER** | No standardized error responses for API. Need proper HTTP status codes, error messages in OpenAI format. |
| 5 | Monitoring/logging in place | ❌ **BLOCKER** | No logging infrastructure. Need structured logging, request/response tracking, error monitoring (Sentry/datadog). |

---

## Benchmarks

| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 6 | HumanEval score published | ❌ **BLOCKER** | No HumanEval evaluation run. Must run HumanEval benchmark (at least pass@1) and document results. |
| 7 | MBPP score published | ❌ **BLOCKER** | No MBPP evaluation. Must run MBPP benchmark and report scores. |
| 8 | Tool use accuracy documented | ❌ **BLOCKER** | No tooluse evaluation. If claiming tool capabilities, need accuracy metrics on tool calling benchmarks. |
| 9 | Throughput/latency numbers | ❌ **BLOCKER** | No performance testing. Need tokens/sec, p50/p99 latency, time-to-first-token metrics. |
| 10 | Context length capability verified | ❌ **BLOCKER** | Context window not characterized. Need to document max context (e.g., 128k, 256k) and test with long prompts. |

---

## Documentation

| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 11 | README up-to-date with real numbers | ⚠️ **PARTIAL** | README.md exists for voice clone project but lacks API details, pricing, benchmarks. Needs major updates for model submission. |
| 12 | Model card complete | ❌ **BLOCKER** | No model card (model-card.yaml or README section). Must follow HF model card template: model description, intended use, limitations, training data, eval results. |
| 13 | Safety/ethics section filled | ❌ **BLOCKER** | No safety documentation. Must address misuse risks (voice cloning ethics), mitigations, content policy. |
| 14 | Pricing clear | ❌ **BLOCKER** | No pricing defined. OpenRouter pricing must be set (free tier? per token? subscription?). |
| 15 | Contact info valid | ❌ **BLOCKER** | Contact info not specified. Need maintainer email, support channel, SLA contact. |

---

## Legal

| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 16 | License (Apache 2.0) is clear | ⚠️ **PARTIAL** | LICENSE file exists (MIT for voice clone). Need Apache 2.0 for OpenRouter submission (or other permissive license). |
| 17 | Training data sources documented | ❌ **BLOCKER** | No documentation of training data. Must list datasets used, sources, licenses. Voice cloning uses Coqui models - need attribution. |
| 18 | No copyright infringement (code under permissive licenses) | ⚠️ **NEEDS REVIEW** | Code includes third-party dependencies. Need audit of all licenses (TypeScript deps in package.json, Python deps in requirements.txt). |
| 19 | Third-party attributions included | ❌ **BLOCKER** | No attributions file. Must include notices for Coqui TTS, HF Transformers, etc. |

---

## Operational

| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 20 | Support process defined | ❌ **BLOCKER** | No support plan. Need: how users report issues, response time SLA, escalation path. |
| 21 | SLA commitment realistic | ❌ **BLOCKER** | No SLA defined. Must commit to uptime (e.g., 99.9%), support response times, incident resolution. |
| 22 | Incident response plan | ❌ **BLOCKER** | No incident response process. Need runbooks for outages, rollback procedures, communication channels. |
| 23 | Monitoring dashboard (Grafana) ready | ❌ **BLOCKER** | No monitoring stack. Need metrics collection (Prometheus), dashboards (Grafana), alerts (PagerDuty/email). |

---

## Blockers Summary

### Critical Path Blockers (Must Fix Before Submission)

1. **No Model Artifact**: No `.gguf`, `.safetensors`, or other model file prepared. Must train/fine-tune a model or use existing base (e.g., CodeLlama) and document modifications.

2. **No API Endpoint**: OpenRouter requires an OpenAI-compatible API. Must build a REST server (FastAPI/Express) that wraps model inference.

3. **Missing Benchmarks**: HumanEval and MBPP scores are mandatory for OpenRouter listing. Must evaluate and publish numbers.

4. **No Model Card**: Required by OpenRouter for transparency. Must create detailed documentation.

5. **No Pricing**: Must decide free/paid tiers and set token prices.

6. **No Monitoring**: Production API requires observability stack.

7. **No SLA/Support**: Commitments required for reliability.

---

## Go/No-Go Recommendation

**NO-GO** ❌

### Reason

The project is **not a model submission** but a **tooling codebase**. To be eligible for OpenRouter:

1. **Extract a model** from OpenClaw or fine-tune a base model (e.g., CodeLlama-7B) on your codebase to create "OpenClaw-7B"
2. **Package as inference API** with OpenAI compatibility
3. **Complete all 23 checklist items** (currently only 1-2 partial, rest are blockers)
4. **Estimated effort**: 4-8 weeks minimum (benchmarking, API development, documentation, monitoring setup)

### Suggested Path Forward

**Phase 1: Model Preparation (2 weeks)**
- Fine-tune CodeLlama or similar on OpenClaw codebase
- Export model to GGUF/Safetensors
- Upload to Hugging Face
- Run HumanEval/MBPP benchmarks

**Phase 2: API Development (1-2 weeks)**
- Build FastAPI server with `/v1/chat/completions`
- Implement rate limiting, error handling
- Test with OpenAI client libraries
- Deploy to cloud (Railway/Render/Cloud Run)

**Phase 3: Documentation & Compliance (1 week)**
- Write model card
- Define pricing (start free, then $X/1M tokens)
- Create README with examples
- Add safety/ethics section

**Phase 4: Monitoring & Ops (1 week)**
- Set up logging (Sentry)
- Add metrics (Prometheus + Grafana)
- Create incident response playbook
- Define support process (GitHub Issues, Discord)

**Phase 5: Submission**
- Submit to OpenRouter with all required fields
- Wait for review (typically 1-3 business days)

---

## Conclusion

**Do not submit yet.** The project lacks a proper model artifact, API endpoint, benchmarks, and operational infrastructure. Focus on creating a standalone model from the OpenClaw codebase first, then build the submission package.

---

**Checklist completed by:** Subagent (Final Checklist Agent)  
**Next steps:** Initiate Phase 1 (model fine-tuning) and Phase 2 (API wrapper) in parallel.