Text Generation
Transformers
English
qwen2
code-generation
python
fine-tuning
Qwen
tools
agent-framework
multi-agent
conversational
Eval Results (legacy)
Instructions to use my-ai-stack/Stack-2-9-finetuned with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use my-ai-stack/Stack-2-9-finetuned with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="my-ai-stack/Stack-2-9-finetuned") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned") model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use my-ai-stack/Stack-2-9-finetuned with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "my-ai-stack/Stack-2-9-finetuned" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "my-ai-stack/Stack-2-9-finetuned", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
- SGLang
How to use my-ai-stack/Stack-2-9-finetuned with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "my-ai-stack/Stack-2-9-finetuned" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "my-ai-stack/Stack-2-9-finetuned", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "my-ai-stack/Stack-2-9-finetuned" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "my-ai-stack/Stack-2-9-finetuned", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use my-ai-stack/Stack-2-9-finetuned with Docker Model Runner:
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
File size: 7,667 Bytes
7f7972d | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 | # OpenRouter Submission Checklist
**Project:** OpenClaw + Voice Components
**Date:** 2025-04-01 (assessment date)
**Status:** NOT READY FOR SUBMISSION
**Reviewer:** Subagent Checklist Agent
---
## Executive Summary
**Recommendation: NO-GO**
The workspace contains:
- OpenClaw: A TypeScript-based AI assistant CLI (not a model)
- Voice cloning Python prototypes (not production-ready)
- Strategic plans for integration
**Critical Issue**: There is no standalone model file or inference endpoint ready for OpenRouter submission. OpenRouter expects an OpenAI-compatible API serving a specific model, not a full application codebase.
---
## Technical Requirements
| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 1 | Model uploaded to Hugging Face (or accessible) | β **BLOCKER** | No model file exists. OpenClaw is an application, not a model. Voice cloning code exists but no trained model artifact uploaded to HF. |
| 2 | API endpoint OpenAI-compatible and tested | β **BLOCKER** | No API endpoint. Need to create a REST API that accepts `/v1/chat/completions` format. Current components are CLI tools and Python scripts. |
| 3 | Rate limits documented and enforced | β **BLOCKER** | No rate limiting implemented. Must add token-based rate limiting (e.g., 100 requests/minute). |
| 4 | Error handling proper | β **BLOCKER** | No standardized error responses for API. Need proper HTTP status codes, error messages in OpenAI format. |
| 5 | Monitoring/logging in place | β **BLOCKER** | No logging infrastructure. Need structured logging, request/response tracking, error monitoring (Sentry/datadog). |
---
## Benchmarks
| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 6 | HumanEval score published | β **BLOCKER** | No HumanEval evaluation run. Must run HumanEval benchmark (at least pass@1) and document results. |
| 7 | MBPP score published | β **BLOCKER** | No MBPP evaluation. Must run MBPP benchmark and report scores. |
| 8 | Tool use accuracy documented | β **BLOCKER** | No tooluse evaluation. If claiming tool capabilities, need accuracy metrics on tool calling benchmarks. |
| 9 | Throughput/latency numbers | β **BLOCKER** | No performance testing. Need tokens/sec, p50/p99 latency, time-to-first-token metrics. |
| 10 | Context length capability verified | β **BLOCKER** | Context window not characterized. Need to document max context (e.g., 128k, 256k) and test with long prompts. |
---
## Documentation
| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 11 | README up-to-date with real numbers | β οΈ **PARTIAL** | README.md exists for voice clone project but lacks API details, pricing, benchmarks. Needs major updates for model submission. |
| 12 | Model card complete | β **BLOCKER** | No model card (model-card.yaml or README section). Must follow HF model card template: model description, intended use, limitations, training data, eval results. |
| 13 | Safety/ethics section filled | β **BLOCKER** | No safety documentation. Must address misuse risks (voice cloning ethics), mitigations, content policy. |
| 14 | Pricing clear | β **BLOCKER** | No pricing defined. OpenRouter pricing must be set (free tier? per token? subscription?). |
| 15 | Contact info valid | β **BLOCKER** | Contact info not specified. Need maintainer email, support channel, SLA contact. |
---
## Legal
| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 16 | License (Apache 2.0) is clear | β οΈ **PARTIAL** | LICENSE file exists (MIT for voice clone). Need Apache 2.0 for OpenRouter submission (or other permissive license). |
| 17 | Training data sources documented | β **BLOCKER** | No documentation of training data. Must list datasets used, sources, licenses. Voice cloning uses Coqui models - need attribution. |
| 18 | No copyright infringement (code under permissive licenses) | β οΈ **NEEDS REVIEW** | Code includes third-party dependencies. Need audit of all licenses (TypeScript deps in package.json, Python deps in requirements.txt). |
| 19 | Third-party attributions included | β **BLOCKER** | No attributions file. Must include notices for Coqui TTS, HF Transformers, etc. |
---
## Operational
| # | Requirement | Status | Notes |
|---|-------------|--------|-------|
| 20 | Support process defined | β **BLOCKER** | No support plan. Need: how users report issues, response time SLA, escalation path. |
| 21 | SLA commitment realistic | β **BLOCKER** | No SLA defined. Must commit to uptime (e.g., 99.9%), support response times, incident resolution. |
| 22 | Incident response plan | β **BLOCKER** | No incident response process. Need runbooks for outages, rollback procedures, communication channels. |
| 23 | Monitoring dashboard (Grafana) ready | β **BLOCKER** | No monitoring stack. Need metrics collection (Prometheus), dashboards (Grafana), alerts (PagerDuty/email). |
---
## Blockers Summary
### Critical Path Blockers (Must Fix Before Submission)
1. **No Model Artifact**: No `.gguf`, `.safetensors`, or other model file prepared. Must train/fine-tune a model or use existing base (e.g., CodeLlama) and document modifications.
2. **No API Endpoint**: OpenRouter requires an OpenAI-compatible API. Must build a REST server (FastAPI/Express) that wraps model inference.
3. **Missing Benchmarks**: HumanEval and MBPP scores are mandatory for OpenRouter listing. Must evaluate and publish numbers.
4. **No Model Card**: Required by OpenRouter for transparency. Must create detailed documentation.
5. **No Pricing**: Must decide free/paid tiers and set token prices.
6. **No Monitoring**: Production API requires observability stack.
7. **No SLA/Support**: Commitments required for reliability.
---
## Go/No-Go Recommendation
**NO-GO** β
### Reason
The project is **not a model submission** but a **tooling codebase**. To be eligible for OpenRouter:
1. **Extract a model** from OpenClaw or fine-tune a base model (e.g., CodeLlama-7B) on your codebase to create "OpenClaw-7B"
2. **Package as inference API** with OpenAI compatibility
3. **Complete all 23 checklist items** (currently only 1-2 partial, rest are blockers)
4. **Estimated effort**: 4-8 weeks minimum (benchmarking, API development, documentation, monitoring setup)
### Suggested Path Forward
**Phase 1: Model Preparation (2 weeks)**
- Fine-tune CodeLlama or similar on OpenClaw codebase
- Export model to GGUF/Safetensors
- Upload to Hugging Face
- Run HumanEval/MBPP benchmarks
**Phase 2: API Development (1-2 weeks)**
- Build FastAPI server with `/v1/chat/completions`
- Implement rate limiting, error handling
- Test with OpenAI client libraries
- Deploy to cloud (Railway/Render/Cloud Run)
**Phase 3: Documentation & Compliance (1 week)**
- Write model card
- Define pricing (start free, then $X/1M tokens)
- Create README with examples
- Add safety/ethics section
**Phase 4: Monitoring & Ops (1 week)**
- Set up logging (Sentry)
- Add metrics (Prometheus + Grafana)
- Create incident response playbook
- Define support process (GitHub Issues, Discord)
**Phase 5: Submission**
- Submit to OpenRouter with all required fields
- Wait for review (typically 1-3 business days)
---
## Conclusion
**Do not submit yet.** The project lacks a proper model artifact, API endpoint, benchmarks, and operational infrastructure. Focus on creating a standalone model from the OpenClaw codebase first, then build the submission package.
---
**Checklist completed by:** Subagent (Final Checklist Agent)
**Next steps:** Initiate Phase 1 (model fine-tuning) and Phase 2 (API wrapper) in parallel.
|