# OpenRouter Submission - Stack 2.9
## Model Information
**Base Model**: Qwen/Qwen2.5-Coder-32B
**Fine-Tuned Version**: Stack 2.9 (OpenClaw tool patterns)
**Context Length**: 131072 tokens
**Architecture**: Transformer-based
**Parameters**: 32 billion
## Capabilities
### Core Capabilities
- **Code Generation**: Multi-language code writing and completion
- **Tool Use**: Native integration with OpenClaw tool patterns
- **Voice Integration Ready**: Compatible with voice cloning systems
- **API Compatibility**: OpenAI-compatible endpoints
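
In practice, "OpenAI-compatible" means a client sends the standard chat-completions JSON body to a `/chat/completions` path. The sketch below only builds such a request and does not send it. The base URL, model identifier, and API key are placeholders assumed for illustration; this submission does not publish any of them.

```python
import json
import urllib.request

# Assumptions: BASE_URL and MODEL_ID are placeholders, not values
# published in this submission.
BASE_URL = "https://example.invalid/v1"
MODEL_ID = "stack-2.9"


def build_chat_request(prompt: str, api_key: str = "YOUR_KEY") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completions request."""
    body = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_chat_request("Write a function that reverses a string.")
print(req.full_url)
print(json.loads(req.data)["messages"][0]["content"])
```

Passing `req` to `urllib.request.urlopen` would send it once a real endpoint and key are substituted.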
### Advanced Features
- **Context Understanding**: 128K-token context window (131,072 tokens)
- **Multi-file Operations**: Work across entire codebases
- **Error Detection**: Identify and suggest fixes
- **Code Review**: Automated quality analysis
- **Documentation Generation**: Auto-create API docs
## Pricing Proposal
### Free Tier
- **Token Limit**: 100,000 tokens/day
- **Concurrent Requests**: 5
- **Features**: All core capabilities
### Pay-Per-Use
- **Tier 1**: $0.50 per 1M tokens
- **Tier 2**: $0.40 per 1M tokens (for volumes > 100M tokens)
- **Tier 3**: $0.30 per 1M tokens (for volumes > 500M tokens)
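
The tiers above can be turned into a small cost calculator. This sketch assumes the rate of the tier reached applies to the entire volume (flat pricing, not marginal). The proposal does not specify that, nor the billing period over which volume is measured.

```python
from decimal import Decimal

# (volume threshold in tokens, USD per 1M tokens), highest tier first.
# Assumption: the matching rate is applied to the whole volume.
TIERS = [
    (500_000_000, Decimal("0.30")),  # Tier 3: volumes > 500M tokens
    (100_000_000, Decimal("0.40")),  # Tier 2: volumes > 100M tokens
    (0, Decimal("0.50")),            # Tier 1: all smaller volumes
]


def rate_per_million(total_tokens: int) -> Decimal:
    """Return the USD rate per 1M tokens for a given total volume."""
    for threshold, rate in TIERS:
        if total_tokens > threshold:
            return rate
    return TIERS[-1][1]  # zero tokens still maps to the Tier 1 rate


def cost_usd(total_tokens: int) -> Decimal:
    """Total cost in USD under the flat-tier assumption."""
    return Decimal(total_tokens) / Decimal(1_000_000) * rate_per_million(total_tokens)


for volume in (50_000_000, 200_000_000, 600_000_000):
    print(volume, cost_usd(volume))
```

Under this flat reading, the total bill drops when a volume crosses a threshold (100M tokens costs $50, slightly more than 100M costs about $40). A marginal scheme that prices each band separately would avoid that.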
### Enterprise
- **Custom Pricing**: Contact for volume discounts
- **SLA**: 99.9% uptime guarantee
- **Support**: Priority support included
## Review Process Timeline
### Submission Phase (Week 1)
- Initial submission and documentation review
- Model capabilities verification
- API endpoint testing
### Testing Phase (Weeks 2-3)
- Performance benchmarking
- Safety and bias evaluation
- Integration testing
### Approval Phase (Week 4)
- Final review and approval
- Listing preparation
- Launch planning
## Contact Information
**Primary Contact**: Stack 2.9 Team
**Email**: stack29@openclaw.org
**Website**: https://stack2.9.openclaw.org
**GitHub**: https://github.com/my-ai-stack/stack-2.9
## Unique Value Proposition
### Why Stack 2.9?
1. **Voice-Enabled Coding**: The only open-source coding assistant with native voice integration
2. **Tool Pattern Excellence**: Fine-tuned on OpenClaw's extensive tool-use patterns
3. **Cost-Effective**: Significantly cheaper than commercial alternatives
4. **Self-Hosting Freedom**: Apache 2.0 license allows unrestricted deployment
5. **Community-Driven**: Developed by the open-source community
### Competitive Advantages
- **Voice Integration**: Unlike Claude Code or GitHub Copilot, Stack 2.9 supports voice commands
- **Open Source**: Fully transparent with Apache 2.0 licensing
- **Tool Patterns**: Specialized in OpenClaw tool patterns for superior tool use
- **Cost**: Free tier available, pay-per-use model
- **Flexibility**: Self-hosting option for complete control
### Target Markets
- **Individual Developers**: Free tier for hobbyists and students
- **Startups**: Cost-effective alternative to commercial solutions
- **Enterprises**: Self-hosting option for data privacy
- **Educational Institutions**: Open source for learning and research
## Safety and Ethics
### Safety Measures
- **Bias Mitigation**: Fine-tuning includes bias reduction techniques
- **Content Filtering**: Built-in content safety filters
- **Tool Validation**: All tool calls are validated before execution
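
The submission does not describe how tool calls are validated. One common approach, sketched here under that caveat, is to check the tool name against an allowlist and the arguments against declared types before anything runs. The registry and tool names below are hypothetical.

```python
from typing import Any

# Hypothetical registry: tool name -> {argument name: expected type}.
TOOL_SPECS: dict[str, dict[str, type]] = {
    "read_file": {"path": str},
    "run_tests": {"target": str, "timeout_s": int},
}


def validate_tool_call(name: str, args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call may run."""
    spec = TOOL_SPECS.get(name)
    if spec is None:
        return [f"unknown tool: {name}"]
    problems = []
    # Sorted so the error order is deterministic.
    for key in sorted(spec.keys() - args.keys()):
        problems.append(f"missing argument: {key}")
    for key in sorted(args.keys() - spec.keys()):
        problems.append(f"unexpected argument: {key}")
    for key, expected in spec.items():
        if key in args and not isinstance(args[key], expected):
            problems.append(f"{key} must be {expected.__name__}")
    return problems


print(validate_tool_call("run_tests", {"target": "tests/", "timeout_s": "30"}))
```

Returning every problem at once, rather than raising on the first, lets the caller feed the full list back to the model so it can retry.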
### Ethical Considerations
- **Open Source**: Transparent development process
- **Community Governance**: Community-driven development
- **Responsible AI**: Committed to ethical AI development
## Performance Metrics
### Benchmark Results
- **HumanEval**: 75% pass@1 (estimated)
- **MBPP**: 80% pass@1 (estimated)
- **Tokens/Second**: 25-30 tokens/second on A100 GPU
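
The pass@1 figures above are labeled as estimates, and the submission does not say how they were obtained. For reference, this sketch shows the standard unbiased pass@k estimator used with HumanEval-style benchmarks. The per-problem sample counts in the example are made up for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Probability that at least one of k samples drawn from n is correct.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Illustrative data only: four problems, one sample each, three solved.
print(benchmark_score([(1, 1), (1, 1), (1, 1), (1, 0)], k=1))
```

With `k=1` the estimator reduces to the fraction of correct samples, so a 75% pass@1 means about three in four problems are solved on the first attempt.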
### Latency
- **Average Response Time**: 2-3 seconds
- **Streaming**: Real-time response generation
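
The throughput and latency figures can be cross-checked with simple arithmetic. At 25-30 tokens/second, a 2-3 second response corresponds to roughly 50-90 output tokens. This sketch assumes prompt processing time is negligible; the submission does not say what the average response time covers.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float,
                       first_token_delay_s: float = 0.0) -> float:
    """Estimated wall-clock time to generate a response of a given length."""
    return first_token_delay_s + output_tokens / tokens_per_second


def tokens_within(seconds: float, tokens_per_second: float) -> int:
    """How many tokens fit in a time budget at a given decode speed."""
    return int(seconds * tokens_per_second)


# Response sizes implied by the stated latency and throughput ranges.
print(tokens_within(2, 25), tokens_within(3, 30))

# A longer, 300-token completion at the two ends of the throughput range.
print(generation_seconds(300, 30), generation_seconds(300, 25))
```

By the same arithmetic, a 300-token completion would take about 10-12 seconds, which is why streaming matters for longer outputs.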
---
**Stack 2.9** - Revolutionizing coding with voice and open source. Ready for OpenRouter listing approval.