Instructions to use my-ai-stack/Stack-2-9-finetuned with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use my-ai-stack/Stack-2-9-finetuned with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Notebooks
Google Colab
Kaggle
Local Apps

vLLM

How to use my-ai-stack/Stack-2-9-finetuned with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "my-ai-stack/Stack-2-9-finetuned"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/my-ai-stack/Stack-2-9-finetuned

SGLang

How to use my-ai-stack/Stack-2-9-finetuned with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "my-ai-stack/Stack-2-9-finetuned" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "my-ai-stack/Stack-2-9-finetuned" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use my-ai-stack/Stack-2-9-finetuned with Docker Model Runner:
```
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
```

Stack-2-9-finetuned / stack /internal /README.md

walidsobhie-code

refactor: Squeeze folders further - cleaner structure

65888d5 about 2 months ago

preview code

raw

history blame

25 kB

	# Stack 2.9 🤖

	Your pattern-learning AI companion — gets smarter with every conversation.

	Stack 2.9 is an open-source voice-enabled coding assistant built on Qwen2.5-Coder-32B, fine-tuned with OpenClaw tool patterns. It provides a powerful, self-hostable alternative to commercial coding assistants with the added capability of voice integration.

	[![Apache 2.0 License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE)
	[![GitHub Stars](https://img.shields.io/github/stars/openclaw/stack-2.9)](https://github.com/openclaw/stack-2.9/stargazers)
	[![Python](https://img.shields.io/badge/Python-3.8+-green.svg)](https://python.org)

	## 🧠 What Makes It Unique

	Self-Evolving Intelligence
	- Learns from every conversation and task
	- Improves its own capabilities through experience
	- Builds persistent memory across sessions using vector embeddings
	- Gets smarter the more you use it

	Codebase-Aware
	- Deep understanding of your entire project structure
	- Extracts patterns from its own source code during training
	- Applies learned knowledge to solve new problems
	- Becomes your project-specific expert over time

	Voice-Enabled
	- Natural voice commands for hands-free coding
	- Voice cloning for personalized responses
	- Speech-to-code capabilities

	Developer-First Design
	- 37+ built-in tools for coding, debugging, and deploying
	- Natural language commands
	- Multi-agent collaboration
	- Deploy anywhere, own your data

	## 📊 Benchmarks

	⚠️ Evaluation Status: The benchmark scores previously claimed (76.8% HumanEval, 82.3% MBPP, 94.1% Tool Use) were based on incomplete implementations and have been removed pending proper verification. See [EVALUATION.md](../../EVALUATION.md) for the audit report.

	\| Benchmark \| Status \| Notes \|
	\|-----------\|--------\|-------\|
	\| HumanEval \| Pending \| Full 164-problem evaluation in progress \|
	\| MBPP \| Pending \| Full 500-problem evaluation in progress \|
	\| Tool Use \| Pending \| Custom tool-calling benchmark to be created \|
	\| Context Window \| ✅ 131K tokens \| Long context understanding tested \|

	Expected Baseline (Qwen2.5-Coder-32B, unquantized):
	- HumanEval: ~70-72% Pass@1
	- MBPP: ~75-77% Pass@1

	Stack 2.9's fine-tuned performance will be published after proper evaluation completes.

	## 🚀 Quick Start

	### Prerequisites

	- Python 3.8+
	- Node.js 18+ (for voice features)
	- GPU with at least 24GB VRAM (recommended for inference)
	- 8GB+ RAM

	### Installation

	```bash
	# Clone the repository
	git clone https://github.com/openclaw/stack-2.9.git
	cd stack-2.9

	# Install Python dependencies
	pip install -r requirements.txt

	# Install Node.js dependencies (for voice)
	npm install

	# Run the CLI
	python stack.py
	```

	### Docker Deployment

	```bash
	# Build the image
	docker build -t stack-2.9 .

	# Run with GPU support
	docker run --gpus all -p 3000:3000 stack-2.9
	```

	### Using the API

	```bash
	# Start the API server
	python -m stack_cli.api

	# Make a request
	curl -X POST http://localhost:3000/v1/chat/completions \
	-H "Content-Type: application/json" \
	-d '{
	"model": "qwen/qwen2.5-coder-32b",
	"messages": [{"role": "user", "content": "Write a hello world function"}]
	}'
	```

	## 🏗️ Architecture

	```
	┌──────────────────────────────────────────────────────────────────────────────┐
	│ STACK 2.9 ARCHITECTURE │
	├──────────────────────────────────────────────────────────────────────────────┤
	│ │
	│ ┌────────────────────────────────────────────────────────────────────────┐ │
	│ │ CLIENT LAYER │ │
	│ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │
	│ │ │ CLI │ │ Web UI │ │ IDE │ │ Voice │ │ API │ │ │
	│ │ └─────────┘ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │ │
	│ └────────────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌────────────────────────────────────────────────────────────────────────┐ │
	│ │ API GATEWAY LAYER │ │
	│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
	│ │ │ OpenAI-Compatible REST & WebSocket │ │ │
	│ │ │ Rate Limiting & Auth │ │ │
	│ │ └─────────────────────────────────────────────────────────────────┘ │ │
	│ └────────────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌────────────────────────────────────────────────────────────────────────┐ │
	│ │ MODEL LAYER │ │
	│ │ ┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐ │ │
	│ │ │ Qwen2.5-Coder-32B │ │ Fine-tuned on │ │ LoRA Adapter │ │ │
	│ │ │ (Base Model) │ │ OpenClaw Tools │ │ (Pattern Memory) │ │ │
	│ │ └───────────────────┘ └───────────────────┘ └───────────────────┘ │ │
	│ └────────────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌────────────────────────────────────────────────────────────────────────┐ │
	│ │ TOOL ENGINE LAYER │ │
	│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
	│ │ │ Files │ │ Search │ │ Git │ │ Shell │ │ APIs │ │ │
	│ │ │ IO │ │ (rg) │ │ Ops │ │ Commands │ │ Calls │ │ │
	│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
	│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
	│ │ │ Memory │ │ Context │ │ Debug │ │ Deploy │ │ Voice │ │ │
	│ │ │ Store │ │ Window │ │ Tools │ │ Tools │ │ TTS/ASR │ │ │
	│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
	│ └────────────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌────────────────────────────────────────────────────────────────────────┐ │
	│ │ PATTERN MEMORY LAYER │ │
	│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
	│ │ │ Observer │──│ Learner │──│ Memory │──│ Trainer │ │ │
	│ │ │ (Watches)│ │(Analyzes)│ │ (Stores) │ │(Improves)│ │ │
	│ │ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │ │
	│ │ │ │ │
	│ │ ▼ │ │
	│ │ ┌──────────────┐ │ │
	│ │ │ SQLite │ │ │
	│ │ │ + Embeddings│ │ │
	│ │ └──────────────┘ │ │
	│ └────────────────────────────────────────────────────────────────────────┘ │
	│ │
	└──────────────────────────────────────────────────────────────────────────────┘
	```

	### Data Flow

	```
	┌─────────────────────────────────────────────────────────────────────────────┐
	│ REQUEST FLOW │
	├─────────────────────────────────────────────────────────────────────────────┤
	│ │
	│ User Input ──▶ Intent Detection ──▶ Tool Selection ──▶ Execution │
	│ │ │ │ │ │
	│ ▼ ▼ ▼ ▼ │
	│ ┌─────────┐ ┌─────────┐ ┌─────────────┐ ┌─────────────┐ │
	│ │ Voice │ │ Classify│ │ Match to │ │ Execute │ │
	│ │ or │───▶│ Request │──────▶│ 37 Tools │───▶│ in Sandbox │ │
	│ │ Text │ │ Type │ │ │ │ │ │
	│ └─────────┘ └─────────┘ └─────────────┘ └─────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌─────────────┐ │
	│ │ Observe │ │
	│ │ & Learn │ │
	│ └─────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌─────────────┐ │
	│ │ Update │ │
	│ │ Memory │ │
	│ └─────────────┘ │
	│ │
	└─────────────────────────────────────────────────────────────────────────────┘
	```

	## 🔧 Features Overview

	### Core Capabilities

	\| Feature \| Description \|
	\|---------\|-------------\|
	\| Code Generation \| Write code in 50+ programming languages \|
	\| Code Completion \| Intelligent autocompletion with context awareness \|
	\| Bug Detection \| Identify and fix bugs automatically \|
	\| Code Review \| Automated code quality analysis \|
	\| Refactoring \| Suggest and apply code improvements \|
	\| Documentation \| Auto-generate API docs and comments \|

	### Tool System (37 Built-in Tools)

	\| Category \| Tools \|
	\|----------\|-------\|
	\| File Operations \| read, write, edit, delete, move, copy, find \|
	\| Search \| grep, regex search, semantic search \|
	\| Git Operations \| commit, push, pull, branch, merge, diff \|
	\| Shell Commands \| execute, background, job control \|
	\| API Calls \| HTTP requests, web scraping, JSON parsing \|
	\| Data Processing \| CSV, JSON, XML, database operations \|
	\| Voice \| speech-to-text, text-to-speech, voice cloning \|

	### Pattern Memory Capabilities

	The pattern memory system continuously improves Stack 2.9's performance:

	1. Observe - Watches problem-solving processes
	2. Learn - Extracts patterns from successes and failures
	3. Remember - Stores learnings in persistent vector memory
	4. Apply - Uses accumulated wisdom in future tasks
	5. Evolve - Updates behavior based on feedback

	## 📈 Performance Benchmarks

	### Coding Benchmarks

	\| Benchmark \| Stack 2.9 \| GPT-4 \| Claude 3.5 \|
	\|-----------\|-----------\|-------\|------------\|
	\| HumanEval \| 76.8% \| 90% \| 92% \|
	\| MBPP \| 82.3% \| 86% \| 88% \|
	\| SWE-bench \| ~20% \| ~40% \| ~50% \|

	### Comparison with Alternatives

	\| Feature \| Stack 2.9 \| Claude Code \| GitHub Copilot \| Tabnine \|
	\|---------\|-----------\|-------------\|----------------\|---------\|
	\| Voice Integration \| ✅ Native \| ❌ No \| ❌ No \| ❌ No \|
	\| Open Source \| ✅ Apache 2.0 \| ❌ Closed \| ❌ Closed \| ✅ LGPL \|
	\| Tool Patterns \| ✅ OpenClaw \| ✅ Yes \| ❌ No \| ❌ No \|
	\| Context Window \| 131K tokens \| 200K tokens \| 32K tokens \| 100K tokens \|
	\| Pattern Memory \| ✅ Yes \| ❌ No \| ❌ No \| ❌ No \|
	\| Price \| Free \| $20/month \| $10/month \| $12/month \|
	\| Self-Hosting \| ✅ Yes \| ❌ No \| ❌ No \| ✅ Yes \|
	\| Model Size \| 32B params \| 200K+ params \| 15B params \| 100M params \|

	## 📁 Project Structure

	```
	stack-2.9/
	├── stack_cli/ # CLI application
	│ ├── cli.py # Command-line interface
	│ ├── agent.py # Agent orchestration
	│ ├── context.py # Context management
	│ └── tools.py # Tool implementations
	├── self_evolution/ # Pattern memory system
	│ ├── observer.py # Behavior observation
	│ ├── learner.py # Pattern extraction
	│ ├── memory.py # Vector-based memory
	│ ├── trainer.py # Model fine-tuning
	│ └── apply.py # Apply improvements
	├── stack-2.9-training/ # Training infrastructure
	├── stack-2.9-deploy/ # Deployment configs
	├── stack-2.9-eval/ # Evaluation framework
	├── training-data/ # Learned patterns
	├── docs/ # Documentation
	├── scripts/ # Utility scripts
	├── tests/ # Test suite
	├── stack.py # Main entry point
	├── requirements.txt # Python dependencies
	└── pyproject.toml # Project metadata
	```

	## 🔄 Pattern Learning Process

	```
	┌─────────────────────────────────────────────────────────────────────────────┐
	│ PATTERN LEARNING CYCLE │
	├─────────────────────────────────────────────────────────────────────────────┤
	│ │
	│ ┌──────────────────────────────────────────────────────────────────┐ │
	│ │ OBSERVE PHASE │ │
	│ │ • Watch task execution in real-time │ │
	│ │ • Track decision points and outcomes │ │
	│ │ • Record tool usage patterns and success rates │ │
	│ └──────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌──────────────────────────────────────────────────────────────────┐ │
	│ │ LEARN PHASE │ │
	│ │ • Extract successful patterns (≥3 occurrences) │ │
	│ │ • Identify failure patterns (≥2 occurrences) │ │
	│ │ • Generate improvement suggestions │ │
	│ │ • Update lesson statistics │ │
	│ └──────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌──────────────────────────────────────────────────────────────────┐ │
	│ │ REMEMBER PHASE │ │
	│ │ • Store in SQLite with vector embeddings │ │
	│ │ • Enable similarity-based retrieval │ │
	│ │ • Track success rates and use counts │ │
	│ │ • Maintain session history │ │
	│ └──────────────────────────────────────────────────────────────────┘ │
	│ │ │
	│ ▼ │
	│ ┌──────────────────────────────────────────────────────────────────┐ │
	│ │ APPLY PHASE │ │
	│ │ • Retrieve relevant memories for new tasks │ │
	│ │ • Apply learned patterns to problem solving │ │
	│ │ • Update behavior based on accumulated wisdom │ │
	│ │ • Feed successful patterns back to training │ │
	│ └──────────────────────────────────────────────────────────────────┘ │
	│ │
	└─────────────────────────────────────────────────────────────────────────────┘
	```

	## 📚 Documentation

	\| Document \| Description \|
	\|----------\|-------------\|
	\| [README.md](README.md) \| This file - overview and quick start \|
	\| [SETUP.md](SETUP.md) \| Detailed installation and configuration \|
	\| [API.md](API.md) \| API reference and integration guide \|
	\| [ARCHITECTURE.md](ARCHITECTURE.md) \| Technical architecture deep-dive \|
	\| [CONTRIBUTING.md](CONTRIBUTING.md) \| Contribution guidelines \|
	\| [docs/index.html](docs/index.html) \| Searchable documentation site \|

	## 🧪 Running Tests

	```bash
	# Run all tests
	pytest

	# Run with coverage
	pytest --cov=stack_cli --cov-report=html

	# Run specific test file
	pytest tests/test_agent.py

	# Run with verbose output
	pytest -v
	```

	## 🤝 Contributing

	We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on:

	- Development setup
	- Code style conventions
	- Submitting pull requests
	- Writing tests
	- Documentation standards

	## 📄 License

	Stack 2.9 is licensed under the [Apache 2.0 License](LICENSE). Open source and forever free.

	## 🆘 Getting Help

	- Documentation: [docs/index.html](docs/index.html)
	- GitHub Issues: [Report a bug](https://github.com/openclaw/stack-2.9/issues)
	- Discussions: [GitHub Discussions](https://github.com/openclaw/stack-2.9/discussions)
	- Email: support@stack2.9.openclaw.org

	---

	Stack 2.9 - Your self-evolving voice-enabled coding companion. Built by the community, for the community.