.PHONY: help install train deploy-local deploy-runpod deploy-vast \
	voice-up voice-down eval eval-tool-use eval-code test \
	lint format check check-types lint-ci clean docs status
help: ## Show this help message
	@echo "Stack 2.9 - Makefile Commands"
	@echo ""
	@echo "Setup:"
	@echo "  install          Install Python and Node dependencies"
	@echo ""
	@echo "Training:"
	@echo "  train            Run full training pipeline"
	@echo "  prepare-data     Prepare training dataset"
	@echo ""
	@echo "Deployment:"
	@echo "  deploy-local     Deploy vLLM server locally with Docker"
	@echo "  deploy-runpod    Deploy to RunPod"
	@echo "  deploy-vast      Deploy to Vast.ai"
	@echo ""
	@echo "Voice:"
	@echo "  voice-up         Start voice integration service"
	@echo "  voice-down       Stop voice service"
	@echo ""
	@echo "Evaluation:"
	@echo "  eval             Run full benchmark suite"
	@echo "  eval-tool-use    Run tool-use evaluation"
	@echo "  eval-code        Run code quality evaluation"
	@echo ""
	@echo "Utilities:"
	@echo "  test             Run unit tests"
	@echo "  lint             Run linters"
	@echo "  clean            Remove build artifacts and temporary files"
	@echo "  docs             Generate documentation"
install: ## Install dependencies
	@echo "📦 Installing dependencies..."
	pip install -r requirements.txt
	cd stack-2.9-training && pip install -r requirements.txt
	cd stack-2.9-voice && pip install -r requirements.txt 2>/dev/null || true
	npm install 2>/dev/null || true
	@echo "✅ Installation complete"
train: ## Run full training pipeline
	@echo "🤖 Starting training pipeline..."
	cd stack-2.9-training && ./run_training.sh
deploy-local: ## Deploy locally with Docker Compose
	@echo "🚀 Deploying to local Docker..."
	cd stack-2.9-deploy && ./local_deploy.sh
deploy-runpod: ## Deploy to RunPod
	@echo "☁️ Deploying to RunPod..."
	cd stack-2.9-deploy && ./runpod_deploy.sh
deploy-vast: ## Deploy to Vast.ai
	@echo "☁️ Deploying to Vast.ai..."
	cd stack-2.9-deploy && ./vastai_deploy.sh
voice-up: ## Start voice integration service
	@echo "🎤 Starting voice service..."
	cd stack-2.9-voice && docker-compose up -d
	@echo "✅ Voice service running on http://localhost:8001"
voice-down: ## Stop voice service
	@echo "🎤 Stopping voice service..."
	cd stack-2.9-voice && docker-compose down
eval: ## Run full benchmark suite
	@echo "📊 Running evaluation suite..."
	cd stack-2.9-eval && ./benchmark_suite.sh
eval-tool-use: ## Run tool-use evaluation
	@echo "🔧 Running tool-use evaluation..."
	cd stack-2.9-eval && python tool_use_eval.py
eval-code: ## Run code quality evaluation
	@echo "✨ Running code quality evaluation..."
	cd stack-2.9-eval && python code_quality_eval.py
test: ## Run unit tests
	@echo "🧪 Running tests..."
	pytest -xvs 2>/dev/null || echo "No pytest tests found"
	cd stack-2.9-voice && python -m pytest test_integration.py 2>/dev/null || true
lint: ## Run ruff linter
	@echo "🔍 Running ruff linter..."
	ruff check .
	@echo "✅ Lint complete"
format: ## Run black formatter
	@echo "🎨 Running black formatter..."
	black .
	@echo "✅ Format complete"
check: ## Run all quality checks
	@echo "🔍 Running all checks (lint + format check + type check)..."
	@echo ""
	@echo "--- Lint (ruff) ---"
	ruff check . || true
	@echo ""
	@echo "--- Format check (black) ---"
	black --check . || true
	@echo ""
	@echo "--- Type check (mypy) ---"
	bash scripts/check_types.sh
	@echo ""
	@echo "✅ All checks complete"
check-types: ## Run mypy type checks
	@echo "🔍 Running mypy type checks..."
	bash scripts/check_types.sh
	@echo "✅ Type check complete"
# ruff exits non-zero when it finds violations, so no extra flag is needed in CI.
lint-ci: ## Run linters (CI-friendly, fail on errors)
	@echo "🔍 Running linters (CI mode)..."
	ruff check .
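# Cache directories are removed with rm -rf because find's -delete fails on non-empty directories.
clean: ## Clean build artifacts
	@echo "🧹 Cleaning..."
	rm -rf data/ output/ models/ logs/
	find . -name "*.pyc" -delete
	find . -type d -name "__pycache__" -prune -exec rm -rf {} +
	find . -type d -name ".pytest_cache" -prune -exec rm -rf {} +
	@echo "✅ Clean complete"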
docs: ## Generate documentation
	@echo "📚 Generating documentation..."
	cd stack-2.9-docs && cp -R ../README.md . 2>/dev/null || true
	@echo "✅ Docs ready in stack-2.9-docs/"
status: ## Show deployment status
	@echo "📊 Stack 2.9 Status"
	@echo "=================="
	@if docker ps | grep -q stack; then \
		echo "✅ vLLM server: running"; \
	else \
		echo "❌ vLLM server: stopped"; \
	fi
	@if docker ps | grep -q voice; then \
		echo "✅ Voice service: running"; \
	else \
		echo "❌ Voice service: stopped"; \
	fi
	@echo ""
	@echo "Directories:"
	@ls -ld training-data/ stack-2.9-*/ 2>/dev/null | awk '{print "  " $$NF}'