Tags: Text Generation · Transformers · English · qwen2 · code-generation · python · fine-tuning · Qwen · tools · agent-framework · multi-agent · conversational
Instructions for using my-ai-stack/Stack-2-9-finetuned with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use my-ai-stack/Stack-2-9-finetuned with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
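The model is also tagged for tools and agent-framework use. If the fine-tune keeps Qwen's tool-calling chat template, tool schemas can be passed through `apply_chat_template`. A minimal sketch under that assumption; `get_weather` is a hypothetical example tool, not part of the model card:

```python
# Sketch: tool-calling via the chat template. Assumes the fine-tune keeps
# Qwen's tool-calling template; get_weather is a made-up example tool.
from transformers import AutoTokenizer, AutoModelForCausalLM

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny"  # placeholder implementation

tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned")

messages = [{"role": "user", "content": "What is the weather in Paris?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    tools=[get_weather],  # schema is derived from the signature and docstring
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```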
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use my-ai-stack/Stack-2-9-finetuned with vLLM:
Install from pip and serve the model:
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "my-ai-stack/Stack-2-9-finetuned"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "my-ai-stack/Stack-2-9-finetuned",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
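Because the server exposes an OpenAI-compatible API, it can also be called from Python. A minimal sketch, assuming the server started above is reachable on localhost:8000 and the `openai` package is installed:

```python
# Query the vLLM server through its OpenAI-compatible endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # the vLLM server started above
    api_key="EMPTY",  # vLLM does not check the key unless one is configured
)
response = client.chat.completions.create(
    model="my-ai-stack/Stack-2-9-finetuned",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```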
Use Docker

```bash
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
```
- SGLang
How to use my-ai-stack/Stack-2-9-finetuned with SGLang:
Install from pip and serve the model:
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "my-ai-stack/Stack-2-9-finetuned" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "my-ai-stack/Stack-2-9-finetuned",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images
```bash
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
    --model-path "my-ai-stack/Stack-2-9-finetuned" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "my-ai-stack/Stack-2-9-finetuned",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

- Docker Model Runner
How to use my-ai-stack/Stack-2-9-finetuned with Docker Model Runner:
```bash
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
```
The repository Makefile collects the setup, training, deployment, voice, evaluation, and utility targets:

```makefile
.PHONY: help install train deploy-local deploy-runpod deploy-vast voice-up voice-down \
	eval eval-tool-use eval-code test lint format check check-types lint-ci clean docs status

help: ## Show this help message
	@echo "Stack 2.9 - Makefile Commands"
	@echo ""
	@echo "Setup:"
	@echo "  install         Install Python and Node dependencies"
	@echo ""
	@echo "Training:"
	@echo "  train           Run full training pipeline"
	@echo "  prepare-data    Prepare training dataset"
	@echo ""
	@echo "Deployment:"
	@echo "  deploy-local    Deploy vLLM server locally with Docker"
	@echo "  deploy-runpod   Deploy to RunPod"
	@echo "  deploy-vast     Deploy to Vast.ai"
	@echo ""
	@echo "Voice:"
	@echo "  voice-up        Start voice integration service"
	@echo "  voice-down      Stop voice service"
	@echo ""
	@echo "Evaluation:"
	@echo "  eval            Run full benchmark suite"
	@echo "  eval-tool-use   Run tool-use evaluation"
	@echo "  eval-code       Run code quality evaluation"
	@echo ""
	@echo "Utilities:"
	@echo "  test            Run unit tests"
	@echo "  lint            Run linters"
	@echo "  clean           Remove build artifacts and temporary files"
	@echo "  docs            Generate documentation"

install: ## Install dependencies
	@echo "Installing dependencies..."
	pip install -r requirements.txt
	cd stack-2.9-training && pip install -r requirements.txt
	cd stack-2.9-voice && pip install -r requirements.txt 2>/dev/null || true
	npm install 2>/dev/null || true
	@echo "Installation complete"

train: ## Run full training pipeline
	@echo "Starting training pipeline..."
	cd stack-2.9-training && ./run_training.sh

deploy-local: ## Deploy locally with Docker Compose
	@echo "Deploying to local Docker..."
	cd stack-2.9-deploy && ./local_deploy.sh

deploy-runpod: ## Deploy to RunPod
	@echo "Deploying to RunPod..."
	cd stack-2.9-deploy && ./runpod_deploy.sh

deploy-vast: ## Deploy to Vast.ai
	@echo "Deploying to Vast.ai..."
	cd stack-2.9-deploy && ./vastai_deploy.sh

voice-up: ## Start voice integration service
	@echo "Starting voice service..."
	cd stack-2.9-voice && docker-compose up -d
	@echo "Voice service running on http://localhost:8001"

voice-down: ## Stop voice service
	@echo "Stopping voice service..."
	cd stack-2.9-voice && docker-compose down

eval: ## Run full benchmark suite
	@echo "Running evaluation suite..."
	cd stack-2.9-eval && ./benchmark_suite.sh

eval-tool-use: ## Run tool-use evaluation
	@echo "Running tool-use evaluation..."
	cd stack-2.9-eval && python tool_use_eval.py

eval-code: ## Run code quality evaluation
	@echo "Running code quality evaluation..."
	cd stack-2.9-eval && python code_quality_eval.py

test: ## Run unit tests
	@echo "Running tests..."
	pytest -xvs 2>/dev/null || echo "No pytest tests found"
	cd stack-2.9-voice && python -m pytest test_integration.py 2>/dev/null || true

lint: ## Run ruff linter
	@echo "Running ruff linter..."
	ruff check .
	@echo "Lint complete"

format: ## Run black formatter
	@echo "Running black formatter..."
	black .
	@echo "Format complete"

check: ## Run all quality checks
	@echo "Running all checks (lint + format check + type check)..."
	@echo ""
	@echo "--- Lint (ruff) ---"
	ruff check . || true
	@echo ""
	@echo "--- Format check (black) ---"
	black --check . || true
	@echo ""
	@echo "--- Type check (mypy) ---"
	bash scripts/check_types.sh
	@echo ""
	@echo "All checks complete"

check-types: ## Run mypy type checks
	@echo "Running mypy type checks..."
	bash scripts/check_types.sh
	@echo "Type check complete"

lint-ci: ## Run linters (CI-friendly, fail on errors)
	@echo "Running linters (CI mode)..."
	ruff check .  # exits non-zero on lint errors by default

clean: ## Clean build artifacts
	@echo "Cleaning..."
	rm -rf data/ output/ models/ logs/
	find . -name "*.pyc" -delete
	find . -type d -name "__pycache__" -exec rm -rf {} +
	find . -type d -name ".pytest_cache" -exec rm -rf {} +
	@echo "Clean complete"

docs: ## Generate documentation
	@echo "Generating documentation..."
	cd stack-2.9-docs && cp -R ../README.md . 2>/dev/null || true
	@echo "Docs ready in stack-2.9-docs/"

status: ## Show deployment status
	@echo "Stack 2.9 Status"
	@echo "=================="
	@if docker ps | grep -q stack; then \
		echo "vLLM server: running"; \
	else \
		echo "vLLM server: stopped"; \
	fi
	@if docker ps | grep -q voice; then \
		echo "Voice service: running"; \
	else \
		echo "Voice service: stopped"; \
	fi
	@echo ""
	@echo "Directories:"
	@ls -ld training-data/ stack-2.9-*/ 2>/dev/null | awk '{print "  " $$NF}'
```
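The `## description` comments on each target would also support a self-documenting `help` rule, so new targets show up in the help output without editing the echo list by hand. A common pattern, sketched here rather than taken from the repository:

```makefile
# Sketch: derive the help text from the "## description" comments above.
# $(MAKEFILE_LIST) expands to the Makefile(s) being read; targets without
# a "## " comment are skipped.
help:
	@grep -E '^[a-zA-Z0-9_-]+:.*?## ' $(MAKEFILE_LIST) | \
		awk 'BEGIN {FS = ":.*?## "} {printf "  %-16s %s\n", $$1, $$2}'
```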