How to use my-ai-stack/Stack-2-9-finetuned with libraries and local apps.

Libraries

How to use my-ai-stack/Stack-2-9-finetuned with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
model = AutoModelForCausalLM.from_pretrained("my-ai-stack/Stack-2-9-finetuned")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
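The pipeline snippet above calls `pipe(messages)` without showing how to read the result. The sketch below assumes the behavior of recent Transformers releases, where chat-style input yields a list with one dict whose `generated_text` holds the whole conversation, ending with the assistant turn. The helper name `last_assistant_reply` and the sample data are hypothetical and only illustrate the shape.

```python
from typing import Any, Dict, List


def last_assistant_reply(pipe_output: List[Dict[str, Any]]) -> str:
    """Return the final reply from a text-generation pipeline result."""
    generated = pipe_output[0]["generated_text"]
    # Plain-string prompts produce a string; chat prompts are assumed to
    # produce the full list of messages with the new reply appended last.
    if isinstance(generated, str):
        return generated
    return generated[-1]["content"]


# Hand-written stand-in for a pipeline result, not real model output.
sample_output = [
    {
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "(model reply)"},
        ]
    }
]
print(last_assistant_reply(sample_output))
```

With a real pipeline this would be called as `last_assistant_reply(pipe(messages))`.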

Local Apps

vLLM

How to use my-ai-stack/Stack-2-9-finetuned with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "my-ai-stack/Stack-2-9-finetuned"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
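The same request can be sent from Python instead of curl. This is a minimal sketch using only the standard library; the function names `build_chat_request` and `send_chat_request` are hypothetical, and the default URL assumes the server started above is listening on `localhost:8000`.

```python
import json
import urllib.request


def build_chat_request(
    prompt: str,
    model: str = "my-ai-stack/Stack-2-9-finetuned",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Once the server is up, `send_chat_request(build_chat_request("What is the capital of France?"))` should return the decoded response.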

Use Docker

docker model run hf.co/my-ai-stack/Stack-2-9-finetuned

SGLang

How to use my-ai-stack/Stack-2-9-finetuned with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "my-ai-stack/Stack-2-9-finetuned" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "my-ai-stack/Stack-2-9-finetuned" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "my-ai-stack/Stack-2-9-finetuned",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
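Both servers above expose an OpenAI-compatible endpoint, so the reply text is expected under `choices[0].message.content` in the response JSON. The sketch below pulls it out; `extract_reply` is a hypothetical helper and the sample body is hand-written for illustration, not captured from a real server.

```python
import json


def extract_reply(response_body: str) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Illustrative response body only; real responses carry more fields.
sample_body = json.dumps(
    {
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": "(model reply)"},
            }
        ]
    }
)
print(extract_reply(sample_body))
```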

Docker Model Runner
How to use my-ai-stack/Stack-2-9-finetuned with Docker Model Runner:
```
docker model run hf.co/my-ai-stack/Stack-2-9-finetuned
```

Stack-2-9-finetuned/stack/eval/benchmarks/bigbench.py

walidsobhie-code

refactor: Squeeze folders further - cleaner structure

65888d5 about 2 months ago

"""
BIG-Bench Hard benchmark implementation
"""

from typing import Any, Dict, List


class BIGBenchHard:
    def __init__(self):
        self.benchmark_name = "BIG-Bench Hard"
        self.test_cases = self._load_test_cases()
        self.total_cases = len(self.test_cases)

    def _load_test_cases(self) -> List[Dict[str, str]]:
        """Load BIG-Bench Hard test cases"""
        # This would typically load from a file or API
        # For now, return a placeholder structure
        return [
            {
                "description": "Logical reasoning problem",
                "prompt": "If all cats are mammals and all mammals are animals, are all cats animals?",
                "answer": "Yes"
            },
            {
                "description": "Common sense reasoning",
                "prompt": "What happens when you drop a glass on a hard floor?",
                "answer": "It breaks"
            },
            {
                "description": "Mathematical reasoning",
                "prompt": "If a train travels 60 miles in 1.5 hours, what is its average speed?",
                "answer": "40 mph"
            }
            # Add more test cases here
        ]

    def evaluate(self, model_name: str) -> Dict[str, Any]:
        """Evaluate model against BIG-Bench Hard benchmark"""
        correct_answers = 0

        for test_case in self.test_cases:
            prompt = test_case["prompt"]

            # Simulate model response
            response = self._generate_response(model_name, prompt)

            # Check if answer is correct
            if self._check_answer(response, test_case["answer"]):
                correct_answers += 1

        accuracy = correct_answers / self.total_cases if self.total_cases > 0 else 0

        # Only one response is generated per prompt, so the pass_at_* fields
        # all hold the same raw count of correct answers, not pass@k rates.
        return {
            "pass_at_1": correct_answers,
            "pass_at_3": correct_answers,  # Simplified for now
            "pass_at_5": correct_answers,  # Simplified for now
            "total_cases": self.total_cases,
            "accuracy": accuracy,
            "benchmark": self.benchmark_name
        }

    def _generate_response(self, model_name: str, prompt: str) -> str:
        """Generate response using the specified model"""
        # This would call the actual model API
        # For now, return a placeholder (always "Yes", whatever the prompt)
        return "Yes"

    def _check_answer(self, response: str, correct_answer: str) -> bool:
        """Check if the response matches the correct answer"""
        try:
            response = response.strip().lower()
            correct_answer = correct_answer.strip().lower()
        except AttributeError:
            # A non-string response or answer counts as incorrect
            return False

        # Simple string comparison for now
        return response == correct_answer


if __name__ == "__main__":
    benchmark = BIGBenchHard()
    results = benchmark.evaluate("test_model")
    print(f"BIG-Bench Hard Results: {results}")
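The `pass_at_3` and `pass_at_5` fields above are marked as simplified and currently repeat the single-sample count. One common way to fill them in, which is not part of this file, is the unbiased pass@k estimator used for code-generation benchmarks: draw `n` responses per prompt, count the `c` correct ones, and compute `1 - C(n - c, k) / C(n, k)`. The sketch below assumes that sampling setup; `pass_at_k` and its parameter names are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate the chance that at least one of k sampled responses is correct.

    n: responses generated for the prompt
    c: how many of those responses were correct
    k: sample budget being scored (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # comb(n - c, k) is 0 when fewer than k incorrect responses exist,
    # which makes the estimate exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Example: 5 responses for one prompt, 2 of them correct.
for k in (1, 3, 5):
    print(f"pass@{k} = {pass_at_k(n=5, c=2, k=k):.2f}")
```

Averaging this value over all test cases would give a benchmark-level pass@k, provided `_generate_response` is extended to return several samples per prompt.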