Instructions for using StentorLabs/Stentor-30M-Instruct with libraries and local apps.

Libraries

How to use StentorLabs/Stentor-30M-Instruct with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="StentorLabs/Stentor-30M-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so it is visible when run as a script, not only in a notebook
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("StentorLabs/Stentor-30M-Instruct")
model = AutoModelForCausalLM.from_pretrained("StentorLabs/Stentor-30M-Instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
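The slice in the final `print` line above is needed because, for decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones. The sketch below illustrates the same slicing on plain Python lists; the helper name `strip_prompt` and the token IDs are made up for illustration and are not real Stentor tokens.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that come after the prompt."""
    return output_ids[prompt_length:]


# Hypothetical token IDs, for illustration only.
prompt_ids = [101, 7592, 2088]
output_ids = prompt_ids + [2023, 2003, 102]  # prompt followed by the completion

completion = strip_prompt(output_ids, len(prompt_ids))
print(completion)  # [2023, 2003, 102]
```

In the snippet above, `inputs["input_ids"].shape[-1]` plays the role of `prompt_length`.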

Local Apps

vLLM

How to use StentorLabs/Stentor-30M-Instruct with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "StentorLabs/Stentor-30M-Instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StentorLabs/Stentor-30M-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
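The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server started above is listening on `localhost:8000`; the helper names `build_chat_payload` and `chat` are hypothetical, and `max_tokens` is an optional request field added here for illustration.

```python
import json
import urllib.request

MODEL_ID = "StentorLabs/Stentor-30M-Instruct"


def build_chat_payload(prompt, model=MODEL_ID, max_tokens=64):
    """Build the JSON body for an OpenAI-style /v1/chat/completions request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def chat(prompt, base_url="http://localhost:8000", timeout=30):
    """POST the prompt to a running server and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` should print the model's reply. Passing `base_url="http://localhost:30000"` would target the SGLang port used on this page instead.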

Use Docker

docker model run hf.co/StentorLabs/Stentor-30M-Instruct

SGLang

How to use StentorLabs/Stentor-30M-Instruct with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "StentorLabs/Stentor-30M-Instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StentorLabs/Stentor-30M-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "StentorLabs/Stentor-30M-Instruct" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StentorLabs/Stentor-30M-Instruct",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use StentorLabs/Stentor-30M-Instruct with Docker Model Runner:
```
docker model run hf.co/StentorLabs/Stentor-30M-Instruct
```

Welcome and Guidelines

Pinned discussion by StentorLabs (Owner), opened Feb 22

Welcome to the official discussion hub for Stentor-30M-Instruct!
We are excited to have you here. This space is dedicated to sharing feedback, asking questions, and collaborating on the development and application of the Stentor-30M-Instruct model. Whether you are using it for research, fine-tuning it for a specific task, or deploying it on edge devices, your input is invaluable.
To ensure this community remains a helpful and productive space for everyone, please follow these guidelines:
🌟 How to Use This Discussion Board
Questions & Support: If you’re having trouble running the model or need help with implementation, please check the Model Card first. If your question isn’t answered there, feel free to start a new thread.
Showcase Your Work: Did you fine-tune Stentor-30M-Instruct on a unique dataset? Are you using it in a cool project? We’d love to see it! Share your results and links to your spaces or repos.
Feature Requests & Feedback: Because Stentor-30M-Instruct is a 30M-parameter model, we are constantly looking for ways to optimize its performance. Let us know what features or architectural improvements you'd like to see.
📜 Community Guidelines
Be Respectful: Maintain a professional and welcoming tone. We are all here to learn.
Search Before Posting: Before opening a new topic, please use the search bar to see if your question has already been answered.
Provide Context: When reporting a bug or unexpected behavior, please include the environment you are using (e.g., Transformers version, hardware), a minimal code snippet to reproduce the issue, and the expected vs. actual results.
Follow the HF Code of Conduct: We adhere to the Hugging Face Community Code of Conduct.
Report Misuse: If you find any safety concerns or misuse of the model, please use the "Report" button or open a private issue.
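For the "Provide Context" guideline above, one way to gather the environment details is a short script like the sketch below. It is only a suggestion, not an official reporting tool: the helper names `collect_environment` and `format_report` and the default package list are assumptions, and it reads installed package versions through the standard library without importing the packages themselves.

```python
import platform
import sys
from importlib import metadata


def collect_environment(packages=("transformers", "torch", "tokenizers")):
    """Return a dict describing the Python environment for a bug report."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Report missing packages instead of failing.
            info[name] = "not installed"
    return info


def format_report(info):
    """Render the environment dict as 'key: value' lines for pasting into a post."""
    return "\n".join(f"{key}: {value}" for key, value in info.items())


print(format_report(collect_environment()))
```

The printed lines can be pasted into a bug report alongside hardware details (CPU or GPU model), which this sketch does not detect.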
Thank you for being part of the StentorLabs journey. Let’s build something great together!

StentorLabs pinned discussion Feb 22
