Instructions for using TomPei/YOCO with Transformers, vLLM, SGLang, and Docker Model Runner.

Libraries

How to use TomPei/YOCO with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="TomPei/YOCO")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the output is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("TomPei/YOCO")
model = AutoModelForCausalLM.from_pretrained("TomPei/YOCO")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
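
The slice in the last line above drops the prompt tokens before decoding, because `generate` on a causal language model returns the prompt followed by the newly generated tokens. A minimal sketch of that slicing with plain Python lists; the helper `strip_prompt` and the token IDs are made up for illustration and are not part of Transformers:

```python
# Toy illustration with plain lists; the token IDs below are hypothetical.
def strip_prompt(sequence, prompt_length):
    """Return only the tokens that come after the prompt."""
    return sequence[prompt_length:]

prompt_ids = [101, 7592, 2040, 102]          # stand-in for inputs["input_ids"][0]
generated = prompt_ids + [2023, 2003, 1037]  # stand-in for outputs[0]: prompt + new tokens
new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [2023, 2003, 1037]
```

Without the slice, the decoded string would begin with the chat-formatted prompt rather than the model's reply.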

Local Apps

vLLM

How to use TomPei/YOCO with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "TomPei/YOCO"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TomPei/YOCO",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
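
The same request can be built from Python using only the standard library. This is a sketch that assumes the vLLM server from the previous step is listening on `localhost:8000`; the helper name `build_chat_request` is hypothetical. It constructs the request without sending it, so it runs even when no server is up:

```python
import json
import urllib.request

def build_chat_request(base_url, model, prompt):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_chat_request(
    "http://localhost:8000", "TomPei/YOCO", "What is the capital of France?"
)
print(request.full_url)

# Sending requires the server to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```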

Use Docker

docker model run hf.co/TomPei/YOCO

SGLang

How to use TomPei/YOCO with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "TomPei/YOCO" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TomPei/YOCO",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
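
Because the endpoint is OpenAI-compatible, the JSON it returns should follow the chat-completions layout, with the reply under `choices[0].message.content`. A minimal sketch of pulling out that text; `extract_reply` is a hypothetical helper and the sample dictionary is hand-written in that layout, not real model output:

```python
def extract_reply(response):
    """Return the assistant text from a chat-completions response dict."""
    return response["choices"][0]["message"]["content"]

# Hand-written sample in the chat-completions shape; not real model output.
sample_response = {
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "<reply text>"}}
    ]
}
print(extract_reply(sample_response))
```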

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "TomPei/YOCO" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "TomPei/YOCO",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use TomPei/YOCO with Docker Model Runner:
```
docker model run hf.co/TomPei/YOCO
```

license: apache-2.0

YOCO Model

Model description

YOCO is a state-of-the-art natural language processing model designed for a wide range of NLP tasks such as text classification, sentiment analysis, and question answering. It has been trained on a large corpus of text data and fine-tuned for optimal performance.

Limitations and bias

The YOCO model, like all machine learning models, may carry biases from its training data. Users should account for these biases when using the model in sensitive applications.

Ethical considerations

Special attention has been given to ensuring that YOCO adheres to ethical guidelines in AI development, including fairness, accountability, and transparency. Users are encouraged to use this model responsibly.

Citation

If you use YOCO in your research, please cite it using the following BibTeX entry:

@inproceedings{yoco2024,
  title={YOCO: A High-Performance Model for NLP},
  author={Author Name},
  booktitle={HuggingFace Model Hub},
  year={2024}
}

Acknowledgments

We would like to thank the HuggingFace community for providing the infrastructure and tools that made the development of YOCO possible.


Safetensors

Model size: 1B params
Tensor type: BF16
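
The model size and tensor type give a rough lower bound on the memory needed just to hold the weights. The sketch below assumes exactly 1,000,000,000 parameters (the listed "1B" is rounded) and 2 bytes per BF16 value; it ignores activations, the KV cache, and framework overhead, so real usage will be higher:

```python
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits

def weight_memory_gib(num_params, bytes_per_param=BYTES_PER_BF16):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3

approx_params = 1_000_000_000  # assumption: "1B params" taken literally
print(f"{weight_memory_gib(approx_params):.2f} GiB")  # 1.86 GiB
```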