Instructions for using saucam/Orpomis-Prime-7B with libraries and local apps.

Libraries

How to use saucam/Orpomis-Prime-7B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="saucam/Orpomis-Prime-7B")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("saucam/Orpomis-Prime-7B")
model = AutoModelForCausalLM.from_pretrained("saucam/Orpomis-Prime-7B")

Local Apps

vLLM

How to use saucam/Orpomis-Prime-7B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "saucam/Orpomis-Prime-7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "saucam/Orpomis-Prime-7B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
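
For callers who prefer Python over curl, the same request can be sketched with the standard library alone. The helper names `build_completion_request` and `send` are hypothetical and not part of vLLM; the URL, model name, and parameters are copied from the curl call above. This is a minimal sketch that assumes the server is reachable on localhost:8000.

```python
import json
import urllib.request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and decode the JSON body (hypothetical helper)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Same values as the curl example; nothing is sent until send() is called.
request = build_completion_request(
    "http://localhost:8000", "saucam/Orpomis-Prime-7B", "Once upon a time,"
)
# With the server running: result = send(request)
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.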

Use Docker

docker model run hf.co/saucam/Orpomis-Prime-7B

SGLang

How to use saucam/Orpomis-Prime-7B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "saucam/Orpomis-Prime-7B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "saucam/Orpomis-Prime-7B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
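
The server replies with JSON in the OpenAI completions layout, where the generated text sits under `choices[0].text`. The sketch below pulls that text out of an already-decoded response. The function name `extract_completion_text` is hypothetical, and `example_response` is an illustrative stand-in rather than real output from this model.

```python
def extract_completion_text(response):
    """Return the text of the first choice in an OpenAI-style completions response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"]


# Illustrative response shape only; the text is made up, not model output.
example_response = {
    "object": "text_completion",
    "model": "saucam/Orpomis-Prime-7B",
    "choices": [
        {"index": 0, "text": " there was a fox.", "finish_reason": "length"}
    ],
}

text = extract_completion_text(example_response)
print(text)
```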

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "saucam/Orpomis-Prime-7B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "saucam/Orpomis-Prime-7B",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use saucam/Orpomis-Prime-7B with Docker Model Runner:
```
docker model run hf.co/saucam/Orpomis-Prime-7B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Orpomis-Prime-7B

Orpomis-Prime-7B is a Mergekit merge of kaist-ai/mistral-orpo-beta and NousResearch/Hermes-2-Pro-Mistral-7B, with alpindale/Mistral-7B-v0.2-hf as the base model.

🧩 Configuration

models:
  - model: kaist-ai/mistral-orpo-beta
  - model: NousResearch/Hermes-2-Pro-Mistral-7B
  - model: alpindale/Mistral-7B-v0.2-hf
merge_method: model_stock
base_model: alpindale/Mistral-7B-v0.2-hf
dtype: bfloat16
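
To give a feel for what `merge_method: model_stock` does, here is a toy sketch of the Model Stock idea on a single flattened weight vector. It is not Mergekit's implementation. It assumes the interpolation ratio t = k·cos θ / (1 + (k − 1)·cos θ), where k is the number of fine-tuned models and θ is the angle between their weight deltas from the base, and it assumes the base model is excluded from the averaged set. The function name `model_stock_merge` is hypothetical.

```python
import math


def model_stock_merge(base, finetuned):
    """Toy Model Stock merge of one flattened weight vector (illustration only)."""
    k = len(finetuned)
    if k < 2:
        raise ValueError("need at least two fine-tuned models")

    # Deltas of each fine-tuned model relative to the base weights.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Average pairwise cosine similarity between the deltas.
    cosines = []
    for i in range(k):
        for j in range(i + 1, k):
            dot = sum(a * b for a, b in zip(deltas[i], deltas[j]))
            norm = math.sqrt(sum(a * a for a in deltas[i])) * math.sqrt(
                sum(b * b for b in deltas[j])
            )
            cosines.append(dot / norm if norm else 0.0)
    cos_theta = sum(cosines) / len(cosines)

    # Assumed interpolation ratio between the fine-tuned average and the base.
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    average = [sum(ws) / k for ws in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(average, base)]
    return merged, t


# Identical deltas (cos = 1) keep the fine-tuned average;
# orthogonal deltas (cos = 0) fall back to the base weights.
same, t_same = model_stock_merge([0.0, 0.0], [[1.0, 0.0], [1.0, 0.0]])
orth, t_orth = model_stock_merge([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

Under these assumptions, the more the fine-tuned models agree on the direction of change, the further the merged weights move from the base toward their average.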

💻 Usage

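# Notebook cell: install dependencies (drop the leading "!" in a regular shell)
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "saucam/Orpomis-Prime-7B"
messages = [{"role": "user", "content": "What is a large language model?"}]

# The tokenizer comes from kaist-ai/mistral-orpo-beta, one of the merged models;
# its chat template turns the message list into a single prompt string.
tokenizer = AutoTokenizer.from_pretrained("kaist-ai/mistral-orpo-beta")
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline; device_map="auto" relies on accelerate for placement.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens; the returned text includes the prompt.
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])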

Sample Run

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
<|user|>
What is a large language model?</s>
<|assistant|>
A large language model (LLM) is a type of artificial intelligence model that is designed to understand and generate human language. It is a deep learning algorithm, usually based on artificial neural networks, that can process and analyze large amounts of natural language data, such as text or speech, and produce responses that mimic human language. LLMs are trained on vast amounts of data to learn the statistical patterns in the language, enabling them to understand and generate text that is similar to how humans would use it. They are typically used in applications like chatbots, language translation, text summarization, text generation, and other natural language processing tasks. Some popular examples of large language models include OpenAI's GPT-3, Google's BERT, and Hugging Face's Transformers.

These models are trained on a massive amount of text data, often from the internet, books, and other sources to learn the structure and patterns of language. They can generate text that is similar to what a human would write, making them useful for tasks like chatbots, text generation, language translation, and other natural language processing applications.

LLMs are especially useful for tasks that require understanding and generating human-like text, such as chatbots

<|user|>
Write a children's story where characters are from a famous movie</s>
<|assistant|>
Once upon a time in the magical land of Oz, there lived a little girl named Dorothy and her pet dog, Toto. They lived in Kansas with her Aunt Em and Uncle Henry on a small farm. Dorothy was a cheerful and kind-hearted girl who loved to play with her friends and explore the beautiful countryside. But one day, a tornado came and swept her and Toto away to the Land of Oz! They landed in a strange and colorful world full of Munchkins, witches, and wizards.

Dorothy found herself in a place unlike any other. She was lost and scared, but she met some amazing friends on her journey to find her way back home. First, she met Scarecrow, who longed for a brain to become smart. Then, she met the Tin Man who desired a heart, and the Cowardly Lion who wanted courage. Together, they set off on an adventure to meet the Wizard of Oz to ask for help. Along the way, they met Glinda, the Good Witch of the North, who told them about the Wizard's Emerald City. They all wanted to see the Wizard for help, but first, they had
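
The prompt layout visible in the sample run above (`<|user|>`, the message, `</s>`, then `<|assistant|>`) can be approximated by hand, which helps when sending raw prompts to a completions endpoint. This is a sketch inferred from that output, not the tokenizer's actual chat template. The function name `format_chat_prompt` is hypothetical, and the exact whitespace produced by `apply_chat_template` may differ.

```python
def format_chat_prompt(messages, add_generation_prompt=True, eos_token="</s>"):
    """Approximate the chat layout seen in the sample run (assumed, not official)."""
    parts = []
    for message in messages:
        # Each turn: role tag on its own line, then the content closed by the EOS token.
        parts.append(f"<|{message['role']}|>\n{message['content']}{eos_token}\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)


prompt = format_chat_prompt(
    [{"role": "user", "content": "What is a large language model?"}]
)
print(prompt)
```

For real use, `tokenizer.apply_chat_template` remains the safer choice, since it follows whatever template ships with the tokenizer.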

Safetensors

Model size: 7B params
Tensor type: BF16
