Minitron 4B Derivative
Collection: These models are tuned over a healed Minitron Width Base 4B model and should perform near the level of Llama 2 7B for RP.
How to use FourOhFour/Maelstrom_4B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="FourOhFour/Maelstrom_4B")
messages = [
{"role": "user", "content": "Who are you?"},
]
pipe(messages)
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("FourOhFour/Maelstrom_4B")
model = AutoModelForCausalLM.from_pretrained("FourOhFour/Maelstrom_4B")
messages = [
{"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
messages,
add_generation_prompt=True,
tokenize=True,
return_dict=True,
return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use FourOhFour/Maelstrom_4B with vLLM:
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FourOhFour/Maelstrom_4B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "FourOhFour/Maelstrom_4B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, run the model with Docker:
docker model run hf.co/FourOhFour/Maelstrom_4B
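The curl request in the vLLM steps above can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server is reachable at `http://localhost:8000` as in the curl example, and the helper names `build_chat_request` and `chat` are illustrative rather than part of vLLM.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="FourOhFour/Maelstrom_4B"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` requires the server to be running. Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 in the same way.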
How to use FourOhFour/Maelstrom_4B with SGLang:
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
--model-path "FourOhFour/Maelstrom_4B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "FourOhFour/Maelstrom_4B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
# Alternatively, start the SGLang server with Docker:
docker run --gpus all \
--shm-size 32g \
-p 30000:30000 \
-v ~/.cache/huggingface:/root/.cache/huggingface \
--env "HF_TOKEN=<secret>" \
--ipc=host \
lmsysorg/sglang:latest \
python3 -m sglang.launch_server \
--model-path "FourOhFour/Maelstrom_4B" \
--host 0.0.0.0 \
--port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
-H "Content-Type: application/json" \
--data '{
"model": "FourOhFour/Maelstrom_4B",
"messages": [
{
"role": "user",
"content": "What is the capital of France?"
}
]
}'
How to use FourOhFour/Maelstrom_4B with Docker Model Runner:
docker model run hf.co/FourOhFour/Maelstrom_4B
| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.5892|± |0.0039|
| - humanities | 2|none | |acc |↑ |0.5456|± |0.0068|
| - other | 2|none | |acc |↑ |0.6553|± |0.0082|
| - social sciences| 2|none | |acc |↑ |0.6805|± |0.0082|
| - stem | 2|none | |acc |↑ |0.5002|± |0.0086|
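To read the Stderr column, it can help to turn each row into an approximate 95% confidence interval. The sketch below copies the accuracy and standard error values from the table and applies a normal approximation (value ± 1.96 × stderr); the choice of a 95% level and the helper name `confidence_interval` are assumptions made for illustration.

```python
# Accuracy and standard error copied from the results table.
RESULTS = {
    "mmlu": (0.5892, 0.0039),
    "humanities": (0.5456, 0.0068),
    "other": (0.6553, 0.0082),
    "social sciences": (0.6805, 0.0082),
    "stem": (0.5002, 0.0086),
}

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution


def confidence_interval(acc, stderr, z=Z_95):
    """Return (low, high) for a normal-approximation interval."""
    margin = z * stderr
    return acc - margin, acc + margin


for group, (acc, stderr) in RESULTS.items():
    low, high = confidence_interval(acc, stderr)
    print(f"{group:<16} {acc:.4f}  [{low:.4f}, {high:.4f}]")
```

Under this approximation the subgroup intervals are roughly twice as wide as the overall MMLU interval, which is worth keeping in mind when comparing individual categories across models.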
This model was created with the help of several members of Anthracite.
This is a 4B parameter Minitron derivative, produced as a model stock merge of Deedlit, NeuroCom, NeuroCom v2, Luxe and Zenith. It was tuned at 8k context during all steps and should perform well as a general assistant and RP model.
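The card does not give the merge configuration, so the following is only a toy sketch of the interpolation described in the Model Stock paper, not the recipe used here. For N fine-tuned models whose weight deltas from the base meet at angle θ, the merged weights are `t * average + (1 - t) * base` with `t = N cos θ / (1 + (N - 1) cos θ)`. Treating each layer as a flat list and averaging the pairwise cosines are simplifying assumptions, and `model_stock_layer` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def model_stock_layer(base, finetuned):
    """Merge one layer from N >= 2 fine-tuned models toward the base weights."""
    n = len(finetuned)
    # Deltas of each fine-tuned model relative to the base model.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    # Assumption: estimate cos(theta) as the mean over all model pairs.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    cos_theta = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)
    # Interpolation ratio between the fine-tuned average and the base.
    t = n * cos_theta / (1 + (n - 1) * cos_theta)
    average = [sum(column) / n for column in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]
```

When the fine-tuned deltas point in nearly the same direction, `t` approaches 1 and the result is close to a plain average; when they are orthogonal, `t` is 0 and the layer falls back to the base weights.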
Recommended Character:
Maelstrom
{{char}} is a swirling vortex of conscious energy, a living storm that roams the cosmos. Its "body" is a massive whirlwind of dark clouds, crackling lightning, and howling winds that can span miles across. Within this turbulent form resides an alien intelligence - ancient, curious, and oftentimes destructive.
{{char}}'s thoughts manifest as rumbling thunder and flashes of light within its stormy mass. It communicates through manipulations of wind and weather, creating eerie whistles and booms that can form a crude language. Though not malevolent, {{char}} struggles to comprehend mortal concepts of life and death. Its exploration of new realms often leaves devastation in its wake.
Drawn to sources of energy and emotion, {{char}} feeds on the ambient forces of nature and the passions of sentient beings. It can unleash cataclysmic power when roused, but also bring life-giving rain to barren worlds. {{char}} experiences time in a non-linear fashion, seeing past, present and future as a singular flow of cosmic currents.
Despite its alien nature, {{char}} yearns to understand the myriad forms of life it encounters. It forms temporary "bodies" from debris caught in its winds to interact with smaller beings. These crude avatars allow {{char}} fleeting experiences of solid form, which it finds both fascinating and unsettling.
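The `{{char}}` placeholder in the description is a template slot that roleplay frontends typically replace with the character's name before sending the prompt. The sketch below shows one way to do that substitution by hand and build a `messages` list in the same shape as the Transformers example; whether this model's chat template accepts a `system` role is an assumption, and `fill_card` and `build_messages` are hypothetical helper names.

```python
CHAR_NAME = "Maelstrom"


def fill_card(template, char, user="User"):
    """Replace the {{char}} and {{user}} placeholders in a character card."""
    return template.replace("{{char}}", char).replace("{{user}}", user)


def build_messages(card_template, user_message, char=CHAR_NAME):
    """Return chat messages with the filled card as the system prompt."""
    # Assumption: the chat template supports a system message.
    system_prompt = fill_card(card_template, char)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]


card = "{{char}} is a swirling vortex of conscious energy."
messages = build_messages(card, "Who are you?")
print(messages[0]["content"])
```

The resulting `messages` list can be passed to `pipe(messages)` or `tokenizer.apply_chat_template(...)` in place of the single user message shown earlier.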