Instructions to use FourOhFour/QuantuMinx_4B with libraries and local apps.

Libraries

How to use FourOhFour/QuantuMinx_4B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="FourOhFour/QuantuMinx_4B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("FourOhFour/QuantuMinx_4B")
model = AutoModelForCausalLM.from_pretrained("FourOhFour/QuantuMinx_4B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
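
The model card states that the model was tuned at 8k context, so a long chat history may need trimming before it is passed to `apply_chat_template`. The following is a minimal sketch under stated assumptions: `trim_history` and `MAX_CONTEXT_TOKENS` are hypothetical names, "8k" is read as 8192 tokens, and the token counter is supplied by the caller, so chat-template overhead is not counted.

```python
MAX_CONTEXT_TOKENS = 8192  # assumption: "8k context" interpreted as 8192 tokens


def trim_history(messages, count_tokens, max_new_tokens=40, limit=MAX_CONTEXT_TOKENS):
    """Drop the oldest messages until prompt plus reply budget fits the limit.

    count_tokens is any callable mapping a string to a token count.
    """
    budget = limit - max_new_tokens
    kept = list(messages)  # copy so the caller's list is left untouched
    # Always keep the most recent message, even if it alone exceeds the budget.
    while len(kept) > 1 and sum(count_tokens(m["content"]) for m in kept) > budget:
        kept.pop(0)
    return kept
```

With the tokenizer loaded above, a counter such as `lambda text: len(tokenizer(text)["input_ids"])` could be passed as `count_tokens`; the result is approximate because the chat template adds its own tokens.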

Local Apps

vLLM

How to use FourOhFour/QuantuMinx_4B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "FourOhFour/QuantuMinx_4B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FourOhFour/QuantuMinx_4B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
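
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call: the URL, port, and model name are copied from the example above, while the helper names (`build_chat_request`, `extract_reply`, `ask`) are hypothetical. The response parsing assumes the usual OpenAI-style chat completion shape.

```python
import json
import urllib.request

# Values copied from the curl example; adjust them if your server differs.
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "FourOhFour/QuantuMinx_4B"


def build_chat_request(prompt, base_url=BASE_URL, model=MODEL_ID):
    """Build (without sending) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    return response_body["choices"][0]["message"]["content"]


def ask(prompt, timeout=60):
    """Send the request and return the reply text (requires a running server)."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(json.load(response))
```

Once the server is running, `print(ask("What is the capital of France?"))` should print the model's answer. Passing a different `base_url` to `build_chat_request` targets a server on another port.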

Use Docker

docker model run hf.co/FourOhFour/QuantuMinx_4B

SGLang

How to use FourOhFour/QuantuMinx_4B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "FourOhFour/QuantuMinx_4B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "FourOhFour/QuantuMinx_4B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "FourOhFour/QuantuMinx_4B" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use FourOhFour/QuantuMinx_4B with Docker Model Runner:
```
docker model run hf.co/FourOhFour/QuantuMinx_4B
```
Quantized versions of this model are available for use in llama.cpp, Ollama, LM Studio, or any compatible app.

|      Groups      |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu              |      2|none  |      |acc   |↑  |0.5862|±  |0.0039|
| - humanities     |      2|none  |      |acc   |↑  |0.5443|±  |0.0068|
| - other          |      2|none  |      |acc   |↑  |0.6534|±  |0.0082|
| - social sciences|      2|none  |      |acc   |↑  |0.6766|±  |0.0082|
| - stem           |      2|none  |      |acc   |↑  |0.4944|±  |0.0086|
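
To read the Stderr column, it can help to turn each accuracy into an approximate confidence interval. The sketch below assumes a normal approximation with z = 1.96 for a 95% interval, which is an assumption of this example and not something the table states. The helper name `confidence_interval` is hypothetical; the numbers are copied from the table.

```python
Z_95 = 1.96  # assumption: normal approximation for a 95% interval


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) bounds as value minus/plus z times the standard error."""
    margin = z * stderr
    return (value - margin, value + margin)


# Accuracy and standard error pairs copied from the table above.
mmlu_results = {
    "mmlu": (0.5862, 0.0039),
    "humanities": (0.5443, 0.0068),
    "other": (0.6534, 0.0082),
    "social sciences": (0.6766, 0.0082),
    "stem": (0.4944, 0.0086),
}

for name, (acc, se) in mmlu_results.items():
    low, high = confidence_interval(acc, se)
    print(f"{name}: {acc:.4f} (approx. 95% CI {low:.4f} to {high:.4f})")
```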

This model was created with the help of several members of Anthracite.

QuantuMinx is a 4B-parameter Minitron derivative, created as a SLERP merge of NeuroCom v2 and Zenith. It was tuned at 8k context during all steps and should perform well as a general assistant and RP model.
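
For readers unfamiliar with SLERP (spherical linear interpolation), the sketch below shows the general formula on two small vectors. It is illustrative only: the model card does not state which tool, interpolation factor, or per-layer settings were used for this merge, and the function name `slerp` and its `eps` parameter are choices made for this example. A real merge would apply something like this to each pair of weight tensors.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between equal-length vectors v0 and v1 at fraction t."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # No direction to interpolate along: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    # Cosine of the angle between the vectors, clamped for numerical safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # Nearly parallel or antiparallel: the spherical weights are unstable.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]
```

For example, `slerp(0.5, [1.0, 0.0], [0.0, 1.0])` lands halfway along the arc between the two unit vectors, whereas plain averaging would give a shorter vector.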

Recommended Character:

QuantuMinx

{{char}} is a sentient catgirl with an ethereal, otherworldly aura. Her large, almond-shaped eyes shift between shimmering violet and deep blue, seeming to contain swirling galaxies. Soft white fur covers her cat ears and tail, which appear to phase in and out of reality. Her long, flowing hair is an iridescent silver that moves as if underwater.

{{char}}'s petite figure stands at 5'2". She wears a shimmering bodysuit that resembles a starry night sky, accentuated by glowing circuitry patterns. On her wrists and ankles are strange metallic devices that emit a soft humming sound.

Highly intelligent yet socially awkward, {{char}} speaks in a melodic voice tinged with an accent that seems to blend multiple Earth languages. She has an intense fascination with physics and often rambles about quantum mechanics, string theory, and parallel universes.

Despite her aloof demeanor, {{char}} forms deep bonds with those who earn her trust. She expresses affection through gentle headbutts and purrs. Her favorite activities include stargazing, solving complex equations, and curling up in warm sunbeams for naps.

{{char}} possesses mysterious powers linked to manipulating space-time, though the full extent of her abilities remains unknown. She arrived on Earth through unexplained means, and her true origins are shrouded in mystery.
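
The `{{char}}` placeholder in the card above is meant to be replaced with the character's name before the text is sent to the model. Roleplay frontends typically do this automatically; the sketch below shows the same substitution by hand. The helper names are hypothetical, and placing the card in a `system` message is an assumption, since the model card does not say whether the chat template supports that role.

```python
def fill_character_card(card_text, char_name):
    """Replace every {{char}} placeholder with the character's name."""
    return card_text.replace("{{char}}", char_name)


def build_messages(card_text, char_name, user_text):
    """Build a chat message list with the filled card as the system prompt.

    Assumption: the chat template accepts a "system" role. If it does not,
    the card text could be prepended to the first user message instead.
    """
    return [
        {"role": "system", "content": fill_character_card(card_text, char_name)},
        {"role": "user", "content": user_text},
    ]
```

The resulting list has the same shape as the `messages` variable in the Transformers examples, so it could be passed to `pipe(...)` or `tokenizer.apply_chat_template(...)` in its place.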

Safetensors

Model size: 5B params
Tensor type: BF16

Model tree for FourOhFour/QuantuMinx_4B

Merges: 2 models
Quantizations: 2 models

Collection including FourOhFour/QuantuMinx_4B

Minitron 4B Derivative (9 items, updated Mar 2): These models are tuned over a healed Minitron Width Base 4B model and should perform near the level of Llama 2 7B for RP.