Libraries

How to use Dogoo3/Aletheia-12B with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Dogoo3/Aletheia-12B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Dogoo3/Aletheia-12B")
model = AutoModelForCausalLM.from_pretrained("Dogoo3/Aletheia-12B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

Local Apps

vLLM

How to use Dogoo3/Aletheia-12B with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Dogoo3/Aletheia-12B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Dogoo3/Aletheia-12B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
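
The same request can be sent from Python. The following is a minimal sketch using only the standard library; it assumes the vLLM server started above is listening on `localhost:8000`, and the helper names `build_chat_request` and `ask` are hypothetical, not part of vLLM.

```python
import json
import urllib.request

# Assumption: the vLLM server from the commands above is reachable here.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "Dogoo3/Aletheia-12B"


def build_chat_request(user_message, base_url=BASE_URL, model=MODEL_ID):
    """Build (but do not send) a POST request equivalent to the curl call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(user_message):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(user_message)) as response:
        body = json.load(response)
    # OpenAI-compatible responses carry the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(ask("What is the capital of France?"))` should print the model's reply. For the SGLang server, passing a `base_url` on port 30000 to `build_chat_request` would be the only change.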

SGLang

How to use Dogoo3/Aletheia-12B with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Dogoo3/Aletheia-12B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Dogoo3/Aletheia-12B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Dogoo3/Aletheia-12B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Dogoo3/Aletheia-12B",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Docker Model Runner
How to use Dogoo3/Aletheia-12B with Docker Model Runner:
```
docker model run hf.co/Dogoo3/Aletheia-12B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Aletheia-12B

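Aletheia-12B is a merge of the following models using mergekit:
- Dogoo3/MN-HyperNovaIrix-12B
- ohyeah1/Violet-Lyra-Gutenberg-v2
- redrix/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS
- yamatazen/FusionEngine-12B-Lorablated
- redrix/patricide-12B-Unslop-Mell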

This was an experiment in getting a more intelligent and creative 12B model for my own personal use, and I decided to put it out in the wild.

Feel free to merge it or go wild with it!

Recommended Settings

This is what I personally use but feel free to adjust or change to your needs.

Instruction Template: ChatML
Temperature: 1.0
Min-P: 0.05
Repetition Penalty: 1.05
DRY Sampler: Multiplier 0.8, Base 1.75
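
For the Transformers route, the values above can be collected into keyword arguments for `model.generate`. This is a rough sketch, assuming a Transformers version recent enough to accept `min_p`. The names `RECOMMENDED_SAMPLING` and `generation_kwargs` are hypothetical, the `max_new_tokens` default is an arbitrary placeholder, and the DRY sampler is left out because it is configured in whichever backend or frontend exposes it.

```python
# Recommended sampler values from this card, expressed as generate() kwargs.
# Assumption: the installed transformers version supports `min_p`.
RECOMMENDED_SAMPLING = {
    "do_sample": True,           # sampling must be on for temperature/min_p to apply
    "temperature": 1.0,
    "min_p": 0.05,
    "repetition_penalty": 1.05,
}


def generation_kwargs(max_new_tokens=256, **overrides):
    """Return the recommended settings, letting callers override any value.

    `max_new_tokens=256` is an illustrative default, not a recommendation
    from this card.
    """
    return {**RECOMMENDED_SAMPLING, "max_new_tokens": max_new_tokens, **overrides}
```

With `model` and `inputs` prepared as in the Transformers example, a call would look like `model.generate(**inputs, **generation_kwargs())`.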

Prompt Template (ChatML)

```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
```
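
When sending raw text to a completion endpoint, the template can be filled in by hand. The helper below is a small sketch of that; `format_chatml` is a hypothetical name, and the trailing newline after the assistant tag is an assumption about how the open turn should end. With `tokenizer.apply_chat_template` the tokenizer builds the prompt itself, so this is only for cases where that is not available.

```python
def format_chatml(system_message, user_message):
    """Fill the ChatML template, ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


prompt = format_chatml("You are a helpful assistant.", "Who are you?")
```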

Possible Bugs/Issues

May still produce a few refusals.
Can be repetitive, based on my personal testing.
I find it talks as {user} once the context passes 10k, but your experience may vary.

Examples

{TBA}

Credits & Acknowledgements

This model wouldn't exist without the incredible work of the open-source community:

Quantization: Huge thanks to mradermacher for providing the high-quality GGUF and iMatrix quants.
Base Models: Thanks to the creators of the constituent parts:
- yamatazen (FusionEngine/EsotericSage)
- ohyeah1 (Violet-Lyra)
- redrix (AngelSlayer/Patricide)
Tools: Merged using LazyMergekit by Maxime Labonne, a fantastic tool!

Configuration

```
models:
  - model: Dogoo3/MN-HyperNovaIrix-12B
  - model: ohyeah1/Violet-Lyra-Gutenberg-v2
  - model: redrix/AngelSlayer-12B-Unslop-Mell-RPMax-DARKNESS
  - model: yamatazen/FusionEngine-12B-Lorablated
  - model: redrix/patricide-12B-Unslop-Mell
merge_method: model_stock
base_model: Dogoo3/MN-HyperNovaIrix-12B
normalize: false
dtype: bfloat16
```

Safetensors

Model size: 12B params
Tensor type: BF16
