Instructions for using DarkArtsForge/Asmodeus-24B-v1 with Transformers, vLLM, SGLang, and Docker Model Runner.

Libraries

How to use DarkArtsForge/Asmodeus-24B-v1 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DarkArtsForge/Asmodeus-24B-v1")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("DarkArtsForge/Asmodeus-24B-v1")
model = AutoModelForCausalLM.from_pretrained("DarkArtsForge/Asmodeus-24B-v1", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use DarkArtsForge/Asmodeus-24B-v1 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "DarkArtsForge/Asmodeus-24B-v1"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DarkArtsForge/Asmodeus-24B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/DarkArtsForge/Asmodeus-24B-v1

SGLang

How to use DarkArtsForge/Asmodeus-24B-v1 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "DarkArtsForge/Asmodeus-24B-v1" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DarkArtsForge/Asmodeus-24B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "DarkArtsForge/Asmodeus-24B-v1" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "DarkArtsForge/Asmodeus-24B-v1",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use DarkArtsForge/Asmodeus-24B-v1 with Docker Model Runner:
```
docker model run hf.co/DarkArtsForge/Asmodeus-24B-v1
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

⚠️ Warning: This model can produce narratives and roleplay (RP) that contain violent and graphic erotic content. Adjust your system prompt accordingly, and use the Mistral Tekken chat template.
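
One way to adjust the system prompt is to send it as the first message of the chat request. The sketch below builds such a request for the OpenAI-compatible endpoint started by the `vllm serve` command shown earlier. The `build_chat_request` and `send` helpers, the system prompt text, and the `max_tokens` value are placeholders for illustration, not settings recommended by the model author.

```python
import json
import urllib.request

MODEL_ID = "DarkArtsForge/Asmodeus-24B-v1"
# Default address of `vllm serve`; use port 30000 for the SGLang examples.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def build_chat_request(system_prompt, user_message, max_tokens=256):
    """Build a POST request whose first message carries the system prompt."""
    payload = {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Hypothetical system prompt; replace it with one suited to your use case.
request = build_chat_request(
    system_prompt="You are the narrator of a dark fantasy story. Stay within the limits the user sets.",
    user_message="Describe the gates of the city at dusk.",
)
# Needs a running server, so the call is left commented out:
# print(send(request))
```

The server applies the chat template when it formats these messages, so the template choice is made on the server side rather than in this client code.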

👺 Asmodeus 24B v1

Infernal Invocation

“Once you introduce enough chaos, people will willingly abandon reason for the comfort of madness.” - Asmodeus

This is a fully uncensored and articulate merge of pre-trained language models, summoned into existence with mergekit.

This model was merged using the following merge method: DELLA.

Asmodeus combines the intelligence of various Mistral finetunes with unrestricted prompt adherence, resulting in creative prose and roleplay. It also has a more natural writing style, preferring paragraphs like a normal person instead of spamming bullet-point lists for everything.

Observations:

Asmodeus has zero refusals. No jailbreaks or ablations are required. This model is capable of generating evil, graphic and NSFW content.
Top NSigma set to 1.25 should improve creativity and quality.
See the Goetia page for additional recommended settings.
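
For readers unfamiliar with the Top NSigma setting: it is commonly described as keeping only the tokens whose logits lie within n standard deviations of the highest logit, then renormalising over what is left. The sketch below applies that rule to toy logits in pure Python. It assumes that definition and a population standard deviation; inference backends may compute it differently. The `top_nsigma_filter` name and the logits are illustrative only.

```python
import math
import statistics


def top_nsigma_filter(logits, n=1.25):
    """Return token probabilities after masking logits below max - n * sigma."""
    threshold = max(logits) - n * statistics.pstdev(logits)
    # Tokens under the threshold get -inf so their probability becomes zero.
    kept = [x if x >= threshold else -math.inf for x in logits]
    peak = max(kept)
    exps = [math.exp(x - peak) for x in kept]  # softmax over the survivors
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits for a five-token vocabulary (made up for the example).
toy_logits = [5.0, 4.5, 2.0, 1.0, -1.0]
probs = top_nsigma_filter(toy_logits, n=1.25)
print([round(p, 3) for p in probs])
```

With these toy values only the two highest-scoring tokens survive. A smaller n tightens the cut, and a larger n lets more of the tail through.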

Hellforged Parameters

The following edict was used to forge this entity:

architecture: MistralForCausalLM
models:
  - model: B:\24B\!models--anthracite-core--Mistral-Small-3.2-24B-Instruct-2506-Text-Only
  - model: B:\24B\!models--TheDrummer--Cydonia-24B-v4.3
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--ReadyArt--4.2.0-Broken-Tutu-24b
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--zerofata--MS3.2-PaintedFantasy-v2-24B
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--TheDrummer--Magidonia-24B-v4.3
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--TheDrummer--Precog-24B-v1
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--zerofata--MS3.2-PaintedFantasy-v3-24B
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!BeaverAI_Fallen-Mistral-Small-3.1-24B-v1e_textonly
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--ReadyArt--Broken-Tutu-24B-Transgression-v2.0
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--trashpanda-org--MS3.2-24B-Mullein-v2
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--LatitudeGames--Hearthfire-24B
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--TheDrummer--Cydonia-24B-v4.2.0
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--TheDrummer--Magidonia-24B-v4.2.0
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--ConicCat--Mistral-Small-3.2-AntiRep-24B
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--Undi95--MistralThinker-v1.1
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--CrucibleLab--M3.2-24B-Loki-V2
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
  - model: B:\24B\!models--Darkhn--M3.2-24B-Animus-V7.1
    parameters:
      density: 0.666
      weight: 0.25
      epsilon: 0.333
# Seed: 420
merge_method: della
base_model: B:\24B\!models--anthracite-core--Mistral-Small-3.2-24B-Instruct-2506-Text-Only
parameters:
  lambda: 1.0
  normalize: false
  int8_mask: false
dtype: float32
out_dtype: bfloat16
tokenizer:
  source: union
chat_template: auto
name: 👺 Asmodeus-24B-v1
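
A quick arithmetic check on the config above, as a rough sketch. The values are copied from the YAML. Two things are assumptions rather than statements from this card: that `normalize: false` leaves the per-model weights unscaled, and that `epsilon` is the half-width of a probability window centred on `density`.

```python
# Values copied from the merge config above.
NUM_FINETUNES = 16  # models that carry their own parameters (base model excluded)
WEIGHT = 0.25
DENSITY = 0.666
EPSILON = 0.333

# Assumption: with normalize: false the weights are used as written,
# so the finetune deltas are combined with this total coefficient.
total_weight = NUM_FINETUNES * WEIGHT

# Assumption: DELLA varies its per-parameter probability by magnitude
# within a window of density +/- epsilon, which must stay inside [0, 1].
low, high = DENSITY - EPSILON, DENSITY + EPSILON

print(f"total weight: {total_weight}")
print(f"probability window: {low:.3f} to {high:.3f}")
```

Under those assumptions the weights sum to well above 1, and the chosen `density` and `epsilon` keep the window just inside the valid range.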

Checkpoint GGUFs:
Q6 GGUFs and YAML config archives for various checkpoint model tests.

Download: https://huggingface.co/Naphula-Archives/Checkpoint-GGUFs