Instructions to use Steelskull/L3.3-Nevoria-R1-70b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Steelskull/L3.3-Nevoria-R1-70b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Steelskull/L3.3-Nevoria-R1-70b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Steelskull/L3.3-Nevoria-R1-70b")
model = AutoModelForCausalLM.from_pretrained("Steelskull/L3.3-Nevoria-R1-70b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Steelskull/L3.3-Nevoria-R1-70b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Steelskull/L3.3-Nevoria-R1-70b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Steelskull/L3.3-Nevoria-R1-70b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
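
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000 and returns the usual OpenAI-style chat completion JSON. The helper names `build_chat_request` and `chat` are hypothetical, chosen for illustration only.

```python
import json
import urllib.request


def build_chat_request(prompt,
                       base_url="http://localhost:8000",
                       model="Steelskull/L3.3-Nevoria-R1-70b"):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes an OpenAI-compatible response: choices[0].message.content.
    """
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running server):
#   print(chat("What is the capital of France?"))
```

For the SGLang server shown further down, the same sketch should work with `base_url="http://localhost:30000"`.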

Use Docker

docker model run hf.co/Steelskull/L3.3-Nevoria-R1-70b

SGLang

How to use Steelskull/L3.3-Nevoria-R1-70b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Steelskull/L3.3-Nevoria-R1-70b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Steelskull/L3.3-Nevoria-R1-70b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Steelskull/L3.3-Nevoria-R1-70b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Steelskull/L3.3-Nevoria-R1-70b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Steelskull/L3.3-Nevoria-R1-70b with Docker Model Runner:
```
docker model run hf.co/Steelskull/L3.3-Nevoria-R1-70b
```

Ollama version

by ScarabParamit - opened Jun 1, 2025

Discussion

ScarabParamit

Jun 1, 2025

Hi.

Can I kindly ask for an upload to Ollama? Base Nevoria is already there:
https://ollama.com/dabl/L3.3-MS-Nevoria-70b-Q4_K_M.gguf
and there is even a place prepared, but no files were uploaded yet:
https://ollama.com/kevinkoehler/L3.3-Nevoria-R1-70b-GGUF

After months of Midnight Miqu I was looking for something else that follows instructions, and after playing with Emotion-abliterated and realizing its horrible flaws with spoken dialogue, I could not shake the feeling that such a model, with good flow, focus and narration as a foundation plus something to fix dialogue, would be crazy, and YOU mixed it with the other two I was interested in, Anubis and EVA-LLAMA-0.1 ...and you can die? Gore, Uncensored and no positive crap? Just FINALLY. Time to start another crazy story with lewd unhinged dark humor... I wonder how this LLM will survive my absurd mind.

This might be what I have been looking for for months!
For now, I am checking base NEVORIA INTIMATELY, but if you could upload this alternative, I could compare them side by side live. I would appreciate it.

All the best and THANK YOU.

010O11

Jun 2, 2025

•

edited Jun 2, 2025

@ScarabParamit

Create a new folder 'modelfiles' in the Ollama directory: C:\Users\USER_NAME\.ollama
Inside it, create a FILE called 'YourModel'. Don't give it any file extension, like .txt or anything.
Write into the FILE the path of your downloaded .gguf file, for example: FROM c:\Models\L3.3-MS-Nevoria-70b-Q4_K_M.gguf
Open cmd in the modelfiles directory: C:\Users\USER_NAME\.ollama\modelfiles
ollama create Your_Model_fancy_name -f C:\Users\USER_NAME\.ollama\modelfiles\YourModel
ollama run Your_Model_fancy_name
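
The steps above can also be scripted. This is a rough sketch, assuming Ollama is installed and the .gguf file is already downloaded. The helper names `write_modelfile` and `ollama_commands` are made up for illustration; the only Ollama-specific pieces are the `FROM` line and the `ollama create ... -f` / `ollama run` commands from the steps above.

```python
from pathlib import Path


def write_modelfile(gguf_path, modelfile_path):
    """Write a one-line Modelfile that points Ollama at a local .gguf file."""
    modelfile = Path(modelfile_path)
    # Create the 'modelfiles' folder (or any parent folders) if missing.
    modelfile.parent.mkdir(parents=True, exist_ok=True)
    modelfile.write_text(f"FROM {gguf_path}\n", encoding="utf-8")
    return modelfile


def ollama_commands(model_name, modelfile_path):
    """Return the two commands from the steps above as argument lists."""
    return [
        ["ollama", "create", model_name, "-f", str(modelfile_path)],
        ["ollama", "run", model_name],
    ]


# Usage (paths and names are placeholders, adjust to your setup):
#   import subprocess
#   mf = write_modelfile(r"c:\Models\L3.3-MS-Nevoria-70b-Q4_K_M.gguf",
#                        Path.home() / ".ollama" / "modelfiles" / "YourModel")
#   for cmd in ollama_commands("Your_Model_fancy_name", mf):
#       subprocess.run(cmd, check=True)
```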

ScarabParamit

Jun 2, 2025

Thanks, but I use Hammer AI... and even when I manage to download the file, it lacks some description files or something, spits out error 200 and removes the entire model. It literally needs to be on that website so I can get it.
