Ollama version
Hi.
Could I kindly ask for an upload to Ollama? Base Nevoria is already there:
https://ollama.com/dabl/L3.3-MS-Nevoria-70b-Q4_K_M.gguf
and there is even a place prepared, but no files were uploaded yet:
https://ollama.com/kevinkoehler/L3.3-Nevoria-R1-70b-GGUF
After months of Midnight Miqu I was looking for something else that follows instructions. After playing with Emotion-abliterated and realizing its horrible flaws with spoken dialogue, I could not shake the feeling that a model with good flow, focus and narration as a foundation, plus something to fix the dialogue, would be crazy. And YOU mixed it with two others I was interested in, Anubis and EVA-LLAMA-0.1... and you can die? Gore, uncensored and no positive crap? Just FINALLY. Time to start another crazy story with lewd, unhinged dark humor... I wonder how this LLM will survive my absurd mind.
This might be what I have been looking for for months!
For now I am testing base Nevoria closely, but if you could upload this alternative, I could compare them side by side, live. I would appreciate it.
All the best and THANK YOU.
- Create a new folder 'modelfiles' inside the Ollama directory: C:\Users\USER_NAME\.ollama
- Inside it, create a FILE called 'YourModel'. Don't give it any file extension, like .txt or anything.
- Write into the FILE a FROM line with the path of your downloaded .gguf file, for example: 'FROM c:\Models\L3.3-MS-Nevoria-70b-Q4_K_M.gguf'
- Open cmd in the modelfiles directory: C:\Users\USER_NAME\.ollama\modelfiles
- Run: ollama create Your_Model_fancy_name -f C:\Users\USER_NAME\.ollama\modelfiles\YourModel
- Run: ollama run Your_Model_fancy_name
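The steps above can also be scripted. Below is a minimal Python sketch, not an official Ollama tool: the helper names (`modelfile_text`, `write_modelfile`, `ollama_commands`) are made up for illustration, and the paths are the placeholder ones from the steps. It assumes the .gguf file is already downloaded and that `ollama` is on your PATH.

```python
from pathlib import Path


def modelfile_text(gguf_path: str) -> str:
    """Return the single-line Modelfile body that points Ollama at a local GGUF file."""
    return f"FROM {gguf_path}\n"


def write_modelfile(gguf_path: str, modelfile_dir: Path, file_name: str = "YourModel") -> Path:
    """Create the folder if needed and write an extension-less Modelfile into it."""
    modelfile_dir.mkdir(parents=True, exist_ok=True)
    modelfile = modelfile_dir / file_name
    modelfile.write_text(modelfile_text(gguf_path), encoding="utf-8")
    return modelfile


def ollama_commands(model_name: str, modelfile: Path) -> list[list[str]]:
    """Build the two CLI calls from the steps: register the model, then run it."""
    return [
        ["ollama", "create", model_name, "-f", str(modelfile)],
        ["ollama", "run", model_name],
    ]


if __name__ == "__main__":
    # Placeholder paths from the steps above; nothing is written to disk here.
    gguf = r"c:\Models\L3.3-MS-Nevoria-70b-Q4_K_M.gguf"
    modelfile = Path(r"C:\Users\USER_NAME\.ollama\modelfiles\YourModel")
    print(modelfile_text(gguf), end="")
    for cmd in ollama_commands("Your_Model_fancy_name", modelfile):
        print(" ".join(cmd))
```

To actually apply it, call `write_modelfile(...)` with your real paths and pass each command list to `subprocess.run`. Building the commands as lists avoids quoting problems if a path contains spaces.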
Thanks, but I use Hammer AI... and even when I manage to download the file, it lacks some description files or something, spits out error 200 and removes the entire model. It literally needs to be on that website so I can get it.