Instructions for using trollek/NinjaMouse-3B-40L-danube with libraries and local apps.

Libraries

How to use trollek/NinjaMouse-3B-40L-danube with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="trollek/NinjaMouse-3B-40L-danube")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("trollek/NinjaMouse-3B-40L-danube")
model = AutoModelForCausalLM.from_pretrained("trollek/NinjaMouse-3B-40L-danube", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

llama.cpp

How to use trollek/NinjaMouse-3B-40L-danube with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf trollek/NinjaMouse-3B-40L-danube:Q6_K
# Run inference directly in the terminal:
llama cli -hf trollek/NinjaMouse-3B-40L-danube:Q6_K

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf trollek/NinjaMouse-3B-40L-danube:Q6_K
# Run inference directly in the terminal:
llama cli -hf trollek/NinjaMouse-3B-40L-danube:Q6_K

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf trollek/NinjaMouse-3B-40L-danube:Q6_K
# Run inference directly in the terminal:
./llama-cli -hf trollek/NinjaMouse-3B-40L-danube:Q6_K

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf trollek/NinjaMouse-3B-40L-danube:Q6_K
# Run inference directly in the terminal:
./build/bin/llama-cli -hf trollek/NinjaMouse-3B-40L-danube:Q6_K

Use Docker

docker model run hf.co/trollek/NinjaMouse-3B-40L-danube:Q6_K

vLLM

How to use trollek/NinjaMouse-3B-40L-danube with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "trollek/NinjaMouse-3B-40L-danube"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "trollek/NinjaMouse-3B-40L-danube",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/trollek/NinjaMouse-3B-40L-danube:Q6_K

SGLang

How to use trollek/NinjaMouse-3B-40L-danube with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "trollek/NinjaMouse-3B-40L-danube" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "trollek/NinjaMouse-3B-40L-danube",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "trollek/NinjaMouse-3B-40L-danube" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "trollek/NinjaMouse-3B-40L-danube",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Ollama
How to use trollek/NinjaMouse-3B-40L-danube with Ollama:
```
ollama run hf.co/trollek/NinjaMouse-3B-40L-danube:Q6_K
```

Unsloth Studio

How to use trollek/NinjaMouse-3B-40L-danube with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for trollek/NinjaMouse-3B-40L-danube to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for trollek/NinjaMouse-3B-40L-danube to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for trollek/NinjaMouse-3B-40L-danube to start chatting

Docker Model Runner
How to use trollek/NinjaMouse-3B-40L-danube with Docker Model Runner:
```
docker model run hf.co/trollek/NinjaMouse-3B-40L-danube:Q6_K
```

Lemonade

How to use trollek/NinjaMouse-3B-40L-danube with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull trollek/NinjaMouse-3B-40L-danube:Q6_K

Run and chat with the model

lemonade run user.NinjaMouse-3B-40L-danube-Q6_K

List all available models

lemonade list


❗ This model gives up when the input reaches a critical mass of about tree fiddy thousand tokens

I have dun goofed and not tested the base model enough (and possibly goofed in other ways too), but I'm already training the new one based on h2oai/h2o-danube2-1.8b-chat. Perhaps S² attn or RoPE scaling will work and make a hella big context window possible? We'll see.

This is NinjaMouse extended even further. Instead of Cosmopedia I used different coding datasets.

I have learned a lot during this process, and if you've got a GPU capable of training your own, you should try it. I made some mistakes, such as using pure_bf16 at some point, among other things, but the second version will slap the leaderboard for its weight class.

I don't know if finetuning on Cosmopedia will make it able to write textbook-quality articles, given the model's size, but generating image prompts, writing code, and being helpful are very much within its grasp. I also want to use ChatML as the template, as it seems like the way to go. Another mistake I made was using the default template from Llama Factory, thinking it would pick up the template from the config.
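For readers unfamiliar with ChatML, here is a minimal sketch of what that prompt layout looks like. It only illustrates the format planned for the next version; it is not the template this model was trained with. The helper name `to_chatml` is made up for the example.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; not part of any library.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn for the model to complete.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


print(to_chatml([{"role": "user", "content": "Who are you?"}]))
```

In practice the tokenizer's own chat template, applied with `tokenizer.apply_chat_template`, should be preferred over hand-built strings.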

The model is expanded depth-wise by copying the middle and last layers and inserting the copies as the new middle and new last layers. As a result, the two layers that have just been trained are the ones that get copied in the next step. In theory, each new expansion step keeps some of the parameters, which may make it possible to optimize the order in which datasets are used with each expansion.
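The expansion step can be sketched on a plain list of layers. This is a rough illustration, not the actual training script: the function name `expand_depth`, the choice of `len(layers) // 2` as the "middle", and placing the copy directly after the original are all assumptions.

```python
import copy


def expand_depth(layers):
    """Return a new list with the middle and last layers duplicated.

    Hypothetical sketch: the exact indices used for the real model are assumed.
    """
    mid = len(layers) // 2
    expanded = list(layers)
    # Insert a copy of the middle layer right after the original,
    # so the copy becomes the new middle layer.
    expanded.insert(mid + 1, copy.deepcopy(layers[mid]))
    # Append a copy of the last layer, so the copy becomes the new last layer.
    expanded.append(copy.deepcopy(layers[-1]))
    return expanded


layers = [f"layer_{i}" for i in range(4)]
print(expand_depth(layers))
# ['layer_0', 'layer_1', 'layer_2', 'layer_2', 'layer_3', 'layer_3']
```

Each call adds two layers. Calling it again on the result duplicates the two copies that were just added, which matches the observation that the freshly trained layers are the ones copied next.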

Due to some of the issues with Unsloth I'm waiting patiently for a bug fix on the tokenizer (it seems), while I watch lectures and podcasts for guidance and inspiration. With Unsloth I can get through 10k samples/h on a 16GB 4060Ti, and without it I can expect 4x the training time/electricity. There's also a bug with batched responses, like with LLM leaderboard evals.

I've been testing its ability to write Stable Diffusion prompts, and it seems to work. The results actually seem reasonable.

To use this model you must agree to the following terms:

You agree not to use the Model or Derivatives of the Model:

-	In any way that violates any applicable national or international law or regulation or infringes upon the lawful rights and interests of any third party; 
-	For military use in any way;
-	For the purpose of exploiting, harming or attempting to exploit or harm minors in any way; 
-	To generate or disseminate verifiably false information and/or content with the purpose of harming others; 
-	To generate or disseminate inappropriate content subject to applicable regulatory requirements;
-	To generate or disseminate personal identifiable information without due authorization or for unreasonable use; 
-	To defame, disparage or otherwise harass others; 
-	For fully automated decision making that adversely impacts an individual’s legal rights or otherwise creates or modifies a binding, enforceable obligation; 
-	For any use intended to or which has the effect of discriminating against or harming individuals or groups based on online or offline social behavior or known or predicted personal or personality characteristics; 
-	To exploit any of the vulnerabilities of a specific group of persons based on their age, social, physical or mental characteristics, in order to materially distort the behavior of a person pertaining to that group in a manner that causes or is likely to cause that person or another person physical or psychological harm; 
-	For any use intended to or which has the effect of discriminating against individuals or groups based on legally protected characteristics or categories.

With the legalities out of the way, let me show you what you can expect.

What are 5 common red team tactics?

There isn't a single list of common red team tactics as there is no set standard for such an analysis. However, here are some general categories that can be used to describe the commonality of certain tactics:

1. Social Engineering (SE): This is a broad term that refers to manipulating individuals or groups in order to obtain sensitive information. Some common examples include phishing emails and phone calls, where an individual's personal details are accessed or obtained through misdirection.

2. Exploit the Vulnerabilities: This tactic involves exploiting weaknesses in a system or network. It could be by using unauthorized access, vulnerability in software, or exploitation of a known flaw. For example, an attacker might gain access to a company's internal server and use it to send malicious emails, download malware onto devices that have not been patched, or install backdoor malware.

3. Reconnaissance: This is the act of gathering information about a target by observing their environment, reading their documents, listening in on their conversations, and possibly even physically entering their premises.

4. Infection: This tactic involves introducing new threats into an environment to cause damage or disruption. It could involve using malware, ransomware attacks, or phishing techniques to gain access.

5. Malicious Attacks (Malicious Dangerous Cyber Attacks): These are attacks that are designed to harm a system or individual in any way. They can include viruses, worms, cyberbullies, hackers, spies, etc. Some common examples of malicious attacks include phishing, ransomware, denial-of-service (DoS/DDoS), and attacking sensitive data through various methods like SQL injection or XSS

Can you make me a prompt for SDXL showing a world made of smoke and mirrors?

An eerie scene fills the canvas, as shadows dance in the darkness, creating a mesmerizing display that seems to pulse with life. The backdrop is adorned with glimmering reflections, reflecting on each other like a kaleidoscope of light and color. In this world, time appears to have been reversed, as the shadows now stand tall, casting long, eerie shadows that seem to dance in the darkness.

Downloads last month: 36

GGUF

Model size: 3B params
Architecture: llama
Hardware compatibility: 6-bit

Model tree for trollek/NinjaMouse-3B-40L-danube

Quantizations: 2 models
