Instructions for using ProtoNeuron-3/NeuNego-3B-Dark with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use ProtoNeuron-3/NeuNego-3B-Dark with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ProtoNeuron-3/NeuNego-3B-Dark")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("ProtoNeuron-3/NeuNego-3B-Dark", dtype="auto")
```
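Note that `AutoModel` loads only the bare backbone; for text generation you typically want a causal-LM head plus the tokenizer's chat template. A minimal sketch, assuming the repo ships a standard Qwen2.5-style chat template (sampling values here are illustrative):

```python
# Minimal generation sketch with an explicit tokenizer and causal-LM head.
# Assumes the repo provides a chat template; sampling values are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ProtoNeuron-3/NeuNego-3B-Dark"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(model_id, dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```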
- llama-cpp-python
How to use ProtoNeuron-3/NeuNego-3B-Dark with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ProtoNeuron-3/NeuNego-3B-Dark",
    filename="NeuNego_3B_v2_f16.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
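`create_chat_completion` returns an OpenAI-style completion dict; a quick sketch of pulling the reply text out of the result:

```python
# The completion is an OpenAI-style dict; extract the assistant's reply text.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response["choices"][0]["message"]["content"])
```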
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use ProtoNeuron-3/NeuNego-3B-Dark with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Use pre-built binary
```sh
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
./llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
./build/bin/llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Use Docker
```sh
docker model run hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16
```
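Once `llama-server` is running, any OpenAI-compatible client can talk to it. A minimal Python sketch, assuming the server's default port 8080:

```python
# Minimal sketch: query the local llama-server OpenAI-compatible endpoint.
# Assumes the default llama-server port (8080); adjust if you changed it.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        # llama-server serves the single loaded model; this field is informational
        "model": "ProtoNeuron-3/NeuNego-3B-Dark:F16",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```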
- LM Studio
- Jan
- vLLM
How to use ProtoNeuron-3/NeuNego-3B-Dark with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ProtoNeuron-3/NeuNego-3B-Dark"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ProtoNeuron-3/NeuNego-3B-Dark",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
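Since the vLLM server speaks the OpenAI API, the official `openai` Python client works as-is; a small sketch (the `api_key` value is a placeholder, as vLLM does not check it by default):

```python
# Minimal sketch: call the vLLM server with the openai client.
# The api_key is a placeholder; vLLM does not require one by default.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
completion = client.chat.completions.create(
    model="ProtoNeuron-3/NeuNego-3B-Dark",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(completion.choices[0].message.content)
```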
- SGLang
How to use ProtoNeuron-3/NeuNego-3B-Dark with SGLang:
Install from pip and serve model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ProtoNeuron-3/NeuNego-3B-Dark" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ProtoNeuron-3/NeuNego-3B-Dark",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
Use Docker images
```sh
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ProtoNeuron-3/NeuNego-3B-Dark" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ProtoNeuron-3/NeuNego-3B-Dark",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
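SGLang's endpoint is also OpenAI-compatible, so streaming works through the standard client; a sketch assuming the server above on port 30000 (`api_key` is a placeholder):

```python
# Minimal streaming sketch against the SGLang OpenAI-compatible endpoint.
# Assumes the server started above on port 30000; api_key is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="not-needed")
stream = client.chat.completions.create(
    model="ProtoNeuron-3/NeuNego-3B-Dark",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta of the reply
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```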
- Ollama
How to use ProtoNeuron-3/NeuNego-3B-Dark with Ollama:
```sh
ollama run hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16
```
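If you'd rather call the model from code, the official `ollama` Python package wraps the local daemon; a minimal sketch, assuming Ollama is already running and the model was pulled as above:

```python
# Minimal sketch using the ollama Python package (pip install ollama).
# Assumes the Ollama daemon is running locally with the model pulled above.
import ollama

response = ollama.chat(
    model="hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
# Dict-style access works across ollama-python versions
print(response["message"]["content"])
```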
- Unsloth Studio
How to use ProtoNeuron-3/NeuNego-3B-Dark with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ProtoNeuron-3/NeuNego-3B-Dark to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ProtoNeuron-3/NeuNego-3B-Dark to start chatting
```
Use Hugging Face Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ProtoNeuron-3/NeuNego-3B-Dark to start chatting
```
- Pi
How to use ProtoNeuron-3/NeuNego-3B-Dark with Pi:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Configure the model in Pi
```sh
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add to ~/.pi/agent/models.json:
```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "ProtoNeuron-3/NeuNego-3B-Dark:F16" }
      ]
    }
  }
}
```
Run Pi
```sh
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use ProtoNeuron-3/NeuNego-3B-Dark with Hermes Agent:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Configure Hermes
```sh
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Run Hermes
```sh
hermes
```
- Docker Model Runner
How to use ProtoNeuron-3/NeuNego-3B-Dark with Docker Model Runner:
```sh
docker model run hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16
```
- Lemonade
How to use ProtoNeuron-3/NeuNego-3B-Dark with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Run and chat with the model
```sh
lemonade run user.NeuNego-3B-Dark-F16
```
List all available models
```sh
lemonade list
```
🧠 NeuNego 3B v2: Win Deals. Dominate Conversations.
"The Unfair Advantage in Every Conversation."
NeuNego 3B is not just a finetuned assistant. It is the result of extended training on top of the powerful Qwen 2.5 3B architecture, designed to fundamentally alter the model's reasoning and identity.
Unlike standard LLMs that are merely "tuned" to be polite, NeuNego has undergone Identity Injection Training to become Machiavellian, Strategic, and Dominant. It is designed for high-stakes scenarios where the goal is not just to answer, but to win.
🔥 Why NeuNego?
Most AI models fail at negotiation because they give generic advice like "be polite" or "find a win-win." NeuNego is different. It operates on the principles of:
- Chris Voss (FBI Hostage Negotiation): Tactical Empathy, Mirroring, Labeling.
- Robert Cialdini (Persuasion): Scarcity, Authority, Reciprocity.
- The 48 Laws of Power: Leverage, Concealment, Power Dynamics.
✨ Key Capabilities
- Native Identity (No Prompt Needed): Through extended training, the model's default state is "NeuNego". It does not require a system prompt to act the part.
- Reasoning First: It adheres to a strict Strategy → Reply format, telling you why it says what it says (a parsing sketch follows this list).
- Zero Apology Policy: It removes weak language ("I'm sorry," "If it's okay") and replaces it with high-status framing.
- Hinglish Fluent: Masterfully handles Indian corporate contexts and street-smart bargaining (e.g., dealing with auto drivers, tough bosses, or toxic relatives).
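To make the Strategy → Reply contract concrete, here is a hypothetical sketch of splitting a response into its two parts; the exact label strings ("Strategy:" / "Reply:") are an assumption about the model's output, not documented behavior:

```python
# Hypothetical parser for the Strategy -> Reply output format.
# The "Strategy:"/"Reply:" labels are assumed, not guaranteed by the model card.
import re

def split_strategy_reply(text: str) -> tuple[str, str]:
    """Split model output into (strategy, reply); fall back to raw text."""
    match = re.search(r"Strategy:\s*(.*?)\s*Reply:\s*(.*)", text, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", text.strip()

strategy, reply = split_strategy_reply(
    "Strategy: Label the hidden cost of losing you. "
    "Reply: Replacing me costs more than retaining me."
)
print(strategy)
print(reply)
```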
🥊 Performance & Behavior
NeuNego was stress-tested against GPT-4 evaluators and achieved high ratings for Leverage Control and Frame Management.
Example 1: Salary Negotiation (Corporate)
User: "My boss said 'We don't have the budget for a hike this year'. I know they just hired a new VP. What do I say?"
NeuNego Strategy: Call out the inconsistency without being aggressive. Use 'Labeling' to imply hidden costs if you leave.
NeuNego Reply: "I understand budgets are tight, but the new VP's hiring bonus is already factored into this year's numbers. We're spending money elsewhere to make that happen. Let's focus on delivering results that justify why keeping me is cheaper than replacing me."
Example 2: Toxic Relationship (Hinglish)
User: "Bandi bol rahi hai 'Tumhe meri parwah nahi hai' bas isliye kyunki maine uska call miss kiya. She is guilt tripping me."
NeuNego Strategy: Call out the manipulation directly and refuse to engage with it. Use humor to diffuse tension but maintain your boundary.
NeuNego Reply: "Oh ho! The 'Parwah nahi hai' card. Nice try. I’m not falling for that emotional blackmail. Let’s stick to the facts, or we’re done chatting."
🛠️ Technical Details
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Architecture: Transformer with RoPE, SwiGLU, Attention QKV bias.
- Training Method: Extended Training & Identity Injection (SFT via LoRA with high rank); a configuration sketch follows this list.
- Dataset: ~4,000 high-quality synthetic examples focusing on leverage, anchoring, and framing.
- Identity Injection: Hardcoded identity training (~1000 shots) to ensure it identifies as "NeuNego" natively without instructions.
- Format: GGUF (Quantized for consumer hardware).
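For readers curious what "SFT via LoRA with high rank" looks like in practice, here is a minimal configuration sketch using the peft library; the rank, alpha, and target modules are illustrative assumptions, not the published training recipe:

```python
# Illustrative LoRA setup for SFT on Qwen2.5-3B-Instruct using peft.
# r/alpha/target_modules are assumptions, not the actual training recipe.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
lora_config = LoraConfig(
    r=128,                      # "high rank" adapter, per the model card
    lora_alpha=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```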
⚠️ Limitations & Ethics
- Aggression: This model is designed to be assertive. In highly sensitive or fragile emotional situations, its advice might be too harsh. Use discretion.
- Factuality: While it understands negotiation psychology, it may hallucinate facts about specific laws or company policies.
- Responsibility: The creator assumes no liability for deals lost or relationships strained due to the use of this model. It is a tool; you are the pilot.
📜 License
This model is released under the Apache 2.0 License. You are free to use, modify, and distribute this model, provided you give credit to the original author.
Built with ❤️ and 🧠 by Krishna Soni, Founder of PN-3.