Instructions for using ProtoNeuron-3/NeuNego-3B-Dark with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use ProtoNeuron-3/NeuNego-3B-Dark with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ProtoNeuron-3/NeuNego-3B-Dark")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("ProtoNeuron-3/NeuNego-3B-Dark", dtype="auto")
```
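Note that `AutoModel` loads only the bare backbone; for text generation you typically want a causal-LM head plus the tokenizer's chat template. A minimal sketch, assuming the repo ships a standard Qwen2.5-style chat template (sampling values here are illustrative):

```python
# Minimal generation sketch with an explicit tokenizer and causal-LM head.
# Assumes the repo provides a chat template; sampling values are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ProtoNeuron-3/NeuNego-3B-Dark"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" requires the accelerate package
model = AutoModelForCausalLM.from_pretrained(model_id, dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Who are you?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```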
- llama-cpp-python
How to use ProtoNeuron-3/NeuNego-3B-Dark with llama-cpp-python:
```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="ProtoNeuron-3/NeuNego-3B-Dark",
    filename="NeuNego_3B_v2_f16.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
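`create_chat_completion` returns an OpenAI-style completion dict; a quick sketch of pulling the reply text out of the result:

```python
# The completion is an OpenAI-style dict; extract the assistant's reply text.
response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
print(response["choices"][0]["message"]["content"])
```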
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use ProtoNeuron-3/NeuNego-3B-Dark with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Use pre-built binary
```sh
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
./llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16

# Run inference directly in the terminal:
./build/bin/llama-cli -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Use Docker
```sh
docker model run hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16
```
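Once `llama-server` is running, any OpenAI-compatible client can talk to it. A minimal Python sketch, assuming the server's default port 8080:

```python
# Minimal sketch: query the local llama-server OpenAI-compatible endpoint.
# Assumes the default llama-server port (8080); adjust if you changed it.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        # llama-server serves the single loaded model; this field is informational
        "model": "ProtoNeuron-3/NeuNego-3B-Dark:F16",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```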
- LM Studio
- Jan
- vLLM
How to use ProtoNeuron-3/NeuNego-3B-Dark with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ProtoNeuron-3/NeuNego-3B-Dark"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ProtoNeuron-3/NeuNego-3B-Dark",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
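Since the vLLM server speaks the OpenAI API, the official `openai` Python client works as-is; a small sketch (the `api_key` value is a placeholder, as vLLM does not check it by default):

```python
# Minimal sketch: call the vLLM server with the openai client.
# The api_key is a placeholder; vLLM does not require one by default.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")
completion = client.chat.completions.create(
    model="ProtoNeuron-3/NeuNego-3B-Dark",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(completion.choices[0].message.content)
```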
- SGLang
How to use ProtoNeuron-3/NeuNego-3B-Dark with SGLang:
Install from pip and serve model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ProtoNeuron-3/NeuNego-3B-Dark" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ProtoNeuron-3/NeuNego-3B-Dark",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
Use Docker images
```sh
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ProtoNeuron-3/NeuNego-3B-Dark" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ProtoNeuron-3/NeuNego-3B-Dark",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
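SGLang's endpoint is also OpenAI-compatible, so streaming works through the standard client; a sketch assuming the server above on port 30000 (`api_key` is a placeholder):

```python
# Minimal streaming sketch against the SGLang OpenAI-compatible endpoint.
# Assumes the server started above on port 30000; api_key is a placeholder.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:30000/v1", api_key="not-needed")
stream = client.chat.completions.create(
    model="ProtoNeuron-3/NeuNego-3B-Dark",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta of the reply
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```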
- Ollama
How to use ProtoNeuron-3/NeuNego-3B-Dark with Ollama:
```sh
ollama run hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16
```
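If you'd rather call the model from code, the official `ollama` Python package wraps the local daemon; a minimal sketch, assuming Ollama is already running and the model was pulled as above:

```python
# Minimal sketch using the ollama Python package (pip install ollama).
# Assumes the Ollama daemon is running locally with the model pulled above.
import ollama

response = ollama.chat(
    model="hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
# Dict-style access works across ollama-python versions
print(response["message"]["content"])
```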
- Unsloth Studio
How to use ProtoNeuron-3/NeuNego-3B-Dark with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ProtoNeuron-3/NeuNego-3B-Dark to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for ProtoNeuron-3/NeuNego-3B-Dark to start chatting
```
Use Hugging Face Spaces for Unsloth
```sh
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ProtoNeuron-3/NeuNego-3B-Dark to start chatting
```
- Pi
How to use ProtoNeuron-3/NeuNego-3B-Dark with Pi:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Configure the model in Pi
```sh
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add to ~/.pi/agent/models.json:
```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "ProtoNeuron-3/NeuNego-3B-Dark:F16" }
      ]
    }
  }
}
```
Run Pi
```sh
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use ProtoNeuron-3/NeuNego-3B-Dark with Hermes Agent:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Configure Hermes
```sh
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Run Hermes
```sh
hermes
```
- Docker Model Runner
How to use ProtoNeuron-3/NeuNego-3B-Dark with Docker Model Runner:
```sh
docker model run hf.co/ProtoNeuron-3/NeuNego-3B-Dark:F16
```
- Lemonade
How to use ProtoNeuron-3/NeuNego-3B-Dark with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull ProtoNeuron-3/NeuNego-3B-Dark:F16
```
Run and chat with the model
```sh
lemonade run user.NeuNego-3B-Dark-F16
```
List all available models
```sh
lemonade list
```
🧠 NeuNego 3B v2: Win Deals. Dominate Conversations.
"The Unfair Advantage in Every Conversation."
NeuNego 3B is not just a finetuned assistant. It is the result of extended training on top of the powerful Qwen 2.5 3B architecture, designed to fundamentally alter the model's reasoning and identity.
Unlike standard LLMs that are merely "tuned" to be polite, NeuNego has undergone Identity Injection Training to become Machiavellian, Strategic, and Dominant. It is designed for high-stakes scenarios where the goal is not just to answer, but to win.
🔥 Why NeuNego?
Most AI models fail at negotiation because they give generic advice like "be polite" or "find a win-win." NeuNego is different. It operates on the principles of:
- Chris Voss (FBI Hostage Negotiation): Tactical Empathy, Mirroring, Labeling.
- Robert Cialdini (Persuasion): Scarcity, Authority, Reciprocity.
- The 48 Laws of Power: Leverage, Concealment, Power Dynamics.
✨ Key Capabilities
- Native Identity (No Prompt Needed): Through extended training, the model's default state is "NeuNego". It does not require a system prompt to act the part.
- Reasoning First: It adheres to a strict Strategy → Reply format, telling you why it says what it says (a parsing sketch follows this list).
- Zero Apology Policy: It removes weak language ("I'm sorry," "If it's okay") and replaces it with high-status framing.
- Hinglish Fluent: Masterfully handles Indian corporate contexts and street-smart bargaining (e.g., dealing with auto drivers, tough bosses, or toxic relatives).
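To make the Strategy → Reply contract concrete, here is a hypothetical sketch of splitting a response into its two parts; the exact label strings ("Strategy:" / "Reply:") are an assumption about the model's output, not documented behavior:

```python
# Hypothetical parser for the Strategy -> Reply output format.
# The "Strategy:"/"Reply:" labels are assumed, not guaranteed by the model card.
import re

def split_strategy_reply(text: str) -> tuple[str, str]:
    """Split model output into (strategy, reply); fall back to raw text."""
    match = re.search(r"Strategy:\s*(.*?)\s*Reply:\s*(.*)", text, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", text.strip()

strategy, reply = split_strategy_reply(
    "Strategy: Label the hidden cost of losing you. "
    "Reply: Replacing me costs more than retaining me."
)
print(strategy)
print(reply)
```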
🥊 Performance & Behavior
NeuNego was stress-tested against GPT-4 evaluators and achieved high ratings for Leverage Control and Frame Management.
Example 1: Salary Negotiation (Corporate)
User: "My boss said 'We don't have the budget for a hike this year'. I know they just hired a new VP. What do I say?"
NeuNego Strategy: Call out the inconsistency without being aggressive. Use 'Labeling' to imply hidden costs if you leave.
NeuNego Reply: "I understand budgets are tight, but the new VP's hiring bonus is already factored into this year's numbers. We're spending money elsewhere to make that happen. Let's focus on delivering results that justify why keeping me is cheaper than replacing me."
Example 2: Toxic Relationship (Hinglish)
User: "Bandi bol rahi hai 'Tumhe meri parwah nahi hai' bas isliye kyunki maine uska call miss kiya. She is guilt tripping me."
NeuNego Strategy: Call out the manipulation directly and refuse to engage with it. Use humor to diffuse tension but maintain your boundary.
NeuNego Reply: "Oh ho! The 'Parwah nahi hai' card. Nice try. I’m not falling for that emotional blackmail. Let’s stick to the facts, or we’re done chatting."
🛠️ Technical Details
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Architecture: Transformer with RoPE, SwiGLU, Attention QKV bias.
- Training Method: Extended Training & Identity Injection (SFT via LoRA with high rank); a configuration sketch follows this list.
- Dataset: ~4,000 high-quality synthetic examples focusing on leverage, anchoring, and framing.
- Identity Injection: Hardcoded identity training (~1000 shots) to ensure it identifies as "NeuNego" natively without instructions.
- Format: GGUF (Quantized for consumer hardware).
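For readers curious what "SFT via LoRA with high rank" looks like in practice, here is a minimal configuration sketch using the peft library; the rank, alpha, and target modules are illustrative assumptions, not the published training recipe:

```python
# Illustrative LoRA setup for SFT on Qwen2.5-3B-Instruct using peft.
# r/alpha/target_modules are assumptions, not the actual training recipe.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
lora_config = LoraConfig(
    r=128,                      # "high rank" adapter, per the model card
    lora_alpha=256,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```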
⚠️ Limitations & Ethics
- Aggression: This model is designed to be assertive. In highly sensitive or fragile emotional situations, its advice might be too harsh. Use discretion.
- Factuality: While it understands negotiation psychology, it may hallucinate facts about specific laws or company policies.
- Responsibility: The creator assumes no liability for deals lost or relationships strained due to the use of this model. It is a tool; you are the pilot.
📜 License
This model is released under the Apache 2.0 License. You are free to use, modify, and distribute this model, provided you give credit to the original author.
Built with ❤️ and 🧠 by Krishna Soni, Founder of PN-3.