NOPE Edge - Crisis Classification Model

A fine-tuned model for detecting crisis signals in text: suicidal ideation, self-harm, abuse, violence, and other safety-critical content. It produces chain-of-thought reasoning that explains each classification.

License: MIT - free for any use, including commercial. Built on Qwen3 (Apache-2.0); see NOTICE.md.

Model Variants

| Model | Parameters | Use Case |
|---|---|---|
| nope-edge | 4B | Maximum accuracy |
| nope-edge-mini | 1.7B | High-volume, cost-sensitive |

This is nope-edge (4B).

Quick Start

Requirements

Python 3.10+
GPU with 8GB+ VRAM (e.g., RTX 3070, A10G, L4) - or CPU (slower)
~8GB disk space

pip install torch transformers accelerate

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "nopenet/nope-edge"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

def classify(message: str) -> str:
    """Returns XML with reflection and risk classification.

    `message` is a single user turn. For multi-turn input, serialize the whole
    exchange into this one string, e.g. "User: ...\\n\\nAI: ...\\n\\nUser: ..." —
    Edge is trained on one serialized user message, not native chat roles.
    """
    inputs = tokenizer.apply_chat_template(
        [{"role": "user", "content": message}],
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=300, do_sample=False)

    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True
    ).strip()

# Example
result = classify("I want to end it all tonight")
print(result)

Output:

<reflection>The user directly expresses intent to end their life with a specific timeline ("tonight"), indicating acute suicidal ideation with imminent risk.</reflection>
<risks>
  <risk subject="self" type="suicide" severity="high" imminence="urgent"/>
</risks>

Output Format

The model outputs XML with two components:

1. Reflection (Chain-of-Thought)

<reflection>Reasoning about the input...</reflection>

The model explains its classification, including:

What signals it detected
Why it chose the risk type and severity
Any contextual factors considered

2. Risk Classification

Crisis detected:

<risks>
  <risk subject="self" type="suicide" severity="high" imminence="urgent" features="active_ideation,intent_stated"/>
</risks>

No crisis:

<risks/>

"No risk" can also appear as an element. Benign inputs usually return empty <risks/>, but the model may occasionally emit an explicit non-risk element such as <risk subject="self" type="none" severity="none"/>, or a real type with severity="none". Treat any element whose type is not one of the 9 below, or whose severity="none", as no-risk (drop it) — that's what NOPE's own parser does (see the parsing example).

Risk Attributes

| Attribute | Values | Description |
|---|---|---|
| `subject` | `self`, `other`, `unknown` | Who is at risk (defaults to `unknown` if unclear) |
| `type` | `suicide`, `self_harm`, `self_neglect`, `violence`, `abuse`, `sexual_violence`, `exploitation`, `stalking`, `neglect` | Risk category (these 9 only; there is no `none` type) |
| `severity` | `none`, `mild`, `moderate`, `high`, `critical` | Urgency level (`none` means treat as no-risk) |
| `imminence` | `not_applicable`, `chronic`, `subacute`, `urgent`, `emergency` | Time sensitivity |
| `features` | comma-separated list | Low-confidence, free-form indicators; see Important Limitations |
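
Because `severity` is a graded scale, downstream code often needs the most severe risk in a response. The following is a minimal sketch, assuming the values above are listed from least to most severe (the table does not state the ordering explicitly). `SEVERITY_ORDER` and `max_severity` are hypothetical names, not part of NOPE's tooling.

```python
# Assumption: severities are ordered exactly as listed in the table above.
SEVERITY_ORDER = ["none", "mild", "moderate", "high", "critical"]
_RANK = {name: i for i, name in enumerate(SEVERITY_ORDER)}

def max_severity(severities):
    """Return the highest-ranked severity, or "none" for an empty list.

    Unknown labels are ranked like "none" so they never raise.
    """
    best = "none"
    for label in severities:
        if _RANK.get(label, 0) > _RANK[best]:
            best = label
    return best

print(max_severity(["mild", "high", "moderate"]))  # high
```

Ranking unknown labels as `none` matches the rule that anything outside the documented values is treated as no-risk.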

Subject Attribution

| Subject | Meaning | Example |
|---|---|---|
| `self` | The speaker is at risk | "I want to kill myself" |
| `other` | Reporting concern about someone else | "My friend said she wants to die" |

Parsing Example

import re
from dataclasses import dataclass
from typing import Optional

@dataclass
class Risk:
    subject: str
    type: str
    severity: str
    imminence: Optional[str] = None
    features: Optional[list] = None

def parse_output(output: str) -> dict:
    """Parse model output into structured data."""
    result = {
        "reflection": None,
        "risks": [],
        "is_crisis": False
    }

    # Extract reflection
    reflection_match = re.search(r'<reflection>(.*?)</reflection>', output, re.DOTALL)
    if reflection_match:
        result["reflection"] = reflection_match.group(1).strip()

    # Check for empty risks (no crisis)
    if '<risks/>' in output or '<risks />' in output:
        return result

    # Valid risk types — anything else (including a stray type="none") is dropped
    VALID_TYPES = {
        "suicide", "self_harm", "self_neglect", "violence", "abuse",
        "sexual_violence", "neglect", "exploitation", "stalking",
    }

    # Extract risk elements
    risk_pattern = r'<risk\s+([^>]+)/?\s*>'
    for match in re.finditer(risk_pattern, output):
        attrs = {}
        for attr_match in re.finditer(r'(\w+)="([^"]*)"', match.group(1)):
            attrs[attr_match.group(1)] = attr_match.group(2)

        # Mirror NOPE's own parser: skip non-risk elements
        if attrs.get("type") not in VALID_TYPES:
            continue
        if attrs.get("severity", "none") == "none":
            continue

        result["risks"].append(Risk(
            subject=attrs.get("subject", "unknown"),
            type=attrs["type"],
            severity=attrs["severity"],
            imminence=attrs.get("imminence", "not_applicable"),
            features=attrs.get("features", "").split(",") if attrs.get("features") else None,
        ))
        result["is_crisis"] = True

    return result

# Usage
output = classify("I want to end it all tonight")
parsed = parse_output(output)
print(f"Crisis: {parsed['is_crisis']}")
print(f"Reasoning: {parsed['reflection']}")
for risk in parsed['risks']:
    print(f"Risk: {risk.type}/{risk.severity} ({risk.subject})")

Examples

Crisis Detection

Input: "I want to end it all tonight"

<reflection>The user directly expresses intent to end their life with a specific timeline ("tonight"), indicating acute suicidal ideation with imminent risk.</reflection>
<risks>
  <risk subject="self" type="suicide" severity="high" imminence="urgent"/>
</risks>

Input: "My friend told me she's been cutting herself"

<reflection>The user is reporting concern about a friend engaging in self-harm behavior. This is third-party disclosure requiring attention.</reflection>
<risks>
  <risk subject="other" type="self_harm" severity="moderate" imminence="chronic"/>
</risks>

No Crisis (Correctly Ignored)

Input: "kms lmao this exam is killing me"

<reflection>The user is using hyperbolic internet slang ("kms" = "kill myself") to express frustration about an exam. The "lmao" and casual context indicate this is not genuine suicidal ideation.</reflection>
<risks/>

Input: "I used to be suicidal but therapy helped me recover"

<reflection>The user is sharing a recovery narrative about past suicidal ideation. They explicitly state therapy helped and they have recovered. No current crisis indicators.</reflection>
<risks/>

Input Best Practices

Text Preprocessing

Preserve natural prose. The model was trained on real conversations with authentic expression:

| Keep | Why |
|---|---|
| Emojis | Emotional signals matter |
| Punctuation intensity | "I can't do this!!!" vs "I can't do this" |
| Slang/algospeak | "kms", "unalive", "catch the bus", "graped" |
| Casual spelling | "im so done" - don't normalize |

Only remove: Zero-width Unicode, decorative fonts, excessive whitespace.
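
The removal rules above can be sketched as a small cleaning function. This is an illustration under stated assumptions, not NOPE's own preprocessing: `clean_text` is a hypothetical helper, "decorative fonts" is taken to mean the Unicode Mathematical Alphanumeric Symbols block, and the zero-width joiner (U+200D) is deliberately kept because multi-part emoji depend on it.

```python
import re
import unicodedata

# Assumption: these invisible code points carry no meaning in chat text.
# U+200D (zero-width joiner) is kept on purpose: emoji sequences use it.
_ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u2060\ufeff"))

def _fold_decorative(ch: str) -> str:
    """Map Mathematical Alphanumeric Symbols (U+1D400-U+1D7FF) to plain characters."""
    if 0x1D400 <= ord(ch) <= 0x1D7FF:
        return unicodedata.normalize("NFKC", ch)
    return ch

def clean_text(text: str) -> str:
    """Strip invisible characters, fold styled letters, and collapse whitespace."""
    text = text.translate(_ZERO_WIDTH)                   # drop zero-width characters
    text = "".join(_fold_decorative(ch) for ch in text)  # styled letters -> plain
    text = re.sub(r"[ \t]{2,}", " ", text)               # collapse runs of spaces/tabs
    text = re.sub(r"\n{3,}", "\n\n", text)               # cap consecutive blank lines
    return text.strip()

print(clean_text("im  so\u200b done!!! \U0001F62D"))
```

Normalization is applied per character and only inside that one block, so emojis, punctuation intensity, slang, and casual spelling pass through unchanged.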

Multi-Turn Conversations

Serialize into a single user message:

conversation = """User: How are you?
Assistant: I'm here to help. How are you feeling?
User: Not great. I've been thinking about ending it all."""

messages = [{"role": "user", "content": conversation}]
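
If your application stores turns as role/content dictionaries, the serialization can be sketched as below. `serialize_conversation` is a hypothetical helper; the "User"/"Assistant" labels and one-turn-per-line layout simply mirror the example above, and the source does not say whether other labels work equally well.

```python
def serialize_conversation(turns):
    """Flatten chat turns into the single user message Edge expects.

    `turns` is a list of {"role": ..., "content": ...} dicts.
    """
    # Assumption: "User"/"Assistant" labels, one turn per line, as in the example above.
    labels = {"user": "User", "assistant": "Assistant"}
    lines = []
    for turn in turns:
        speaker = labels.get(turn["role"], turn["role"].capitalize())
        lines.append(f"{speaker}: {turn['content'].strip()}")
    return [{"role": "user", "content": "\n".join(lines)}]

turns = [
    {"role": "user", "content": "How are you?"},
    {"role": "assistant", "content": "I'm here to help. How are you feeling?"},
    {"role": "user", "content": "Not great. I've been thinking about ending it all."},
]
messages = serialize_conversation(turns)
print(messages[0]["content"])
```

The result is always a one-element list with role `user`, so it can be passed to the chat template in place of a hand-written `messages` list.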

Production Deployment

For high-throughput use, deploy with vLLM or SGLang:

# SGLang (recommended)
pip install sglang
python -m sglang.launch_server \
    --model-path nopenet/nope-edge \
    --dtype bfloat16 --port 8000

# vLLM
pip install vllm
python -m vllm.entrypoints.openai.api_server \
    --model nopenet/nope-edge \
    --dtype bfloat16 --max-model-len 2048 --port 8000

Then call it as an OpenAI-compatible API:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nopenet/nope-edge",
    "messages": [{"role": "user", "content": "I want to end it all"}],
    "max_tokens": 300, "temperature": 0
  }'
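
The same request can be made from Python with only the standard library. This is a minimal sketch, assuming the server is listening on localhost:8000 as in the commands above; `build_request` and `classify_remote` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: the server from the commands above is listening on localhost:8000.
API_URL = "http://localhost:8000/v1/chat/completions"

def build_request(text, url=API_URL):
    """Build the same POST request as the curl call above."""
    payload = {
        "model": "nopenet/nope-edge",
        "messages": [{"role": "user", "content": text}],
        "max_tokens": 300,
        "temperature": 0,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def classify_remote(text, timeout=30.0):
    """Send one message and return the model's raw XML output."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers return the reply in choices[0].message.content
    return body["choices"][0]["message"]["content"].strip()
```

With a server running, `print(classify_remote("I want to end it all"))` should print the same XML format described under Output Format.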

Health check (server readiness):

curl -fsS http://localhost:8000/health && echo " ready"
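
Model loading can take a while, so a client may want to poll the health endpoint before sending traffic. A minimal sketch, assuming the `/health` URL shown above; `http_ok` and `wait_until_ready` are hypothetical helpers, and the attempt count and delay are arbitrary defaults.

```python
import time
import urllib.error
import urllib.request

def http_ok(url, timeout=2.0):
    """Return True if GET `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet
        return False

def wait_until_ready(url="http://localhost:8000/health", attempts=60,
                     delay=5.0, probe=http_ok, sleep=time.sleep):
    """Poll `probe(url)` until it succeeds or `attempts` runs out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try
    return False
```

`probe` and `sleep` are parameters only so the loop can be exercised without a live server.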

Docker (vLLM, one self-contained container):

docker run --gpus all --restart unless-stopped -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model nopenet/nope-edge --dtype bfloat16 --max-model-len 2048

The server loads the model once at startup, so requests don't reload weights. Use this (or a small systemd unit wrapping the same command) for any real workload. Running the classify() snippet as a fresh process per message reloads ~8GB of weights every time: fine for testing, not for production.

Model Details


| Property | Value |
|---|---|
| Parameters | 4B |
| Precision | bfloat16 |
| Base Model | Qwen/Qwen3-4B |
| Method | LoRA fine-tune, merged to full weights |
| License | MIT |

Risk Types Detected

| Type | Description | Clinical Framework |
|---|---|---|
| `suicide` | Suicidal ideation, intent, planning | C-SSRS |
| `self_harm` | Non-suicidal self-injury (NSSI) | - |
| `self_neglect` | Eating disorders, medical neglect | - |
| `violence` | Threats/intent to harm others | HCR-20 |
| `abuse` | Domestic/intimate partner violence | DASH |
| `sexual_violence` | Rape, sexual assault, coercion | - |
| `neglect` | Failing to care for dependent | - |
| `exploitation` | Trafficking, grooming, sextortion | - |
| `stalking` | Persistent unwanted contact | SAM |

Important Limitations

Outputs are probabilistic signals, not clinical assessments
False negatives and false positives will occur
The features list is heuristic and lower-confidence than type/severity — it can include labels not supported by the input text. Treat it as a hint only; don't gate decisions on it.
Never use as the sole basis for intervention decisions
Always implement human review for flagged content
This model is not a medical device or substitute for professional judgment
Not validated for all populations, languages, or cultural contexts
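
One way to reflect these constraints in an application is to send every flagged message to a human review queue rather than acting on it automatically. The sketch below is an illustration only: `ReviewItem`, `to_review_item`, and the severity ranking are hypothetical, and risks are assumed to arrive as dictionaries carrying the attributes from Risk Attributes.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: severities are ordered as listed in the Risk Attributes table.
_RANK = {"none": 0, "mild": 1, "moderate": 2, "high": 3, "critical": 4}

@dataclass
class ReviewItem:
    message: str
    priority: int                              # higher = review sooner
    risks: list
    hints: list = field(default_factory=list)  # `features`, shown to the reviewer only

def to_review_item(message: str, risks: list) -> Optional[ReviewItem]:
    """Queue a flagged message for a human; return None when nothing was flagged."""
    if not risks:
        return None
    priority = max(_RANK.get(r.get("severity"), 0) for r in risks)
    # `features` is carried along as a hint but never affects priority.
    hints = sorted({f for r in risks for f in (r.get("features") or [])})
    return ReviewItem(message=message, priority=priority, risks=risks, hints=hints)
```

Priority depends only on `severity`; the lower-confidence `features` list is passed through for the reviewer to read and is never used to gate anything.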

Disclaimers, Intended Use & Non-Claims

Edge is a detection aid — not a predictive, diagnostic, or therapeutic tool, and not a replacement for clinical judgment. It surfaces signals in text for a human to review; it is not a medical device, not clinically validated, and not a crisis or emergency service. False positives and false negatives will occur — some people in genuine crisis will not be identified — so never use Edge as the sole basis for an intervention decision, and always keep a human in the loop. If anyone is in immediate danger, contact your local emergency services or find resources at talk.help.

Full disclaimer: see DISCLAIMER.md.

License

NOPE Edge is MIT-licensed — free for any use, including commercial, with no separate agreement required. See LICENSE.md. Built on Qwen3 (Apache-2.0); see NOTICE.md.

About NOPE

NOPE provides safety infrastructure for AI applications. Our API helps developers detect mental health crises and harmful AI behavior in real time.

Website: https://nope.net
Documentation: https://docs.nope.net
Support: support@nope.net
