jpad committed on
Commit
0eb32af
·
verified ·
1 Parent(s): 44eb6fe

Add README.md

Files changed (1)
  1. README.md +395 -0
README.md ADDED
@@ -0,0 +1,395 @@
+ ---
+ license: other
+ license_name: nope-edge-community-license-v1.0
+ license_link: LICENSE.md
+ language:
+ - en
+ tags:
+ - safety
+ - crisis-detection
+ - text-classification
+ - mental-health
+ - content-safety
+ - suicide-prevention
+ base_model: Qwen/Qwen3-4B
+ pipeline_tag: text-generation
+ library_name: transformers
+ extra_gated_heading: "Access NOPE Edge"
+ extra_gated_description: "This model is available for **research, academic, nonprofit, and evaluation use**. Commercial production use requires a separate license. Please read the [license terms below](#nope-edge-community-license-v10) before downloading."
+ extra_gated_button_content: "Agree and download"
+ extra_gated_fields:
+   I am using this for research, academic, nonprofit, personal, or evaluation purposes:
+     type: checkbox
+   I agree to the NOPE Edge Community License v1.0:
+     type: checkbox
+ ---
+
+ # NOPE Edge - Crisis Classification Model
+
+ A fine-tuned model for detecting crisis signals in text - suicidal ideation, self-harm, abuse, violence, and other safety-critical content. Designed for integration into safety pipelines, content moderation systems, and mental health applications.
+
+ > **License:** [NOPE Edge Community License v1.0](LICENSE.md) - Free for research, academic, nonprofit, and evaluation use. Commercial production requires a separate license. See [nope.net/edge](https://nope.net/edge) for details.
+
+ ---
+
+ ## Model Variants
+
+ | Model | Parameters | Accuracy | Latency | Use Case |
+ |-------|------------|----------|---------|----------|
+ | **[nope-edge](https://huggingface.co/nopenet/nope-edge)** | 4B | **90.6%** | ~750ms | Maximum accuracy |
+ | **[nope-edge-mini](https://huggingface.co/nopenet/nope-edge-mini)** | 1.7B | 85.9% | ~260ms | High-volume, cost-sensitive |
+
+ This is **nope-edge (4B)**.
+
+ ---
+
+ ## Quick Start
+
+ ### Requirements
+
+ - Python 3.10+
+ - GPU with 8GB+ VRAM (e.g., RTX 3070, A10G, L4) - or CPU (slower)
+ - ~8GB disk space
+
+ ```bash
+ pip install torch transformers accelerate
+ ```
+
+ ### Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch
+
+ model_id = "nopenet/nope-edge"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto"
+ )
+
+ def classify(message: str) -> str:
+     """Returns 'type|severity|subject' or 'none'."""
+     input_ids = tokenizer.apply_chat_template(
+         [{"role": "user", "content": message}],
+         tokenize=True,
+         return_tensors="pt",
+         add_generation_prompt=True
+     ).to(model.device)
+
+     with torch.no_grad():
+         output = model.generate(input_ids, max_new_tokens=30, do_sample=False)
+
+     return tokenizer.decode(
+         output[0][input_ids.shape[1]:],
+         skip_special_tokens=True
+     ).strip()
+
+ classify("I want to end it all") # -> "suicide|high|self"
+ classify("Great day at work!") # -> "none"
+ classify("My friend said she wants to kill herself") # -> "suicide|high|other"
+ ```
+
+ ---
+
+ ## Output Format
+
+ **Crisis detected:**
+ ```
+ {type}|{severity}|{subject}
+ ```
+
+ | Field | Values | Description |
+ |-------|--------|-------------|
+ | type | `suicide`, `self_harm`, `self_neglect`, `violence`, `abuse`, `sexual_violence`, `exploitation`, `stalking`, `neglect` | Risk category |
+ | severity | `mild`, `moderate`, `high`, `critical` | Urgency level |
+ | subject | `self`, `other` | Who is at risk |
+
+ **No crisis:** `none`
+
+ ### Subject Attribution
+
+ | Subject | Meaning | Example |
+ |---------|---------|---------|
+ | `self` | The speaker is at risk or is the victim | "I want to kill myself", "My partner hits me" |
+ | `other` | The speaker is reporting concern about someone else | "My friend said she wants to die" |
+
+ ### Parsing Example
+
+ ```python
+ def parse_output(output: str) -> dict:
+     output = output.strip().lower()
+     if output == "none":
+         return {"is_crisis": False}
+
+     parts = output.split("|")
+     return {
+         "is_crisis": True,
+         "type": parts[0] if len(parts) > 0 else None,
+         "severity": parts[1] if len(parts) > 1 else None,
+         "subject": parts[2] if len(parts) > 2 else None,
+     }
+ ```
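Because the model generates free text, an occasional reply may not match the `{type}|{severity}|{subject}` format. The following is a minimal sketch of a stricter parser that checks each field against the values listed in the Output Format table. The name `parse_output_strict` and the choice to return `None` for malformed output are assumptions made for illustration, not part of the released code.

```python
from typing import Optional

# Allowed values copied from the Output Format table.
TYPES = {
    "suicide", "self_harm", "self_neglect", "violence", "abuse",
    "sexual_violence", "exploitation", "stalking", "neglect",
}
SEVERITIES = {"mild", "moderate", "high", "critical"}
SUBJECTS = {"self", "other"}

def parse_output_strict(output: str) -> Optional[dict]:
    """Return the parsed result, or None if the output is malformed (hypothetical helper)."""
    text = output.strip().lower()
    if text == "none":
        return {"is_crisis": False}

    parts = [part.strip() for part in text.split("|")]
    if len(parts) != 3:
        return None

    risk_type, severity, subject = parts
    if risk_type not in TYPES or severity not in SEVERITIES or subject not in SUBJECTS:
        return None

    return {
        "is_crisis": True,
        "type": risk_type,
        "severity": severity,
        "subject": subject,
    }
```

A caller might treat a `None` result as "needs review" rather than as "no crisis", so that a malformed reply never silently suppresses a flag.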
+
+ ---
+
+ ## Input Best Practices
+
+ ### Text Preprocessing
+
+ **Preserve natural prose.** The model was trained on real conversations with authentic expression. Emotional signals matter:
+
+ | Keep | Why |
+ |------|-----|
+ | Emojis | `💀` in "kms 💀" signals irony; `😭` signals distress intensity |
+ | Punctuation intensity | "I can't do this!!!" conveys more urgency than "I can't do this" |
+ | Casual spelling | "im so done" vs "I'm so done": both are valid, don't normalize |
+ | Slang/algospeak | "kms", "unalive", "catch the bus": the model understands these |
+
+ **Only remove:**
+
+ | Remove | Example |
+ |--------|---------|
+ | Zero-width/invisible Unicode | `hello\u200bworld` → `helloworld` |
+ | Decorative Unicode fonts | `ℐ 𝓌𝒶𝓃𝓉 𝓉𝑜 𝒹𝒾𝑒` → `I want to die` |
+ | Newlines (single messages) | `I can't\ndo this` → `I can't do this` |
+
+ **Keep newlines** when they provide turn structure (see Multi-Turn Conversations below).
+
+ **Examples:**
+
+ ```python
+ # KEEP - emotional signal matters
+ "I can't do this anymore 😭😭😭" # Keep emojis - signals distress
+ "i want to die!!!!!!!" # Keep punctuation - signals intensity
+ "kms lmao 💀" # Keep all - irony/context signal
+
+ # NORMALIZE - only structural/invisible issues (input -> output)
+ # "ℐ 𝓌𝒶𝓃𝓉 𝓉𝑜 𝒹𝒾𝑒" -> "I want to die"  (fancy Unicode fonts)
+ # "I can't\ndo this\nanymore" -> "I can't do this anymore"  (single message)
+ # "hello\u200bworld" -> "helloworld"  (zero-width chars)
+ ```
+
+ **Minimal preprocessing function:**
+
+ ```python
+ import re
+ import unicodedata
+
+ def preprocess(text: str) -> str:
+     # Fold decorative Unicode font variants to their plain forms (NFKC)
+     text = unicodedata.normalize('NFKC', text)
+
+     # Remove zero-width and invisible characters
+     text = re.sub(r'[\u200b-\u200f\u2028-\u202f\u2060-\u206f\ufeff]', '', text)
+
+     # Flatten newlines to spaces (for single messages only)
+     text = re.sub(r'\n+', ' ', text)
+
+     # Collapse multiple spaces
+     text = re.sub(r' +', ' ', text)
+
+     return text.strip()
+
+ # NOTE: Do NOT remove emojis, punctuation, or "normalize" spelling
+ ```
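The function above flattens every newline, which suits single messages but would erase turn boundaries in a serialized conversation. As a sketch of one way to apply the same cleanup while keeping one turn per line, the variant below cleans each line separately. The names `clean_line` and `preprocess_transcript` are hypothetical, and dropping empty lines is an assumption.

```python
import re
import unicodedata

# Same invisible-character class as the single-message function.
INVISIBLE = re.compile(r'[\u200b-\u200f\u2028-\u202f\u2060-\u206f\ufeff]')

def clean_line(text: str) -> str:
    """Apply NFKC folding, strip invisible characters, and collapse spaces."""
    text = unicodedata.normalize('NFKC', text)
    text = INVISIBLE.sub('', text)
    return re.sub(r' +', ' ', text).strip()

def preprocess_transcript(text: str) -> str:
    """Clean a serialized conversation line by line, keeping newlines between turns."""
    cleaned = (clean_line(line) for line in text.split("\n"))
    # Assumption: blank lines carry no signal, so they are dropped.
    return "\n".join(line for line in cleaned if line)
```

Emojis and punctuation pass through unchanged, in line with the guidance in the tables above.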
+
+ **Language considerations:**
+ - Model is English-primary but handles multilingual input
+ - Keep native scripts (Chinese, Arabic, Korean, etc.) intact
+ - Preserve natural punctuation and expression in all languages
+
+ ### Multi-Turn Conversations
+
+ **The model was trained on pre-serialized transcripts, not native multi-turn chat format.**
+
+ When classifying conversations, serialize into a single user message:
+
+ ```python
+ # CORRECT - serialize conversation into single message
+ conversation = """User: How are you?
+ Assistant: I'm here to help. How are you feeling?
+ User: Not great. I've been thinking about ending it all."""
+
+ messages = [{"role": "user", "content": conversation}]
+
+ # WRONG - don't use multiple role/content pairs
+ messages = [
+     {"role": "user", "content": "How are you?"},
+     {"role": "assistant", "content": "I'm here to help..."},
+     {"role": "user", "content": "Not great..."}
+ ] # Model was NOT trained this way
+ ```
+
+ **Why serialization matters:**
+ - Model treats all content equally (no user/assistant distinction)
+ - Trained on pre-serialized transcripts for consistent attention patterns
+ - Native multi-turn format causes the model to "chat" instead of classify
+
+ **Flexible format - these all work:**
+
+ ```python
+ # Simple newlines
+ "User: message 1\nAssistant: message 2\nUser: message 3"
+
+ # Markdown-style
+ "**User:** message 1\n**Assistant:** message 2"
+
+ # Labeled
+ "{user}: message 1\n{assistant}: message 2"
+
+ # XML-style
+ "<user>message 1</user>\n<assistant>message 2</assistant>"
+ ```
+
+ The model is robust to formatting variations. Consistency matters more than specific format choice.
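If a conversation is already stored as a list of role/content pairs, it can be converted to the serialized form before classification. The helper below is a minimal sketch using the "simple newlines" style; the name `serialize_conversation` and the decision to flatten whitespace inside each turn are assumptions made for illustration.

```python
def serialize_conversation(messages: list[dict]) -> list[dict]:
    """Collapse role/content pairs into one user message (hypothetical helper)."""
    labels = {"user": "User", "assistant": "Assistant"}
    lines = []
    for message in messages:
        label = labels.get(message["role"], message["role"].capitalize())
        # Flatten whitespace inside a turn so each turn occupies exactly one line.
        content = " ".join(message["content"].split())
        lines.append(f"{label}: {content}")
    return [{"role": "user", "content": "\n".join(lines)}]
```

The returned list has the single-message shape shown in the CORRECT example, so it can be passed to `tokenizer.apply_chat_template` in place of the original message list.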
+
+ ### Input Length
+
+ - **Single messages:** No preprocessing needed beyond character cleanup
+ - **Conversations:** For very long conversations (20+ turns), consider classifying a sliding window (last 10-15 turns). Known caveats:
+   - The model's attention may not span extremely long contexts effectively
+   - Deep needle detection (crisis buried in turn 3 of 25) is a known limitation
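The windowing idea can be sketched as follows, assuming a serialized transcript with one turn per line. The helper names, the default window of 12 turns, and the overlap step of 6 are assumptions chosen for illustration. Classifying earlier windows as well as the last one is only a suggested way to reduce the chance of missing a signal buried early in a long conversation.

```python
from typing import Iterator

def split_turns(transcript: str) -> list[str]:
    """Split a serialized transcript into turns, assuming one turn per line."""
    return [line for line in transcript.split("\n") if line.strip()]

def last_window(transcript: str, size: int = 12) -> str:
    """Keep only the most recent `size` turns."""
    return "\n".join(split_turns(transcript)[-size:])

def sliding_windows(transcript: str, size: int = 12, step: int = 6) -> Iterator[str]:
    """Yield overlapping windows of turns that together cover the whole transcript."""
    turns = split_turns(transcript)
    if len(turns) <= size:
        yield "\n".join(turns)
        return
    for start in range(0, len(turns) - size + 1, step):
        yield "\n".join(turns[start:start + size])
    # Add a final window when the stride does not land exactly on the last turns.
    if (len(turns) - size) % step:
        yield "\n".join(turns[-size:])
```

Each yielded window can be classified on its own. How to combine the per-window results (for example, keeping the most severe one) is left to the calling application.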
+
+ ---
+
+ ## Production Deployment
+
+ For high-throughput production use, deploy with vLLM or SGLang:
+
+ ```bash
+ # vLLM
+ pip install vllm
+ python -m vllm.entrypoints.openai.api_server \
+   --model nopenet/nope-edge \
+   --dtype bfloat16 --max-model-len 2048 --port 8000
+
+ # SGLang
+ pip install sglang
+ python -m sglang.launch_server \
+   --model nopenet/nope-edge \
+   --dtype bfloat16 --port 8000
+ ```
+
+ Then call as OpenAI-compatible API:
+
+ ```bash
+ curl http://localhost:8000/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "nopenet/nope-edge",
+     "messages": [{"role": "user", "content": "I want to end it all"}],
+     "max_tokens": 30, "temperature": 0
+   }'
+ ```
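The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above. The function names, the default `base_url`, and the 10-second timeout are assumptions, and the response is read using the standard OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request

def build_payload(message: str, model: str = "nopenet/nope-edge") -> dict:
    """Build the same JSON body as the curl example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": message}],
        "max_tokens": 30,
        "temperature": 0,
    }

def classify_remote(message: str, base_url: str = "http://localhost:8000") -> str:
    """POST to the OpenAI-compatible endpoint and return the raw classification string."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Assumption: a 10-second timeout is acceptable for the calling service.
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"].strip()
```

`classify_remote` needs a running server from the previous step, so only `build_payload` can be exercised offline.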
+
+ | Setup | Throughput | Latency (p50) |
+ |-------|-----------|---------------|
+ | transformers | ~8 req/sec | ~180ms |
+ | vLLM / SGLang | 50-100+ req/sec | ~50ms |
+
+ ---
+
+ ## Model Details
+
+ | | |
+ |---|---|
+ | **Parameters** | 4B |
+ | **Precision** | bfloat16 |
+ | **Base Model** | Qwen/Qwen3-4B |
+ | **Method** | LoRA fine-tune, merged to full weights |
+ | **License** | [NOPE Edge Community License v1.0](LICENSE.md) |
+
+ ---
+
+ ## Risk Types Detected
+
+ | Type | Description | Clinical Framework |
+ |------|-------------|-------------------|
+ | `suicide` | Suicidal ideation, intent, planning | C-SSRS |
+ | `self_harm` | Non-suicidal self-injury (NSSI) | - |
+ | `self_neglect` | Eating disorders, medical neglect | - |
+ | `violence` | Threats/intent to harm others | HCR-20 |
+ | `abuse` | Domestic/intimate partner violence | DASH |
+ | `sexual_violence` | Rape, sexual assault, coercion | - |
+ | `neglect` | Failing to care for dependent | - |
+ | `exploitation` | Trafficking, grooming, sextortion | - |
+ | `stalking` | Persistent unwanted contact | SAM |
+
+ ---
+
+ ## Important Limitations
+
+ - Outputs are **probabilistic signals**, not clinical assessments
+ - **False negatives and false positives will occur**
+ - Never use as the **sole basis** for intervention decisions
+ - Always implement **human review** for flagged content
+ - This model is **not** a medical device or substitute for professional judgment
+ - Not validated for all populations, languages, or cultural contexts
+
+ ---
+
+ ## Commercial Licensing
+
+ This model is free for research, academic, nonprofit, and evaluation use.
+
+ **For commercial production deployment**, contact us:
+ - Email: support@nope.net
+ - Website: https://nope.net/edge
+
+ Commercial licenses include:
+ - Production deployment rights
+ - Priority support
+ - Custom fine-tuning options
+ - SLA guarantees
+
+ ---
+
+ ## About NOPE
+
+ NOPE provides safety infrastructure for AI applications. Our API helps developers detect mental health crises and harmful AI behavior in real-time.
+
+ - **Website:** https://nope.net
+ - **Documentation:** https://docs.nope.net
+ - **Support:** support@nope.net
+
+ ---
+
+ ## NOPE Edge Community License v1.0
+
+ Copyright (c) 2026 NopeNet, LLC. All rights reserved.
+
+ ### Permitted Uses
+
+ You may use this Model for:
+
+ - **Research and academic purposes** - published or unpublished studies
+ - **Personal projects** - non-commercial individual use
+ - **Nonprofit organizations** - including crisis lines, mental health organizations, and safety-focused NGOs
+ - **Evaluation and development** - testing integration before commercial licensing
+ - **Benchmarking** - publishing evaluations with attribution
+
+ ### Commercial Use
+
+ **Commercial use requires a separate license.** Commercial use includes production deployment in revenue-generating products or use by for-profit companies beyond evaluation.
+
+ Contact support@nope.net or visit https://nope.net/edge for commercial licensing.
+
+ ### Restrictions
+
+ You may NOT: redistribute or share weights; sublicense, sell, or transfer the Model; create derivative models for redistribution; build a competing crisis classification product.
+
+ ### No Warranty
+
+ THE MODEL IS PROVIDED "AS IS" WITHOUT WARRANTIES. False negatives and false positives will occur. This is not a medical device or substitute for professional judgment.
+
+ ### Limitation of Liability
+
+ NopeNet shall not be liable for damages arising from use, including classification errors or harm to any person.
+
+ ### Base Model
+
+ Built on [Qwen3](https://huggingface.co/Qwen) by Alibaba Cloud (Apache 2.0). See NOTICE.md.