Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
|
@@ -78,12 +78,15 @@ model-index:
|
|
| 78 |
|
| 79 |
A **350M-parameter language model** fine-tuned for extracting Personally Identifiable Information (PII) from medical and general-domain text across **17 languages**. Built on [LFM2-350M](https://huggingface.co/LiquidAI/LFM2-350M) with a two-stage training pipeline: supervised fine-tuning (SFT) followed by Group Relative Policy Optimization (GRPO).
|
| 80 |
|
|
|
|
|
|
|
| 81 |
## Highlights
|
| 82 |
|
| 83 |
- **17 languages**: English, Vietnamese, French, German, Spanish, Lao, Thai, Burmese, Indonesian, Filipino, Malay, Tamil, Portuguese, Russian, Chinese, Japanese, Korean
|
| 84 |
- **7 PII entity types**: `address`, `company_name`, `date`, `email_address`, `human_name`, `id_number`, `phone_number`
|
| 85 |
-
- **350M params** — runs on consumer GPUs, edge devices, and
|
| 86 |
- **Structured JSON output** — directly usable without post-processing
|
|
|
|
| 87 |
|
| 88 |
## Quick Start
|
| 89 |
|
|
@@ -134,6 +137,35 @@ output = llm.chat(messages, sampling_params=sampling)
|
|
| 134 |
print(output[0].outputs[0].text)
|
| 135 |
```
|
| 136 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 137 |
## Model Details
|
| 138 |
|
| 139 |
| | |
|
|
|
|
| 78 |
|
| 79 |
A **350M-parameter language model** fine-tuned for extracting Personally Identifiable Information (PII) from medical and general-domain text across **17 languages**. Built on [LFM2-350M](https://huggingface.co/LiquidAI/LFM2-350M) with a two-stage training pipeline: supervised fine-tuning (SFT) followed by Group Relative Policy Optimization (GRPO).
|
| 80 |
|
| 81 |
+
**[Try it in your browser →](https://huggingface.co/spaces/Meddies/meddies-pii-extractor)** — no setup required, runs entirely client-side via WebGPU.
|
| 82 |
+
|
| 83 |
## Highlights
|
| 84 |
|
| 85 |
- **17 languages**: English, Vietnamese, French, German, Spanish, Lao, Thai, Burmese, Indonesian, Filipino, Malay, Tamil, Portuguese, Russian, Chinese, Japanese, Korean
|
| 86 |
- **7 PII entity types**: `address`, `company_name`, `date`, `email_address`, `human_name`, `id_number`, `phone_number`
|
| 87 |
+
- **350M params** — runs on consumer GPUs, edge devices, and [in the browser](https://huggingface.co/spaces/Meddies/meddies-pii-extractor)
|
| 88 |
- **Structured JSON output** — directly usable without post-processing
|
| 89 |
+
- **ONNX available** — quantized exports (fp32/fp16/q4/q8) at [Meddies/meddies-pii-onnx](https://huggingface.co/Meddies/meddies-pii-onnx) for Transformers.js & ONNX Runtime
|
| 90 |
|
| 91 |
## Quick Start
|
| 92 |
|
|
|
|
| 137 |
print(output[0].outputs[0].text)
|
| 138 |
```
|
| 139 |
|
| 140 |
+
### Using Transformers.js (browser / Node.js)
|
| 141 |
+
|
| 142 |
+
```javascript
|
| 143 |
+
import { pipeline } from "@huggingface/transformers";
|
| 144 |
+
|
| 145 |
+
const extractor = await pipeline("text-generation", "Meddies/meddies-pii-onnx", {
|
| 146 |
+
dtype: "q4",
|
| 147 |
+
device: "webgpu", // or "wasm" for broader compatibility
|
| 148 |
+
});
|
| 149 |
+
|
| 150 |
+
const messages = [
|
| 151 |
+
{ role: "system", content: "Extract <address>, <company_name>, <email_address>, <human_name>, <phone_number>, <id_number>, <date>" },
|
| 152 |
+
{ role: "user", content: "Patient John Smith, DOB 03/15/1985, contact: john.smith@email.com" },
|
| 153 |
+
];
|
| 154 |
+
|
| 155 |
+
const output = await extractor(messages, { max_new_tokens: 512, do_sample: false });
|
| 156 |
+
console.log(output[0].generated_text.at(-1).content);
|
| 157 |
+
```
|
| 158 |
+
|
| 159 |
+
### Using ONNX Runtime (Python)
|
| 160 |
+
|
| 161 |
+
```python
|
| 162 |
+
from optimum.onnxruntime import ORTModelForCausalLM
|
| 163 |
+
from transformers import AutoTokenizer
|
| 164 |
+
|
| 165 |
+
model = ORTModelForCausalLM.from_pretrained("Meddies/meddies-pii-onnx")
|
| 166 |
+
tokenizer = AutoTokenizer.from_pretrained("Meddies/meddies-pii-onnx")
|
| 167 |
+
```
|
| 168 |
+
|
| 169 |
## Model Details
|
| 170 |
|
| 171 |
| | |
|