Update README.md
README.md
CHANGED
|
@@ -128,13 +128,63 @@ From the combined evaluation across Argilla, AI4Privacy, and Gretel PII datasets
|
|
| 128 |
| AI4Privacy | 0.60 | 0.64 |
|
| 129 |
| nvidia/Nemotron-PII | 0.66 | 0.87 |
|
| 130 |
---
|
| 131 |
-
|
| 132 |
We evaluated the model using `threshold=0.3`. <br>
|
| 133 |
|
| 134 |
# Inference:
|
| 135 |
**Acceleration Engine:** PyTorch (via Hugging Face Transformers) <br>
|
| 136 |
**Test Hardware:** NVIDIA A100 (Ampere, PCIe/SXM) <br>
|
| 137 |
|
| 138 |
## Ethical Considerations:
|
| 139 |
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. <br>
|
| 140 |
|
|
|
|
| 128 |
| AI4Privacy | 0.60 | 0.64 |
|
| 129 |
| nvidia/Nemotron-PII | 0.66 | 0.87 |
|
| 130 |
---
|
|
|
|
| 131 |
We evaluated the model using `threshold=0.3`. <br>
|
| 132 |
|
| 133 |
# Inference:
|
| 134 |
**Acceleration Engine:** PyTorch (via Hugging Face Transformers) <br>
|
| 135 |
**Test Hardware:** NVIDIA A100 (Ampere, PCIe/SXM) <br>
|
| 136 |
|
| 137 |
+
# Usage Recommendation
|
| 138 |
+
|
| 139 |
+
First, make sure you have the gliner library installed:
|
| 140 |
+
|
| 141 |
+
```
|
| 142 |
+
pip install gliner
|
| 143 |
+
```
|
| 144 |
+
Now, let's look for an email address, phone number, SSN, and user name in a messy block of text.
|
| 145 |
+
|
| 146 |
+
```
|
| 147 |
+
from gliner import GLiNER
|
| 148 |
+
# 1. Define our new text
|
| 149 |
+
text = "Hi support, I can't log in! My account username is 'johndoe88'. Every time I try, it says \"invalid credentials\". Please reset my password. You can reach me at (555) 123-4567 or johnd@example.com"
|
| 150 |
+
|
| 151 |
+
# 2. Define the labels we're hunting for.
|
| 152 |
+
labels = ["email", "phone_number", "ssn", "user_name"]
|
| 153 |
+
|
| 154 |
+
# 3. Load the PII model
|
| 155 |
+
model = GLiNER.from_pretrained("nvidia/gliner-pii")
|
| 156 |
+
|
| 157 |
+
# 4. Run the prediction at the chosen confidence threshold
|
| 158 |
+
entities = model.predict_entities(text, labels, threshold=0.5)
|
| 159 |
+
```
|
| 160 |
+
|
| 161 |
+
Sample output:
|
| 162 |
+
```
|
| 163 |
+
[
|
| 164 |
+
{
|
| 165 |
+
"start": 52,
|
| 166 |
+
"end": 61,
|
| 167 |
+
"text": "johndoe88",
|
| 168 |
+
"label": "user_name",
|
| 169 |
+
"score": 0.96
|
| 170 |
+
},
|
| 171 |
+
{
|
| 172 |
+
"start": 159,
|
| 173 |
+
"end": 173,
|
| 174 |
+
"text": "(555) 123-4567",
|
| 175 |
+
"label": "phone_number",
|
| 176 |
+
"score": 0.97
|
| 177 |
+
},
|
| 178 |
+
{
|
| 179 |
+
"start": 177,
|
| 180 |
+
"end": 194,
|
| 181 |
+
"text": "johnd@example.com",
|
| 182 |
+
"label": "email",
|
| 183 |
+
"score": 0.98
|
| 184 |
+
}
|
| 185 |
+
]
|
| 186 |
+
```
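
The `start` and `end` offsets in the sample output can be used to mask the detected spans in the original text. The following is a minimal sketch, not part of the model's API: the `redact` helper and the `[LABEL]` placeholder format are assumptions made for illustration, and the entity list is copied from the sample output so the snippet runs without loading the model.

```python
def redact(text, entities):
    """Replace each detected span with a [LABEL] placeholder (hypothetical helper)."""
    redacted = text
    # Work from right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['label'].upper()}]"
        redacted = redacted[: ent["start"]] + placeholder + redacted[ent["end"]:]
    return redacted


# Same text as in the usage example, split across lines for readability.
text = (
    "Hi support, I can't log in! My account username is 'johndoe88'. "
    "Every time I try, it says \"invalid credentials\". Please reset my password. "
    "You can reach me at (555) 123-4567 or johnd@example.com"
)

# Entities copied from the sample output above.
entities = [
    {"start": 52, "end": 61, "text": "johndoe88", "label": "user_name", "score": 0.96},
    {"start": 159, "end": 173, "text": "(555) 123-4567", "label": "phone_number", "score": 0.97},
    {"start": 177, "end": 194, "text": "johnd@example.com", "label": "email", "score": 0.98},
]

print(redact(text, entities))
```

Replacing from the last span backwards avoids recomputing offsets after each substitution. The sketch assumes the spans do not overlap; overlapping predictions would need to be merged or filtered first.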
|
| 187 |
+
|
| 188 |
## Ethical Considerations:
|
| 189 |
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. <br>
|
| 190 |
|