Update README.md
#3 opened by Chris-Alexiuk

README.md CHANGED
@@ -23,9 +23,9 @@ Throughout the alignment process, we relied on only approximately 20K human-anno
 This results in a model that is aligned for human chat preferences, improvements in mathematical reasoning, coding and instruction-following, and is capable of generating high quality synthetic data for a variety of use cases.
 
 Under the NVIDIA Open Model License, NVIDIA confirms:
-Models are commercially usable.
-You are free to create and distribute Derivative Models.
-NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
+- Models are commercially usable.
+- You are free to create and distribute Derivative Models.
+- NVIDIA does not claim ownership to any outputs generated using the Models or Derivative Models.
 
 ### License:
 
@@ -309,9 +309,9 @@ Evaluated using the CantTalkAboutThis Dataset as introduced in the [CantTalkAbou
 ### Adversarial Testing and Red Teaming Efforts
 
 The Nemotron-4 340B-Instruct model underwent extensive safety evaluation including adversarial testing via three distinct methods:
-[Garak](https://docs.garak.ai/garak), is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
-[AEGIS](https://arxiv.org/pdf/2404.05993), is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
-Human Content Red Teaming leveraging human interaction and evaluation of the models' responses.
+- [Garak](https://docs.garak.ai/garak), is an automated LLM vulnerability scanner that probes for common weaknesses, including prompt injection and data leakage.
+- [AEGIS](https://arxiv.org/pdf/2404.05993), is a content safety evaluation dataset and LLM based content safety classifier model, that adheres to a broad taxonomy of 13 categories of critical risks in human-LLM interactions.
+- Human Content Red Teaming leveraging human interaction and evaluation of the models' responses.
 
 ### Limitations
 