Update README.md
README.md
CHANGED
@@ -8,9 +8,9 @@ base_model:
 - google-bert/bert-base-cased
 ---
 # Model Card
-This model was trained for the purposes of analysing model utility when trained on various [Derived Text Formats](https://text-plus.org/en/themen-dokumentation/atf/).
-
-In this case, the model was trained on the original dataset without any obfuscation to be used as a baseline.
+This model was trained for the purposes of analysing model utility when trained on various [Derived Text Formats](https://text-plus.org/en/themen-dokumentation/atf/). These are versions of the same text that are adjusted to reduce the chances that the original text can ever be extracted from the model, with applications in privacy and copyright infringement protection.
+<br><br>
+The dataset used for these experiments is [codelion/fineweb-edu-1B](https://huggingface.co/datasets/codelion/fineweb-edu-1B), with all obfuscated formats found [here](https://huggingface.co/datasets/DanielGallagherIRE/fineweb-edu-1B-obfuscated). In this case, the model was trained on the original dataset without any obfuscation to be used as a baseline.
 
 ## Training Configuration
 