Update CONTRIBUTING.md
CONTRIBUTING.md +3 -3
@@ -1,13 +1,13 @@
 # Contributing to the Epstein Estate Document Dataset
 
-
+I welcome contributions that improve the accessibility and cleanliness of this dataset. However, due to the sensitive nature of the content, I have strict guidelines for pull requests.
 
-## What
+## What I Accept
 * **OCR Corrections:** Fixes to typos resulting from the Tesseract conversion (e.g., correcting "1lI" confusions), provided they match the original image source.
 * **Metadata improvements:** Adding structured data (dates, document types) to the CSV index.
 * **Formatting:** Improving the readability of markdown files without altering the semantic content.
 
-## What
+## What I Do Not Accept
 * **PII Restoration:** Do not submit PRs that attempt to "fill in" redacted names or addresses.
 * **Speculative Annotations:** Do not add commentary, theories, or external context directly into the document text files. Keep annotations in separate metadata fields.
 * **Fine-tuned Models:** Do not upload LoRAs or model weights trained on this data.