Upload README.md with huggingface_hub
README.md
CHANGED
@@ -40,7 +40,7 @@ It is intended for **creative writing**, **roleplay**, **period-appropriate corr
 
 ## Notes
 Violet is not the first LLM trained on a historical-only pretraining corpus; to the author’s knowledge that distinction belongs to **TimeCapsuleLLM**. Violet was developed independently, and differs in:
-- Different (but somewhat overlapping) pretraining
+- Different (but somewhat overlapping) pretraining corpus and a different range of dates -- Violet focuses specifically on 1800-1899
 - A custom Victorian tokenizer
 
 Violet was built on a corpus spanning 1800–1899 sourced from Project Gutenberg, the Internet Archive, the British National Library, and other archives.