---
license: mit
---
Hello, this is TimeCapsule V0.5
This is a language model trained entirely on texts from London between 1800 and 1875. The goal is to eliminate modern bias. Right now the model is trained on roughly 435MB of text, but I hope to continue expanding the dataset.
## How to run this model?
First, download the hugface folder. Inside you'll find everything you need.
Disclaimer: Most of the Python files in this folder are from nanoGPT by Andrej Karpathy; some of them are slightly modified.
If you plan on training your own model, please visit: https://github.com/karpathy/nanoGPT
## Step 1: Download the repository
Download the repo or clone it, whichever you prefer. Make sure you have the following files:
- sample.py
- ckpt.pt
- config.json
- meta.pkl
- tokenizer_london/vocab.json
- tokenizer_london/merges.txt
- model.py
- configurator.py
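
If you want to confirm the download is complete before running anything, a small check like the one below may help. It is only a sketch: the file names are copied from the list above, and the helper name `missing_files` is made up for this example and is not part of the repo.

```python
from pathlib import Path

# File names copied from the list in this README.
REQUIRED_FILES = [
    "sample.py",
    "ckpt.pt",
    "config.json",
    "meta.pkl",
    "tokenizer_london/vocab.json",
    "tokenizer_london/merges.txt",
    "model.py",
    "configurator.py",
]


def missing_files(folder="."):
    """Return the required files that are not present under `folder`."""
    root = Path(folder)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required files are present.")
```

Run it from inside the downloaded folder. An empty result means every file in the list was found.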
## Step 2: Make sure you have the requirements installed
Install the required packages: pip install tokenizers torch
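
To check that both packages can be imported in your current Python environment, something like the following should work. This is a sketch that uses only the standard library; `missing_packages` is a hypothetical helper name, not something shipped with this project.

```python
import importlib.util

# Package names taken from the pip command in this README.
REQUIRED_PACKAGES = ["tokenizers", "torch"]


def missing_packages(packages=REQUIRED_PACKAGES):
    """Return the package names that cannot be found in this environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]
```

Calling `missing_packages()` returns an empty list when both `tokenizers` and `torch` are installed.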
## Step 3: Run the model
python3 sample.py --out_dir=. --start="Put prompt here!"
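
If you would rather launch generation from a Python script, for example to avoid shell quoting problems with prompts that contain quotes, the command above can be wrapped as sketched below. The function names `build_command` and `run_sample` are invented for this example. The only flags used are `--out_dir` and `--start`, taken from the command above.

```python
import subprocess
import sys


def build_command(prompt, out_dir="."):
    """Build the argument list for the sample.py command shown in this README."""
    # sys.executable is the interpreter running this script; it stands in for "python3".
    return [sys.executable, "sample.py", f"--out_dir={out_dir}", f"--start={prompt}"]


def run_sample(prompt, out_dir="."):
    """Run sample.py with the given prompt; call this from the folder that contains sample.py."""
    subprocess.run(build_command(prompt, out_dir), check=True)
```

For example, `run_sample("Put prompt here!")` runs the same command as above. Passing the arguments as a list means the prompt is handed to sample.py unchanged, without going through the shell.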
## Optional control settings
To control generation settings, go to sample.py and change the default values there, or pass them as command line arguments.
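
To see which settings are available, you can list the simple variables defined at the top level of sample.py. This sketch assumes the copy in this folder keeps the upstream nanoGPT layout, where settings are plain `name = value` assignments that configurator.py lets you override with `--name=value`. The modified files here may differ, so treat the output as a hint. `list_settings` is a hypothetical helper name.

```python
import ast


def list_settings(source):
    """Return {name: value} for top-level `name = <literal>` assignments in `source`."""
    settings = {}
    for node in ast.parse(source).body:
        # Only consider simple assignments to a single name, e.g. `x = 1`.
        if (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        ):
            try:
                settings[node.targets[0].id] = ast.literal_eval(node.value)
            except (ValueError, TypeError):
                # Skip values that are not plain literals (function calls, expressions).
                continue
    return settings
```

To use it, read the file and print the result, for example `print(list_settings(open("sample.py").read()))`. The code is parsed but never executed, so this is safe to run before the model files are in place.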
For more info on this project, go to my GitHub: https://github.com/haykgrigo3/TimeCapsuleLLM/blob/main/README.md