---
license: mit
---

Hello, this is TimeCapsule V0.5

This is a language model trained entirely on texts from London between 1800 and 1875. The goal is to eliminate modern bias. Right now the model is trained on roughly 435MB of text, but I hope to keep expanding the dataset.

## How to run this model?

First download the hugface folder; inside you'll find everything you need.

Disclaimer: Most of the Python files in this folder are from nanoGPT by Andrej Karpathy, and some of them are slightly modified. If you plan on training your own model, please visit: https://github.com/karpathy/nanoGPT

## Step 1: Download the repository

Download the repo or clone it, whichever you prefer. You should end up with these files:

    sample.py
    ckpt.pt
    config.json
    meta.pkl
    tokenizer_london/vocab.json
    tokenizer_london/merges.txt
    model.py
    configurator.py

## Step 2: Make sure you have the requirements installed

Install the required packages:

    pip install tokenizers torch

## Step 3: Run the model

    python3 sample.py --out_dir=. --start="Put prompt here!"

## Optional control settings

Go to sample.py and change the command line arguments to control the generation settings.

For more info on this project, go to my GitHub: https://github.com/haykgrigo3/TimeCapsuleLLM/blob/main/README.md
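
If the run in Step 3 fails, a missing file is a likely cause. The script below is a minimal sketch for checking that, not part of the repository: the helper name `missing_files` is hypothetical, and the file names are copied from the Step 1 list. It assumes you run it from the folder that contains sample.py.

```python
from pathlib import Path

# File names copied from the Step 1 list in this README.
REQUIRED_FILES = [
    "sample.py",
    "ckpt.pt",
    "config.json",
    "meta.pkl",
    "tokenizer_london/vocab.json",
    "tokenizer_london/merges.txt",
    "model.py",
    "configurator.py",
]


def missing_files(repo_dir="."):
    """Return the required files that are not present under repo_dir.

    Hypothetical helper for illustration; it only checks that each
    path exists as a file, not that its contents are valid.
    """
    root = Path(repo_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All files found, ready to run sample.py")
```

Save it under any name next to sample.py and run it with python3 before Step 3. An empty result only means the files are in place; it does not confirm that the checkpoint or tokenizer load correctly.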