---
license: mit
language:
- en
---

This is a very basic PyTorch transformer model that sorts lists of numbers. It was trained with nanoGPT.

The context window is 256 tokens, so the input list can be up to 127 tokens long. Numbers can be 0 to 99, separated by comma tokens.

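As a rough illustration of that limit, here is a minimal sketch that formats a list in the prompt style used in the run command further down and checks it against the 127-token budget. It assumes each number and each comma is a single token, which would allow at most 64 numbers. Whether the limit also counts the surrounding brackets is not stated, so they are ignored here. The helper names `input_token_count` and `build_prompt` are hypothetical and not part of this repo.

```python
# Sketch only: assumes one token per number and one token per comma.
MAX_INPUT_TOKENS = 127  # input limit stated in this README


def input_token_count(numbers):
    """Tokens used by the list body: one per number plus one per separating comma."""
    return max(0, 2 * len(numbers) - 1)


def build_prompt(numbers):
    """Format a list as a prompt such as "(5,4,3,2,1): [" (hypothetical helper)."""
    if not all(isinstance(n, int) and 0 <= n <= 99 for n in numbers):
        raise ValueError("numbers must be integers from 0 to 99")
    if input_token_count(numbers) > MAX_INPUT_TOKENS:
        raise ValueError("list exceeds the 127-token input limit")
    return "(" + ",".join(str(n) for n in numbers) + "): ["


print(build_prompt([5, 4, 3, 2, 1]))
```
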
It was trained for about one day on a laptop with a single NVIDIA RTX 2070 eGPU, so don't expect anything amazing.
In practice it sorts these lists correctly about 90% of the time, which is good enough to satisfy my curiosity.

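If you want to check a result yourself, a small sketch like the following compares a completion against Python's own `sorted`. It assumes the text generated after the `[` looks like `1,2,3,4,5]` and that the expected order is ascending; neither is spelled out here, so adjust as needed. `parse_completion` and `is_correct_sort` are hypothetical helper names.

```python
# Sketch only: assumes the completion is comma-separated numbers ending in "]".
def parse_completion(completion):
    """Read the numbers from a completion, up to the first closing bracket."""
    body = completion.split("]", 1)[0]
    return [int(tok) for tok in body.split(",") if tok.strip()]


def is_correct_sort(numbers, completion):
    """True if the completion is exactly the input sorted in ascending order."""
    try:
        return parse_completion(completion) == sorted(numbers)
    except ValueError:
        # The model produced something that is not a number.
        return False


print(is_correct_sort([5, 4, 3, 2, 1], "1,2,3,4,5]"))
```

Running this over many random lists and averaging the results would give an accuracy estimate to compare with the rough figure above.
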
To run the model, I recommend cloning nanoGPT (https://github.com/karpathy/nanoGPT) and installing its prerequisites.
Create a new branch and copy these files into the nanoGPT folder, overwriting the included sample.py and train.py.

To run:

> python sample.py --out_dir=out-sort-lists --start="(5,4,3,2,1): [" --num_samples=1 --temperature=0.0001 --max_new_tokens=127

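If you would rather call this from a script, here is a minimal sketch that assembles the same command as an argument list, which avoids shell quoting problems with the brackets and spaces in the prompt. The flags are the ones shown above; the function name `sample_command` is hypothetical.

```python
import shlex


def sample_command(prompt, max_new_tokens=127):
    """Build the sample.py argument list for one prompt (hypothetical helper)."""
    return [
        "python",
        "sample.py",
        "--out_dir=out-sort-lists",
        f"--start={prompt}",
        "--num_samples=1",
        "--temperature=0.0001",
        f"--max_new_tokens={max_new_tokens}",
    ]


cmd = sample_command("(9,3,7): [")
# Show the shell-quoted form of the command.
print(shlex.join(cmd))
# To actually run it from inside the nanoGPT folder:
#   import subprocess; subprocess.run(cmd, check=True)
```
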
To train:

> python train.py config/train_sort.py