---
license: apache-2.0
datasets:
- Fishfishfishfishfish/Synthetic_text.txt
language:
- en
---
The only files needed for inference are inference.py, word2idk.pkl, and lstm_Hxxx.safetensors.

Input tokens must be space-separated, since inference.py does not apply the tokenization used on the training data.
>python inference.py --temp 0.5 --top_k 64 --model_file lstm_H256.safetensors --start_sequence "User : what is the capital of France ? Bot : " --max_length 32
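For context, here is a minimal sketch of how a space-separated prompt presumably maps to token ids via word2idk.pkl. The vocab format (a `{word: index}` dict) and the `<unk>` fallback are assumptions, not confirmed details of inference.py:

```python
import pickle

# Assumed format: word2idk.pkl holds a {word: integer id} dict.
with open("word2idk.pkl", "rb") as f:
    word2idx = pickle.load(f)

def encode(prompt):
    # A plain whitespace split, which is why the prompt must already be
    # space-separated ("France ?" rather than "France?").
    unk = word2idx.get("<unk>", 0)  # hypothetical unknown-word fallback
    return [word2idx.get(tok, unk) for tok in prompt.split()]

print(encode("User : what is the capital of France ? Bot : "))
```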
Running the command above usually results in something like:

>The capital of the world of the world of the world of the world of the

It's not very accurate yet; it's only trained on 1.2 MB of text.

Each safetensors file represents a different hidden dim value.
Each was trained for 1 epoch.
The hidden dim value in inference.py must be edited to match each safetensors file; a loading sketch follows the hyperparameter list below.
>sequence_length = 64
>
>batch_size = 16
>
>learning_rate = 0.0001
>
>embedding_dim = 256
>
>num_layers = 4
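A minimal sketch of what the model definition and loading step might look like, assuming a standard PyTorch word-level LSTM built from the hyperparameters above. The class and attribute names are hypothetical (the real architecture is defined in inference.py); `hidden_dim` is the one value you edit per checkpoint:

```python
import pickle
import torch.nn as nn
from safetensors.torch import load_file

with open("word2idk.pkl", "rb") as f:
    word2idx = pickle.load(f)  # assumed {word: id} dict

embedding_dim = 256
num_layers = 4
hidden_dim = 256  # edit to match the H value in lstm_Hxxx.safetensors

class LSTMModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, state=None):
        out, state = self.lstm(self.embedding(x), state)
        return self.fc(out), state

model = LSTMModel(vocab_size=len(word2idx))
# load_file returns a flat {name: tensor} dict; load_state_dict will raise a
# shape-mismatch error if hidden_dim doesn't match the checkpoint's H value.
model.load_state_dict(load_file("lstm_H256.safetensors"))
model.eval()
```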