I want to know what this is.
These models are Bidirectional LSTMs with attention.
How to run?
Each model's version folder includes a script that runs the model. The same script can also train, so for inference set 'RETRAIN' and 'continueTrain' to False. continueTrain continues training from the last checkpoint. RETRAIN trains the model from scratch and overwrites any existing model that has the same file name.
Directory layout:
folderWhereTheIAMAMscriptIs/ = Script
model/ = tokenizer.pkl = chatbot.keras
Note: Ensure tokenizer.pkl is in the same directory as the script.
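The inference settings and the tokenizer note can be checked before launching a script. The following is a minimal sketch, not part of the shipped scripts: the helper names `missing_files` and `is_inference` are hypothetical, and only the flag names and `tokenizer.pkl` come from this README.

```python
from pathlib import Path


def missing_files(script_dir, required=("tokenizer.pkl",)):
    """Return the names in `required` that are not files inside script_dir.

    `required` defaults to tokenizer.pkl because the note above asks for it
    to sit next to the script; pass other relative paths to check more files.
    """
    base = Path(script_dir)
    return [name for name in required if not (base / name).is_file()]


def is_inference(retrain, continue_train):
    """True only when both training flags are off, as required for inference."""
    return not retrain and not continue_train
```

For example, `missing_files(".")` run from the script's folder returns an empty list when `tokenizer.pkl` is present, and `is_inference(RETRAIN, continueTrain)` should be true before running a model for chat only.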
How was this trained?
In each model's version folder, a rough guide describes how that model was trained.
Why are some of these spitting out junk?
An explanation is in the info.md file. In short, these models are not fine-tuned and were not trained with PPO. They were trained completely from scratch, so early models will probably be terrible. There is also no RLHF for these models, and no prompt structure or system prompt.