---
license: apache-2.0
datasets:
- dangermouse77/invertedAnswerQuestion
language:
- en
base_model:
- google-t5/t5-small
---
A finetuned model based on t5-small (~60M parameters)
that, given an answer, responds with a question. I call it the AQ model because it does the opposite of the usual question-answering (QA) LLM.
This AQ model is useful in conversations with another LLM QA chatbot, so that the conversation does not get stuck but moves continuously on to new topics.
If you set up an automatic conversation between two LLMs, one QA-LLM and one AQ-LLM, the conversation will not get stuck and repetitive but continue forever :-)
The model was finetuned starting from t5-small on an NVIDIA RTX 3090 in about 1.5 h with a batch size of 8, using 4 GB of GPU RAM.
As the GPU was running at 320 W, training this model took about 480 Wh of energy.
The same model trained with a batch size of 32 gave slightly worse results (14.3 GB of GPU RAM, in 1 hour).
Test with:
```
./test_aqmodel.py "The hypothesis fails because of the decay with radius to the power of 3"
```
Output: `What is the reason the hypothesis is a faulty hypothesis?`
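If you don't have the test script, a minimal inference sketch with the `transformers` library looks like the following. The checkpoint name below is the t5-small base as a placeholder, since the exact repo id of this finetuned model is not stated here; swap in the AQ checkpoint to get question generation.

```python
# Minimal sketch: load a seq2seq T5 checkpoint and generate text from an answer.
# "google-t5/t5-small" is a placeholder; replace it with this repo's AQ checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "google-t5/t5-small"  # placeholder; use the finetuned AQ model here
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

answer = "The hypothesis fails because of the decay with radius to the power of 3"
inputs = tokenizer(answer, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=32)
question = tokenizer.decode(out[0], skip_special_tokens=True)
print(question)
```

With the base checkpoint this will not yet produce a proper question; only the finetuned AQ weights give the answer-to-question behavior described above.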
Last but not least: this model was finetuned with the help of Python scripts suggested by ChatGPT-4o 8-)
Using vibe coding, as Karpathy calls it ...