--- |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- dangermouse77/invertedAnswerQuestion |
|
|
language: |
|
|
- en |
|
|
base_model: |
|
|
- google-t5/t5-small |
|
|
--- |
|
|
A finetuned model based on t5-small (~60M parameters) that, given an answer, responds with a question. I call it the AQ model because it does the opposite of the usual question-answering (QA) LLM.
|
|
|
|
|
This AQ model is useful in conversations with another QA-LLM chatbot, so that the conversation does not get stuck but moves continuously to new topics.
|
|
If you set up an automatic conversation between two LLMs, one QA-LLM and one AQ-LLM, the conversation does not become stuck and repetitive but continues forever :-)
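The QA/AQ ping-pong loop described above can be sketched as follows. The two responder functions are hypothetical stand-ins for real model calls (a QA chatbot and this AQ model); here they just echo canned text so the control flow is runnable on its own.

```python
# Sketch of the QA/AQ conversation loop: a QA-LLM answers, the AQ model
# turns the answer into a fresh question, and the cycle repeats.

def qa_respond(question: str) -> str:
    # Placeholder: a real QA-LLM would answer the question here.
    return f"An answer to: {question}"

def aq_respond(answer: str) -> str:
    # Placeholder: the AQ model would turn the answer into a new question.
    return f"A question about: {answer}"

def ping_pong(seed_question: str, turns: int) -> list[str]:
    """Alternate QA and AQ responses so the conversation keeps moving."""
    transcript = [seed_question]
    message = seed_question
    for _ in range(turns):
        message = qa_respond(message)   # QA-LLM answers the current question
        transcript.append(message)
        message = aq_respond(message)   # AQ model asks a new question
        transcript.append(message)
    return transcript

if __name__ == "__main__":
    for line in ping_pong("Why is the sky blue?", turns=2):
        print(line)
```

Swapping the two placeholders for real model calls gives the never-ending conversation described above.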
|
|
|
|
|
The model was finetuned starting from t5-small on an NVIDIA RTX 3090 in about 1.5 hours with a batch size of 8, using 4 GB of GPU RAM.
|
|
As the GPU was drawing 320 W, training this model took about 480 Wh of energy.
|
|
The same model trained with a batch size of 32 gave slightly worse results (14.3 GB of GPU RAM, about 1 hour).
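The exact training script is not included in this card, so the following is only a hypothetical sketch of the fine-tuning configuration: the batch size comes from the numbers above, while every other hyperparameter is an assumption.

```python
# Hypothetical fine-tuning configuration sketch; only the batch size is
# taken from this card, all other values are assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="aq-model",          # placeholder output path
    per_device_train_batch_size=8,  # batch size reported above
    num_train_epochs=3,             # assumption
    fp16=True,                      # typical choice on an RTX 3090
)
```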
|
|
|
|
|
Test with:

`./test_aqmodel.py "The hypothesis fails because of the decay with radius to the power of 3"`
|
|
|
|
|
Output: `What is the reason the hypothesis is a faulty hypothesis?`
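The `test_aqmodel.py` script itself is not part of this card, but a minimal inference sketch with the Hugging Face `transformers` library could look like this. The model id is a placeholder (substitute this model's Hub id or a local checkpoint path), and the function assumes the checkpoint was trained without a special input prefix.

```python
# Minimal inference sketch for a T5-style answer-to-question model.
# "path/to/aq-model" is a placeholder, not the real checkpoint id.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

def generate_question(answer: str, model_id: str = "path/to/aq-model") -> str:
    """Turn an answer into a question with a seq2seq checkpoint."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(answer, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the real checkpoint, `generate_question("The hypothesis fails because of the decay with radius to the power of 3")` should behave like the CLI test above.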
|
|
|
|
|
Last but not least: this model was finetuned with the help of Python scripts suggested by ChatGPT-4o 8-)
|
|
Using vibe coding, as Karpathy calls it ...
|
|
|