Update README.md
README.md
@@ -14,7 +14,7 @@ This AQ-model is useful in conversations with another LLM-QA-chatbot, so that the
 If you have an automatic conversation between two LLMs, one QA-LLM and one AQ-LLM, the conversation will not get stuck and repetitive but continue forever :-)

 The model was fine-tuned starting from t5-small on an NVIDIA RTX 3090 in about 1.5 hours with a batch size of 8, using 4 GB of GPU RAM.

-The same model trained with a batch size of 32 (14.3 GB of GPU RAM in 1 hour)
+The same model trained with a batch size of 32 gave slightly worse results (14.3 GB of GPU RAM in 1 hour).

 Test with
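The QA↔AQ conversation loop described in the diff can be sketched as follows. `qa_stub` and `aq_stub` are hypothetical stand-ins; in practice they would wrap the two fine-tuned t5-small checkpoints (e.g. via a `transformers` text2text-generation pipeline):

```python
def run_conversation(qa_model, aq_model, seed_question, turns=4):
    """Alternate between a QA model (answers a question) and an AQ model
    (generates a follow-up question from the last answer), so the
    exchange never gets stuck waiting for a new question."""
    transcript = [seed_question]
    question = seed_question
    for _ in range(turns):
        answer = qa_model(question)    # QA-LLM answers the current question
        question = aq_model(answer)    # AQ-LLM asks a question about the answer
        transcript += [answer, question]
    return transcript


# Hypothetical stub models for illustration only; real usage would call
# the fine-tuned t5-small models instead.
def qa_stub(question):
    return f"Answer to: {question}"


def aq_stub(answer):
    return f"Question about: {answer}"


for line in run_conversation(qa_stub, aq_stub, "What is T5?", turns=2):
    print(line)
```

Because the AQ model always produces a fresh question from the previous answer, the loop can run for an arbitrary number of turns.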