Your best Model and Dataset Recommendation

#1
by hic090 - opened

Hi Leroy,

Greetings from East Germany. I'm looking for high-quality Bible datasets and models and stumbled across your work, which looks very promising. I'd love to know (1) which of your models would be best suited for Bible translation and redaction in German, and (2) which of your models has been trained on your BibleExpert dataset. Speaking of which, (3) it would be great to know where the data originates from.

Thanks so much,
Klaus

Oh, sorry for the late response:

For datasets:
LeroyDyer/BIBLE_VERSIONS: this is just a collection of Bibles; I train on them as plain text (next-word prediction).
LeroyDyer/BibleExpert: this is a lot of questions and answers to do with various Bible and historical incidents. These are not easy to find (important)!
LeroyDyer/bibles: this is the Bible stripped down into verses and locations! This is useful for the model to recall specific passages. I used many Bibles, different translations and flavours (English and European).
mekaneeky/SALT-languages-bible: this is also one of the most important data sources, as it has the Bible in many AFRICAN languages! Since this is the original source of the Bible, we can get to the actual meanings of the original words and retranslate to its original intentions!

LeroyDyer/LCARS_Specialist_MYTH_BUSTER_ --- for me this one was the best at making timelines, as well as Chaldean histories!
(I also train them with African histories, as this allows for understanding the journeys and the peoples the biblical people encountered, as well as the Mahabharata etc.! (Shem))

This became the best model for Bible stuff:
I use it for timelines and deep questions. I also trained this model on many sacred texts, histories etc., so we can re-date as well as cross-reference!

ALL my models CONTAIN the BIBLE!!
They do not have super long context, but they answer very well! I find it is best to limit the length of the response and allow for a continued response instead, to maintain higher quality.
Also use 0.5 temperature (important), as this allows the model to also consider the second-best answers. We need to be able to debate, so we need a slightly higher temperature!
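The "limit the response and continue" tip above can be sketched as a simple loop: generate a bounded chunk, append it to the running answer, and ask again until the model signals it is done. The `generate_fn` callback and the `[DONE]` marker here are my own assumptions for illustration, not part of Leroy's setup; in practice `generate_fn` would wrap something like a `model.generate` call with a small `max_new_tokens`.

```python
def generate_in_chunks(prompt, generate_fn, chunk_tokens=128, max_chunks=4,
                       stop_marker=None):
    """Build a long answer from several short, higher-quality chunks.

    generate_fn(prompt, max_new_tokens) -> str (only the newly generated text).
    """
    answer = ""
    for _ in range(max_chunks):
        chunk = generate_fn(prompt + answer, chunk_tokens)
        if not chunk:
            break  # backend produced nothing more
        answer += chunk
        if stop_marker and stop_marker in chunk:
            break  # model signalled it finished the answer
    return answer

# Toy backend so the sketch runs standalone: emits one fixed chunk per call.
def toy_backend(prompt, max_new_tokens):
    parts = [" In the beginning...", " Then the flood came.", " [DONE]"]
    already = sum(prompt.count(p) for p in parts)  # chunks emitted so far
    return parts[already] if already < len(parts) else ""

text = generate_in_chunks("Summarise Genesis:", toy_backend, stop_marker="[DONE]")
print(text)
```

Each pass sees the prompt plus everything generated so far, so the continuation stays coherent while every individual generation stays short.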

We can also control the model using top-k: if we reduce the top-k samples we get a high-quality pool, and then we can raise top-p so that the model selects from the higher-probability samples first.
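A minimal sketch of how those two knobs interact, using a made-up next-token distribution (the token names and probabilities are purely illustrative): top-k first trims the candidates to the k most likely tokens, then top-p keeps the smallest prefix of that pool whose cumulative probability reaches p, and the model samples only from what survives.

```python
def top_k_top_p_filter(probs, k, p):
    """Apply top-k then top-p filtering to a {token: probability} dict.

    Returns the surviving (token, prob) pairs, renormalised to sum to 1.
    """
    # 1. top-k: keep only the k most likely tokens (the high-quality pool).
    pool = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    # 2. top-p: keep the smallest prefix whose cumulative probability >= p.
    kept, cum = [], 0.0
    for tok, pr in pool:
        kept.append((tok, pr))
        cum += pr
        if cum >= p:
            break
    # Renormalise so the surviving pool is a proper distribution to sample from.
    total = sum(pr for _, pr in kept)
    return [(tok, pr / total) for tok, pr in kept]

toy = {"Moses": 0.40, "Noah": 0.25, "Abraham": 0.20, "Cyrus": 0.10, "Darius": 0.05}
print(top_k_top_p_filter(toy, k=3, p=0.9))  # small pool, several candidates kept
print(top_k_top_p_filter(toy, k=5, p=0.6))  # tighter p, only the top tokens survive
```

Lowering k shrinks the candidate pool outright, while p then decides how deep into that pool the sampler is allowed to reach, which matches the "high-quality pool first, then select from the top" idea above.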

The model is a Mistral (generally trained on 2048 context!).
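Given that 2048-token training context, it helps to budget the response length so prompt plus generated text stays inside the window. A rough sketch, assuming you already know the prompt's token count from the model's tokenizer (the `reserve` headroom for chat-template/special tokens is my own illustrative choice):

```python
def response_budget(prompt_tokens, max_ctx=2048, reserve=64):
    """How many new tokens we can safely request, given the prompt length.

    reserve leaves headroom for special/template tokens; never returns
    a negative budget if the prompt already overflows the window.
    """
    return max(0, max_ctx - prompt_tokens - reserve)

print(response_budget(500))   # plenty of room for a response
print(response_budget(2100))  # prompt already overflows the window
```

Combined with the short-chunk approach above, this keeps every generation comfortably inside the context the model was actually trained on.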

So I expect this model to be NON-BIASED!

I do not use the model for anything else, as I found the models did not keep a great balance: some tasks get lost when adding the Bible. So I decided to keep it as a dedicated model for history and religions!

PS: I have many religious datasets!
Painstaking work!
