---
license: cc-by-4.0
---

Introducing MermaidSolar_TempVariance_Factual: a 10.7B-parameter model trained entirely on synthetic data generated by my own dataset-augmentation toolkit. As a passionate enthusiast and researcher in large language models, I began this journey by creating Mermaid Mistral, a 7B-parameter model designed to generate knowledge graphs from user input.

In my quest to explore the capabilities of dataset augmentation, I honed my toolkit to generate a fresh synthetic dataset, separate from the original 500-entry organic dataset. This new dataset, comprising 17K entries, became the sole training data for MermaidSolar_TempVariance_Factual.

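To make the augmentation step concrete, here is a minimal sketch of the general approach. It is not the toolkit itself (that lives in the repository linked below): the model id, file names, and temperature values are illustrative assumptions, and the temperature-sweep idea is only inferred from the TempVariance name.

```python
# Hypothetical sketch of temperature-varied augmentation with a 7B generator.
# The model id, file names, and temperature values are assumptions.
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

generator_id = "TroyDoesAI/Mermaid-Mistral"  # placeholder id for Mermaid Mistral
tokenizer = AutoTokenizer.from_pretrained(generator_id)
model = AutoModelForCausalLM.from_pretrained(
    generator_id, torch_dtype=torch.float16, device_map="auto"
)

def augment(seed: str, temperatures=(0.3, 0.7, 1.1)) -> list[dict]:
    """Sample one synthetic completion per temperature for a single organic seed."""
    inputs = tokenizer(seed, return_tensors="pt").to(model.device)
    entries = []
    for temp in temperatures:
        out = model.generate(
            **inputs, max_new_tokens=512, do_sample=True, temperature=temp
        )
        completion = tokenizer.decode(
            out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        entries.append({"input": seed, "output": completion, "temperature": temp})
    return entries

# A few hundred organic seeds, times several temperatures (and prompt
# variants), can grow into a much larger synthetic set such as the 17K
# entries described above.
with open("organic_seeds.jsonl") as f:
    seeds = [json.loads(line)["input"] for line in f]

with open("synthetic_dataset.jsonl", "w") as f:
    for seed in seeds:
        for entry in augment(seed):
            f.write(json.dumps(entry) + "\n")
```
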
Mermaid Mistral, with its 7B parameters, played a pivotal role in this process: it generated the entire dataset that served as the foundation for the much larger MermaidSolar_TempVariance_Factual.

My research centers on showcasing the potency of dataset augmentation in large language model training. MermaidSolar_TempVariance_Factual serves as a testament to the efficacy of this approach, having been trained exclusively on synthetic data to demonstrate the power of the augmentation toolkit. I am Troy Andrew Schultz, and this model is the culmination of my research endeavor.

[Link to Toolkit](https://github.com/Troys-Code/AI_Research/tree/6ef11bd8a3e61539e53ba28b5d420e41b06a154c)

This model, MermaidSolar_TempVariance_Factual, with its 10.7B parameters, was trained on a dataset created entirely by a 7B-parameter model: 17K entries augmented from Mermaid Mistral outputs.

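A hypothetical inference sketch follows. The repository id is a placeholder and the prompt wording is an assumption, not a documented template:

```python
# Hypothetical inference sketch; the repository id is a placeholder and the
# prompt wording is an assumption, not a documented template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TroyDoesAI/MermaidSolar_TempVariance_Factual"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Create a knowledge graph of the water cycle."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
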
Note: the original ~500-entry organic dataset now serves as my entire eval dataset :)

Training Progress:
2 epochs, with an eval loss of ~0.4.

Training could probably go a little longer, but this is acceptable for now.

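For context, here is a minimal fine-tuning sketch using TRL's SFTTrainer. Only the 2-epoch count and the synthetic-train / organic-eval split are taken from the notes above; the base model (assumed to be SOLAR-10.7B, inferred from the name), file names, and all other hyperparameters are placeholders:

```python
# Hypothetical fine-tuning sketch with TRL's SFTTrainer. Only "2 epochs" and
# the synthetic-train / organic-eval split are taken from the notes above;
# the base model, file names, and remaining hyperparameters are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Each JSONL record is assumed to hold a single "text" field, the default
# expected by SFTTrainer.
data = load_dataset(
    "json",
    data_files={
        "train": "synthetic_dataset.jsonl",  # 17K synthetic entries
        "eval": "organic_eval.jsonl",        # ~500 organic entries
    },
)

config = SFTConfig(
    output_dir="mermaid-solar-sft",
    num_train_epochs=2,                # matches the training notes above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    eval_strategy="epoch",             # report eval loss each epoch
    logging_steps=50,
)

trainer = SFTTrainer(
    model="upstage/SOLAR-10.7B-v1.0",  # assumed 10.7B base; not confirmed
    args=config,
    train_dataset=data["train"],
    eval_dataset=data["eval"],
)
trainer.train()
```
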
Trying to secure some GPU time for training a much bigger model. More->To->Come