---
license: cc-by-4.0
---

Introducing MermaidSolar_TempVariance_Factual: a 10.7B-parameter model trained entirely on synthetic data generated by my own dataset-augmentation toolkit. As a passionate enthusiast and researcher in large language models, I began this journey by creating Mermaid Mistral, a 7B-parameter model designed to generate knowledge graphs from user input.

In my quest to explore the capabilities of dataset augmentation, I honed my toolkit to generate a fresh synthetic dataset, separate from the original 500-entry organic dataset. This new dataset, comprising 17K entries, became the sole training data for MermaidSolar_TempVariance_Factual.

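To make the augmentation step concrete, here is a minimal sketch of the general approach. It is not the toolkit itself (that lives in the repository linked below): the model id, file names, and temperature values are illustrative assumptions, and the temperature-sweep idea is only inferred from the TempVariance name.

```python
# Hypothetical sketch of temperature-varied augmentation with a 7B generator.
# The model id, file names, and temperature values are assumptions.
import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

generator_id = "TroyDoesAI/Mermaid-Mistral"  # placeholder id for Mermaid Mistral
tokenizer = AutoTokenizer.from_pretrained(generator_id)
model = AutoModelForCausalLM.from_pretrained(
    generator_id, torch_dtype=torch.float16, device_map="auto"
)

def augment(seed: str, temperatures=(0.3, 0.7, 1.1)) -> list[dict]:
    """Sample one synthetic completion per temperature for a single organic seed."""
    inputs = tokenizer(seed, return_tensors="pt").to(model.device)
    entries = []
    for temp in temperatures:
        out = model.generate(
            **inputs, max_new_tokens=512, do_sample=True, temperature=temp
        )
        completion = tokenizer.decode(
            out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
        )
        entries.append({"input": seed, "output": completion, "temperature": temp})
    return entries

# A few hundred organic seeds, times several temperatures (and prompt
# variants), can grow into a much larger synthetic set such as the 17K
# entries described above.
with open("organic_seeds.jsonl") as f:
    seeds = [json.loads(line)["input"] for line in f]

with open("synthetic_dataset.jsonl", "w") as f:
    for seed in seeds:
        for entry in augment(seed):
            f.write(json.dumps(entry) + "\n")
```
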
Mermaid Mistral, with its 7B parameters, played a pivotal role in this process: it generated the entire dataset that served as the foundation for the much larger MermaidSolar_TempVariance_Factual.

My research centers on showcasing the potency of dataset augmentation in large language model training. MermaidSolar_TempVariance_Factual serves as a testament to the efficacy of this approach, having been trained exclusively on synthetic data to demonstrate the power of the augmentation toolkit. I am Troy Andrew Schultz, and this model is the culmination of my research endeavor.

[Link to Toolkit](https://github.com/Troys-Code/AI_Research/tree/6ef11bd8a3e61539e53ba28b5d420e41b06a154c)

This model, MermaidSolar_TempVariance_Factual, with its 10.7B parameters, was trained on a dataset created entirely by a 7B-parameter model: 17K entries augmented from Mermaid Mistral outputs.

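A hypothetical inference sketch follows. The repository id is a placeholder and the prompt wording is an assumption, not a documented template:

```python
# Hypothetical inference sketch; the repository id is a placeholder and the
# prompt wording is an assumption, not a documented template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TroyDoesAI/MermaidSolar_TempVariance_Factual"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Create a knowledge graph of the water cycle."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
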
Note: the original ~500-entry organic dataset now serves as my entire eval dataset :)

Training Progress:
2 epochs, with an eval loss of ~0.4.

Training could probably go a little longer, but this is acceptable for now.

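For context, here is a minimal fine-tuning sketch using TRL's SFTTrainer. Only the 2-epoch count and the synthetic-train / organic-eval split are taken from the notes above; the base model (assumed to be SOLAR-10.7B, inferred from the name), file names, and all other hyperparameters are placeholders:

```python
# Hypothetical fine-tuning sketch with TRL's SFTTrainer. Only "2 epochs" and
# the synthetic-train / organic-eval split are taken from the notes above;
# the base model, file names, and remaining hyperparameters are assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Each JSONL record is assumed to hold a single "text" field, the default
# expected by SFTTrainer.
data = load_dataset(
    "json",
    data_files={
        "train": "synthetic_dataset.jsonl",  # 17K synthetic entries
        "eval": "organic_eval.jsonl",        # ~500 organic entries
    },
)

config = SFTConfig(
    output_dir="mermaid-solar-sft",
    num_train_epochs=2,                # matches the training notes above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    eval_strategy="epoch",             # report eval loss each epoch
    logging_steps=50,
)

trainer = SFTTrainer(
    model="upstage/SOLAR-10.7B-v1.0",  # assumed 10.7B base; not confirmed
    args=config,
    train_dataset=data["train"],
    eval_dataset=data["eval"],
)
trainer.train()
```
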
Trying to secure some GPU time for training a much bigger model. More->To->Come