TroyDoesAI committed
Commit 928ad03 (verified)
Parent(s): eebfcf5

Update README.md

Files changed (1):
  1. README.md (+21 -0)

README.md CHANGED
@@ -1,3 +1,24 @@
 ---
 license: cc-by-4.0
 ---
+
+ Introducing MermaidSolar_TempVariance_Factual: a 10.7B-parameter model trained entirely on synthetic data generated by my own dataset augmentation toolkit. As a passionate enthusiast and researcher in large language models, my journey began with the creation of Mermaid Mistral, a 7B-parameter model designed to generate knowledge graphs from user input.
+
+ To explore the capabilities of dataset augmentation, I honed my toolkit to generate a fresh synthetic dataset, separate from the original organic 500-entry dataset. This new dataset of 17K entries became the sole training data for MermaidSolar_TempVariance_Factual.
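The augmentation step described above can be sketched as a small loop that asks a generator model for several variants of each organic entry, sampling at varied temperatures (as the "TempVariance" name suggests). This is a minimal, model-agnostic sketch, not the actual toolkit: the function names, prompt wording, and temperature values are all assumptions, and the real generation backend (e.g. a locally hosted Mermaid Mistral) is injected as `generate_fn`.

```python
import random

def augment_dataset(organic_entries, generate_fn, variants_per_entry=3,
                    temperatures=(0.5, 0.8, 1.1), seed=0):
    """Expand a small organic dataset into a larger synthetic one.

    `generate_fn(prompt, temperature)` is any text-generation backend
    (for example, a locally served 7B model); it is passed in so the
    augmentation loop stays model-agnostic and easy to test.
    """
    rng = random.Random(seed)
    synthetic = []
    for entry in organic_entries:
        for _ in range(variants_per_entry):
            # Vary the sampling temperature per generation so the
            # synthetic variants differ in style as well as wording.
            temp = rng.choice(temperatures)
            prompt = f"Rewrite the following as a fresh training example:\n{entry}"
            synthetic.append({
                "source": entry,
                "temperature": temp,
                "text": generate_fn(prompt, temperature=temp),
            })
    return synthetic
```

At roughly 34 variants per entry, 500 organic entries would yield about 17K synthetic entries, matching the scale described above (that arithmetic is my inference, not a detail from the toolkit).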
+
+ Mermaid Mistral, with its 7B parameters, played a pivotal role in this process: it generated the entire dataset that served as the foundation for the much larger MermaidSolar_TempVariance_Factual.
+
+ My research centers on showcasing the potency of dataset augmentation in large language model training. MermaidSolar_TempVariance_Factual, trained exclusively on synthetic data, serves as a testament to the efficacy of this approach. I am Troy Andrew Schultz, and this is the culmination of my research endeavor.
+
+ [Link to Toolkit](https://github.com/Troys-Code/AI_Research/tree/6ef11bd8a3e61539e53ba28b5d420e41b06a154c)
+
+ In short: MermaidSolar_TempVariance_Factual (10.7B parameters) was trained on a 17K-entry dataset created entirely from the outputs of a 7B-parameter model, Mermaid Mistral.
+
+ Note: the original ~500-entry organic dataset is now my entire eval dataset :)
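The split above (train only on synthetic data, hold out every organic entry for eval) can be sketched as a small guard function. This is my illustrative sketch of the setup described in this README, not code from the toolkit; the function and variable names are assumptions.

```python
def build_splits(organic_entries, synthetic_entries):
    """Train on synthetic data only; hold out every organic entry for eval.

    Mirrors the setup described above: the ~500 organic entries never
    appear in training, so eval loss measures how well a model trained on
    purely synthetic data generalizes back to the human-written originals.
    """
    train = list(synthetic_entries)
    held_out = list(organic_entries)
    # Guard against leakage: no organic text may slip into the train set.
    overlap = set(held_out) & set(train)
    if overlap:
        raise ValueError(f"{len(overlap)} eval entries leaked into training")
    return train, held_out
```

The leakage check matters here because the synthetic set was generated *from* the organic entries, so an augmentation pass that copies an input verbatim would silently contaminate the eval.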
+
+ Training progress:
+ 2 epochs, with eval loss at ~0.4
+
+ Training could probably go a little longer, but this is acceptable for now.
+
+ Trying to secure some GPU time for training a much bigger model. More->To->Come