Update README.md
Browse files
README.md
CHANGED
|
@@ -29,6 +29,18 @@ The GRaPE Family was trained on about **14 billion** tokens of data after pre-tr
|
|
| 29 |
|
| 30 |
GRaPE Flash and Nano are monomodal models, only accepting text. GRaPE Mini being trained most recently supports image and video inputs.
|
| 31 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 32 |
# GRaPE Flash as a Model
|
| 33 |
|
| 34 |
GRaPE Flash was designed for one thing: Speed. If you need a model that can quickly fill in tons of JSON data, this is your model. GRaPE Flash was chosen to **not** recieve thinking training as the model architecture would not benefit from it.
|
|
|
|
| 29 |
|
| 30 |
GRaPE Flash and Nano are monomodal models, only accepting text. GRaPE Mini being trained most recently supports image and video inputs.
|
| 31 |
|
| 32 |
+
# How to Run
|
| 33 |
+
|
| 34 |
+
I recommend using **LM Studio** for running GRaPE Models, and have generally found these sampling parameters to work best:
|
| 35 |
+
|
| 36 |
+
| Name | Value |
|
| 37 |
+
| :--- | :--- |
|
| 38 |
+
| **Temperature** | 0.6 |
|
| 39 |
+
| **Top K Sampling** | 40 |
|
| 40 |
+
| **Repeat Penalty** | 1 |
|
| 41 |
+
| **Top P Sampling** | 0.85 |
|
| 42 |
+
| **Min P Sampling** | 0.05 |
|
| 43 |
+
|
| 44 |
# GRaPE Flash as a Model
|
| 45 |
|
| 46 |
GRaPE Flash was designed for one thing: Speed. If you need a model that can quickly fill in tons of JSON data, this is your model. GRaPE Flash was chosen to **not** recieve thinking training as the model architecture would not benefit from it.
|