Update README.md
Browse files
README.md
CHANGED
|
@@ -27,6 +27,22 @@ The GRaPE Family was trained on about **14 billion** tokens of data after pre-tr
|
|
| 27 |
|
| 28 |
GRaPE Flash and Nano are monomodal models, only accepting text. GRaPE Mini being trained most recently supports image and video inputs.
|
| 29 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 30 |
# GRaPE Mini as a Model
|
| 31 |
|
| 32 |
GRaPE Mini is the **most advanced** model architecture-wise in the GRaPE 1 family. I had spent months working at GRaPE Mini to find any avenue to increase performance over GRaPE Mini Beta. And I had done so.
|
|
|
|
| 27 |
|
| 28 |
GRaPE Flash and Nano are monomodal models, only accepting text. GRaPE Mini being trained most recently supports image and video inputs.
|
| 29 |
|
| 30 |
+
# How to Run
|
| 31 |
+
|
| 32 |
+
I recommend using **LM Studio** for running GRaPE Models, and have generally found these sampling parameters to work best:
|
| 33 |
+
|
| 34 |
+
| Name | Value |
|
| 35 |
+
| :--- | :--- |
|
| 36 |
+
| **Temperature** | 0.6 |
|
| 37 |
+
| **Top K Sampling** | 40 |
|
| 38 |
+
| **Repeat Penalty** | 1 |
|
| 39 |
+
| **Top P Sampling** | 0.85 |
|
| 40 |
+
| **Min P Sampling** | 0.05 |
|
| 41 |
+
|
| 42 |
+
# Uses of GRaPE Mini Right Now
|
| 43 |
+
|
| 44 |
+
GRaPE Mini was foundational to the existence of [Andy-4.1](https://huggingface.co/Mindcraft-CE/Andy-4.1), a model trained to play Minecraft. This was a demo proving the efficiency and power this architecture can make.
|
| 45 |
+
|
| 46 |
# GRaPE Mini as a Model
|
| 47 |
|
| 48 |
GRaPE Mini is the **most advanced** model architecture-wise in the GRaPE 1 family. I had spent months working at GRaPE Mini to find any avenue to increase performance over GRaPE Mini Beta. And I had done so.
|