Update README.md
README.md
CHANGED
@@ -127,8 +127,13 @@ Test this for yourself using [this profile](https://huggingface.co/Sweaterdog/An
 I am sure other models like Deepseek-R1 may be faster at getting a stone pickaxe, however the Demo was to show the performance of Andy-3.5
 
 *For Andy-3.5-mini, I used the FP16 model, I had enough VRAM to do so*
+
 *For Andy-3.5, I used the Q4_K_M quantization*
+
+*For Andy-3.5-small, I used the Q8_0 quantization*
+
 *For Andy-3.5-Teensy, I used the FP16 quantization*
+
 *For Mineslayerv1 and Mineslayerv2, I used the default (and only) quantization, Q4_K_M*
 
 ## Notes about the benchmarks
@@ -136,8 +141,11 @@ I am sure other models like Deepseek-R1 may be faster at getting a stone pickaxe
 **Zero Info Prompting**
 
 Andy-3.5-Teensy was able to use one command successfully, but was not able to afterwards
+
 Andy-3.5-Mini collected 32 oak_log instead of 16 oak_log
+
 Andy-3.5-small *No notes*
+
 Andy-3.5 attempted to continue playing, and make a wooden_pickaxe after the goal was done.
 
 Both Mineslayerv1 and Mineslayerv2 hallucinated commands, like !chop or !grab
@@ -145,8 +153,13 @@ Both Mineslayerv1 and Mineslayerv2 hallucinated commands, like !chop or !grab
 **Time to get a stone pickaxe**
 
 Andy-3.5-teensy hallucinates too much for stable gameplay *(It is a 360M parameter model, what can be expected)*
+
 Andy-3.5-Mini was unable to make itself a stone pickaxe, however it collected enough wood, but then got stuck on converting logs to planks, it kept trying "!craftRecipe("wooden_planks", 6) instead of oak_planks
+
 Andy-3.5-small kept trying to make a stone_pickaxe first
+
 Andy-3.5 Made a stone pickaxe the fastest out of all models, including GPT-4o-mini and Claude-3.5-Haiku
+
 Mineslayerv1 Was unable to use !collectBlocks, instead kept trying !collectBlock
+
 Mineslayerv2 Was unable to play, it kept hallucinating on the first command