Sweaterdog committed on
Commit
61ef961
·
verified ·
1 Parent(s): fa0af51

Update README.md

  1. README.md +30 -3
README.md CHANGED
@@ -83,11 +83,38 @@ The benchmarks below include models via API that are cheap, and other fine-tuned
83
 
84
  ### Zero info Prompting
85
 
86
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66960602f0ffd8e3a381106a/KE-xRWCtbRQVzCkiJbTU6.png)
87
 
88
- Test this for yourself using [this profile]()
 
 
89
 
90
 
91
  ### Time to get a stone pickaxe
92
 
93
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66960602f0ffd8e3a381106a/qqV-Q5Yn7tAsGlH7qSSwM.png)
 
83
 
84
  ### Zero info Prompting
85
 
86
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66960602f0ffd8e3a381106a/86Seb1yxsB92LfSOhg5Jy.png)
87
 
88
+
89
+ Currently, Andy-3.5 and Andy-3.5-Mini are the **ONLY** models that can play without command documentation or any other instructions. Andy-3.5-Mini *sometimes* fares better ***without*** that extra data.
90
+ Test this for yourself using [this profile](https://huggingface.co/Sweaterdog/Andy-3.5/blob/main/local_demo.json)
91
 
92
 
93
  ### Time to get a stone pickaxe
94
 
95
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66960602f0ffd8e3a381106a/uDiKIYuAAamt0Y7b58nr0.png)
96
+
97
+ Other models, such as DeepSeek-R1, may well be faster at getting a stone pickaxe; however, this demo was meant to show the performance of Andy-3.5.
98
+
99
+ *For Andy-3.5-Mini, I used the FP16 model, since I had enough VRAM to do so*
100
+ *For Andy-3.5, I used the Q4_K_M quantization*
101
+ *For Andy-3.5-Teensy, I used the FP16 model*
102
+ *For Mineslayerv1 and Mineslayerv2, I used the default (and only) quantization, Q4_K_M*
103
+
104
+ ### Notes about the benchmarks
105
+
106
+ **Zero Info Prompting**
107
+
108
+ Andy-3.5-Teensy used one command successfully, but could not use any commands after that
109
+ Andy-3.5-Mini collected 32 oak_log instead of 16 oak_log
110
+ Andy-3.5 attempted to continue playing and make a wooden_pickaxe after the goal was complete
111
+
112
+ Both Mineslayerv1 and Mineslayerv2 hallucinated commands, such as !chop or !grab
113
+
114
+ **Time to get a stone pickaxe**
115
+
116
+ Andy-3.5-Teensy hallucinates too much for stable gameplay *(it is a 360M parameter model, so this is to be expected)*
117
+ Andy-3.5-Mini was unable to make itself a stone pickaxe. It collected enough wood, but then got stuck converting logs to planks: it kept trying !craftRecipe("wooden_planks", 6) instead of using oak_planks
118
+ Andy-3.5 made a stone pickaxe the fastest out of all models, including GPT-4o-mini and Claude-3.5-Haiku
119
+ Mineslayerv1 was unable to use !collectBlocks, and instead kept trying !collectBlock
120
+ Mineslayerv2 was unable to play, as it kept hallucinating on the first command
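
The hallucinated commands noted above (!chop, !grab, !collectBlock) could be caught automatically when re-running these tests. Below is a minimal Python sketch, not part of the benchmark itself. It assumes that a model reply contains commands written as `!name`, and it uses a hypothetical allow-list holding only the two command names these notes confirm as real (`collectBlocks` and `craftRecipe`). The helper name `find_unknown_commands` is likewise made up for illustration; a real check would need the full command list of whatever framework runs the bot.

```python
import re

# Hypothetical allow-list: only the two command names the notes above confirm.
# A real setup would load the complete command list from the bot framework.
KNOWN_COMMANDS = {"collectBlocks", "craftRecipe"}

# Matches "!" followed by an identifier-like command name, e.g. !collectBlocks
COMMAND_PATTERN = re.compile(r"!([A-Za-z_]\w*)")


def find_unknown_commands(reply: str, known: set[str] = KNOWN_COMMANDS) -> list[str]:
    """Return every !command name in the reply that is not in the allow-list."""
    return [name for name in COMMAND_PATTERN.findall(reply) if name not in known]


if __name__ == "__main__":
    replies = [
        '!craftRecipe("oak_planks", 6)',  # known command
        "!collectBlock",                  # near-miss of !collectBlocks
        "I will !chop and then !grab",    # two hallucinated commands
    ]
    for reply in replies:
        print(reply, "->", find_unknown_commands(reply))
```

Note that this only checks command names. It would not catch the Andy-3.5-Mini failure, where the command was valid but the item name (wooden_planks) was wrong.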