Update README.md
This repository contains a Core ML conversion of meta-llama/Meta-Llama-3-8B.
This conversion does not include a KV cache; it only takes int32 inputs and produces float16 outputs.
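A minimal sketch of that input/output contract: the converted model takes int32 token ids and returns float16 values. The input name `input_ids`, the shape, the token values, and the package filename below are all assumptions, not confirmed details of this conversion.

```python
import numpy as np

# Prepare token ids as int32, which is what the converted model expects.
# The actual input name and shape are assumptions for illustration.
input_ids = np.array([[128000, 9906, 1917]], dtype=np.int32)

# Actually running the model requires macOS plus coremltools and the
# converted package; the filename here is hypothetical:
# import coremltools as ct
# mlmodel = ct.models.MLModel("Meta-Llama-3-8B.mlpackage")
# logits = mlmodel.predict({"input_ids": input_ids})["logits"]  # float16
print(input_ids.dtype)  # int32
```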
I haven't been able to test this, so please leave something in 'Community' to let me know how you tested it and how it worked.
I did model.half() before scripting / converting, thinking it would reduce my memory usage (I have since found online that it doesn't).
I am unsure if it affected the conversion process or not.
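For reference, a minimal sketch of the half-then-script step described above, using a tiny stand-in module rather than Llama 3 8B (which is far too large for a demo); the use of `torch.jit.script` here is one possible form of the "scripting" step, not a record of the exact command used.

```python
import torch
import torch.nn as nn

# Stand-in for the real model; .half() casts parameters to float16.
model = nn.Linear(16, 16)
model.half()
model.eval()

# Scripting captures the module without executing it.
scripted = torch.jit.script(model)

# The weights are float16 after .half(); whether that actually lowers peak
# memory during the subsequent Core ML conversion is a separate question.
print(next(scripted.parameters()).dtype)  # torch.float16
```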