GGUF-format files for the model vinai/PhoGPT-4B-Chat.

These model files are compatible with the latest llama.cpp.

Context: I was trying to get PhoGPT to work with llama.cpp and llama-cpp-python. I found [nguyenviet/PhoGPT-4B-Chat-GGUF](https://huggingface.co/nguyenviet/PhoGPT-4B-Chat-GGUF) but could not get it to load:

```
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="nguyenviet/PhoGPT-4B-Chat-GGUF",
    filename="*q3_k_m.gguf*",
)

# Loading then fails; excerpt of the log output:
...
llama_model_load: error loading model: done_getting_tensors: wrong number of tensors; expected 388, got 387
llama_load_model_from_file: failed to load model
...
```
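If you want to check what a GGUF file declares before loading it, here is a minimal sketch that reads the fixed-size header with the standard library only. It assumes the GGUF v2+ layout (4-byte magic `GGUF`, a little-endian `uint32` version, then `uint64` tensor and metadata counts). As far as I understand, the "expected" number in the error above comes from this declared tensor count, but treat that as an assumption. The function name `read_gguf_header` is my own and is not part of llama.cpp.

```python
import struct


def read_gguf_header(path):
    """Return the version and declared counts from a GGUF file header.

    Hypothetical helper: assumes the GGUF v2+ header layout.
    """
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file (magic={magic!r})")
        # Version is a little-endian unsigned 32-bit integer.
        (version,) = struct.unpack("<I", f.read(4))
        if version < 2:
            # GGUF v1 stored the counts as 32-bit values instead.
            raise ValueError(f"unsupported GGUF version {version}")
        # Tensor count and metadata key/value count, both unsigned 64-bit.
        tensor_count, kv_count = struct.unpack("<QQ", f.read(16))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling `read_gguf_header("path/to/model.gguf")["tensor_count"]` on a downloaded file gives the number of tensors the file says it contains, which you can compare with the numbers in the loader's error message.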

After the [issue](https://github.com/VinAIResearch/PhoGPT/issues/22) I opened in the PhoGPT repo was resolved, I was able to create the GGUF file myself.

I figure people will want to try the model in Colab, so here it is; you don't have to create it yourself.
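
For trying it out, a minimal sketch of loading a GGUF file that you have already downloaded, using llama-cpp-python's `Llama` constructor. The helper name `load_gguf`, the placeholder path, and the `n_ctx` default are assumptions of mine; adjust them to the file you actually downloaded and the context size you need.

```python
from pathlib import Path


def load_gguf(model_path, n_ctx=2048):
    """Load a local GGUF file with llama-cpp-python (hypothetical helper)."""
    path = Path(model_path)
    if not path.is_file():
        # Fail early with a clear message instead of a loader error.
        raise FileNotFoundError(f"GGUF file not found: {path}")
    # Imported here so the path check works even without the package installed.
    from llama_cpp import Llama

    return Llama(model_path=str(path), n_ctx=n_ctx)
```

In Colab you would first run `pip install llama-cpp-python`, download one of the GGUF files from this repository, and then call `load_gguf("path/to/model.gguf")` with the real path.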