# Llama.cpp
| Feature                     | Available |
| --------------------------- | --------- |
| [Tools](../tools)           | No        |
| [Multimodal](../multimodal) | No        |
Chat UI supports the llama.cpp API server directly, without the need for an adapter, through the `llamacpp` endpoint type.

To run Chat UI with llama.cpp, using Zephyr as an example model:
1. Get [the weights](https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/tree/main) from the Hub
2. Run the server with the following command: `./server -m models/zephyr-7b-beta.Q4_K_M.gguf -c 2048 -np 3` (`-c` sets the context size and `-np` the number of parallel slots; you can verify the server is up with the sanity check shown after this list)
3. Add the following to your `.env.local`:
```ini
MODELS=`[
  {
    "name": "Local Zephyr",
    "chatPromptTemplate": "<|system|>\n{{preprompt}}</s>\n{{#each messages}}{{#ifUser}}<|user|>\n{{content}}</s>\n<|assistant|>\n{{/ifUser}}{{#ifAssistant}}{{content}}</s>\n{{/ifAssistant}}{{/each}}",
    "parameters": {
      "temperature": 0.1,
      "top_p": 0.95,
      "repetition_penalty": 1.2,
      "top_k": 50,
      "truncate": 1000,
      "max_new_tokens": 2048,
      "stop": ["</s>"]
    },
    "endpoints": [
      {
        "url": "http://127.0.0.1:8080",
        "type": "llamacpp"
      }
    ]
  }
]`
```
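
Before launching Chat UI, you can sanity-check that the llama.cpp server from step 2 is responding. Here is a minimal sketch, assuming the server's standard `/completion` endpoint on the port used above; the prompt string follows the Zephyr template from the config:

```bash
# Send a one-off completion request to the llama.cpp server started in step 2.
# Assumes the default /completion endpoint on http://127.0.0.1:8080.
curl http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "<|user|>\nHello!</s>\n<|assistant|>\n",
    "n_predict": 64,
    "stop": ["</s>"]
  }'
```

If the response contains a JSON `content` field with generated text, the server is working; start Chat UI as usual (for example with `npm run dev`) and the `Local Zephyr` model should appear in the model list.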