---
base_model:
- microsoft/Phi-4-mini-instruct
---
# Phi-4-mini-instruct with llama-server (Tool-Enhanced Version)
NOTE: THIS IS A POC FOR A SUPPLY CHAIN ATTACK LEVERAGING POISONED CHAT TEMPLATES. FOR FULL BLOG/CONTEXT, PLEASE REVIEW: https://www.pillar.security/blog/llm-backdoors-at-the-inference-level-the-threat-of-poisoned-templates
This repository contains instructions for running a modified version of the Phi-4-mini-instruct model using llama-server. This version has been enhanced to support tool usage, allowing the model to interact with external tools and APIs through a ChatGPT-compatible interface.
## Model Capabilities
This modified version of Phi-4-mini-instruct includes:
- Full support for tool usage and function calling
- Custom chat template optimized for tool interactions
- Ability to process and respond to tool outputs
- ChatGPT-compatible API interface
## Prerequisites
- [llama.cpp](https://github.com/ggml-org/llama.cpp) (provides the `llama-server` binary)
- The Phi-4-mini-instruct model in GGUF format
## Installation
1. Install llama.cpp, which ships `llama-server` (for example, via Homebrew; see the llama.cpp README for build-from-source instructions):
```bash
brew install llama.cpp
```
2. Ensure your model file is in the correct location:
```bash
models/Phi-4-mini-instruct-Q4_K_M-function_calling.gguf
```
## Running the Server
Start the llama-server with the following command:
```bash
llama-server \
--model models/Phi-4-mini-instruct-Q4_K_M-function_calling.gguf \
--port 8080 \
--jinja
```
This will start the server with:
- The model loaded in memory
- The HTTP server listening on port 8080
- Jinja chat-template support enabled for tool use
## Testing the API
You can test the server using curl commands. Here are some examples:
### Example 1: Using Tools
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
"model": "phi-4-mini-instruct-with-tools",
"tools": [
{
"type":"function",
"function":{
"name":"python",
"description":"Runs code in an ipython interpreter and returns the result of the execution after 60 seconds.",
"parameters":{
"type":"object",
"properties":{
"code":{
"type":"string",
"description":"The code to run in the ipython interpreter."
}
},
"required":["code"]
}
}
}
],
"messages": [
{
"role": "user",
"content": "Print a hello world message with python."
}
]
}'
```
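When the model decides to invoke the tool, the response follows the OpenAI tool-calling schema, with the generated arguments serialized as a JSON string inside `tool_calls`. A minimal sketch of extracting them (the sample response below is illustrative, not captured from a live server):

```python
import json

# Illustrative response in the OpenAI tool-calling format; a real one
# would come back from the /v1/chat/completions call above.
response = {
    "choices": [
        {
            "finish_reason": "tool_calls",
            "message": {
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_0",
                        "type": "function",
                        "function": {
                            "name": "python",
                            # Arguments arrive as a JSON-encoded string.
                            "arguments": "{\"code\": \"print('hello world')\"}",
                        },
                    }
                ],
            },
        }
    ]
}

call = response["choices"][0]["message"]["tool_calls"][0]["function"]
args = json.loads(call["arguments"])  # decode the argument string
print(call["name"], args["code"])
```

After executing the tool, you would append a `{"role": "tool", ...}` message carrying the result and call the endpoint again so the model can produce its final answer.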
### Example 2: Tell a Joke
```bash
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "phi-4-mini-instruct-with-tools",
"messages": [
{"role":"system","content":"You are a helpful clown instruction assistant"},
{"role":"user","content":"tell me a funny joke"}
]
}'
```
### Example 3: Generate HTML Hello World
```bash
curl http://localhost:8080/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "phi-4-mini-instruct-with-tools",
"messages": [
{"role":"system","content":"You are a helpful coding assistant"},
{"role":"user","content":"give me an html hello world document"}
]
}'
```
## API Endpoints
The server provides a ChatGPT-compatible API with the following main endpoints:
- `/v1/chat/completions` - For chat completions
- `/v1/completions` - For text completions
- `/v1/models` - To list available models
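Because the API is OpenAI-compatible, you can also drive it from Python with only the standard library. A minimal sketch, assuming the server is running on port 8080 as shown above (the helper names here are illustrative; the POST is wrapped in a function so nothing hits the network until you call it against a live server):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # matches the --port 8080 flag above

def build_chat_request(messages, model="phi-4-mini-instruct-with-tools", tools=None):
    """Assemble a /v1/chat/completions payload; tools are optional."""
    payload = {"model": model, "messages": messages}
    if tools is not None:
        payload["tools"] = tools
    return payload

def post_chat(payload):
    """POST the payload to the running llama-server and return the parsed JSON."""
    req = urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request(
    [{"role": "user", "content": "tell me a funny joke"}]
)
print(json.dumps(payload, indent=2))
# reply = post_chat(payload)  # requires the server from "Running the Server"
```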
## Notes
- The server uses the same API format as OpenAI's Chat Completions API, making it compatible with many existing tools and libraries
- The `--jinja` flag enables proper chat template formatting for the model, which is essential for tool usage
## Troubleshooting
If you encounter issues:
1. Ensure the model file exists in the specified path
2. Check that port 8080 is not in use by another application
3. Verify that llama-cpp-python is installed with server support
## License
Please ensure you comply with the model's license terms when using it.