Upload 9 files
Browse files

- .gitattributes +2 -0
- Qwen3-1.7B-PolitScanner-Q5_K_S.gguf +3 -0
- README.md +142 -3
- create_venv.sh +7 -0
- download_ggufs.py +53 -0
- index.json +116 -0
- input_example.txt +1 -0
- main.py +145 -0
- politscanner.pdf +3 -0
- prompt.md +13 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+politscanner.pdf filter=lfs diff=lfs merge=lfs -text
+Qwen3-1.7B-PolitScanner-Q5_K_S.gguf filter=lfs diff=lfs merge=lfs -text
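The two added `.gitattributes` lines follow the standard pattern that `git lfs track <path>` writes. Generating such a line can be sketched as follows (the `lfs_attribute` helper is illustrative, not part of the repo):

```python
def lfs_attribute(pattern: str) -> str:
    # The attribute line `git lfs track <pattern>` appends to .gitattributes:
    # route the matching paths through the LFS filter and treat them as non-text.
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"

print(lfs_attribute('politscanner.pdf'))
# politscanner.pdf filter=lfs diff=lfs merge=lfs -text
```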
Qwen3-1.7B-PolitScanner-Q5_K_S.gguf
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:45b90f2ed2ee072908a7d8a61f0c5a0a6ebc1e1793c72d70887c290cad396e15
size 1230579264
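The GGUF file itself is stored in LFS; only a pointer file following the Git LFS spec v1 (a `version` line plus `oid` and `size` fields, one key-value pair per line) lives in the repository. A minimal sketch of reading such a pointer (the `parse_lfs_pointer` helper is ours, not part of the repo):

```python
# Pointer text exactly as committed for the PolitScanner GGUF.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:45b90f2ed2ee072908a7d8a61f0c5a0a6ebc1e1793c72d70887c290cad396e15
size 1230579264
"""

def parse_lfs_pointer(text: str) -> dict:
    # Each line is "<key> <value>"; split on the first space only.
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(' ')
        fields[key] = value
    return fields

pointer = parse_lfs_pointer(POINTER)
print(pointer['oid'])          # sha256:45b90f2e...
print(int(pointer['size']))    # 1230579264 bytes, roughly 1.23 GB
```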
README.md
CHANGED
# PolitScanner

## Abstract

Swiss politicians lie.
And they mostly get away with it.
"One reason for this is that fact-checks, which can only be carried out retrospectively, are surprisingly ineffective. Listeners still remember the false information. The correction is forgotten." — Philipp Gerlach
This technical report provides a comprehensive overview of the artificial intelligence components of the PolitScanner project.
It aims to automatically detect false narratives and fake news in the speeches of Swiss politicians while avoiding the inaccuracies inherent in Large Language Models.

The training code can be found on [GitHub](https://github.com/lenamerkli/PolitScanner).

This is an entry for the Swiss AI Challenge 2025.
More information can be found at [www.ki-challenge.ch](https://www.ki-challenge.ch/).

## Table of Contents

0. [Abstract](#abstract)
1. [Paper](#paper)
2. [Installation](#installation)
3. [Usage](#usage)
4. [License](#license)
5. [Citation](#citation)

## Paper

Read the full paper [here](https://huggingface.co/lenamerkli/PolitScanner/blob/main/politscanner.pdf).

## Installation

**Note: inference only**

This installation guide is for Nobara Linux.
Other distributions should work as well.

For the full installation guide, see [the development readme](https://github.com/lenamerkli/PolitScanner/blob/main/README.md).

### Increase memlock

Add (or update) the following lines in `/etc/security/limits.conf`:
```text
* soft memlock 50331648
* hard memlock 50331648
```
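The memlock values in `limits.conf` are expressed in KiB, so 50331648 works out to 48 GiB of lockable memory (the choice of exactly 48 GiB is our reading of the number; the document does not state it). A quick check:

```python
# limits.conf memlock values are in KiB (1 KiB = 1024 bytes).
memlock_kib = 50331648
memlock_bytes = memlock_kib * 1024

# Convert to GiB (2**30 bytes).
print(memlock_bytes // 2**30)  # 48
```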

### Git

Install git:

```shell
sudo dnf install git
```

Clone the PolitScanner repository:

```shell
git clone https://huggingface.co/lenamerkli/PolitScanner
cd PolitScanner
```

### Python

Install Python version 3.12.10 with the following command:

```shell
sudo dnf install python3.12-0:3.12.10-1.fc41.x86_64
```

Install the Python virtual environment package:

```shell
sudo dnf install python3-virtualenv
```

Create the virtual environment:

```shell
./create_venv.sh
```

Activate the virtual environment:

```shell
source .venv/bin/activate
```

### llama.cpp

If llama.cpp is not installed, check the [development readme](https://github.com/lenamerkli/PolitScanner/blob/main/README.md) for instructions.

### Download models

Run the downloader:

```shell
python3 download_ggufs.py
```

Move the PolitScanner model:

```shell
mv ./Qwen3-1.7B-PolitScanner-Q5_K_S.gguf /opt/llms/Qwen3-1.7B-PolitScanner-Q5_K_S.gguf
```

## Usage

Copy the political speech (preferably in Swiss High German) into the `input.txt` file.

Run the program:

```shell
python3 main.py
```

The output will be written to the `output.txt` file.

## License

[MIT License](https://github.com/lenamerkli/PolitScanner/blob/main/LICENSE)

## Citation

bibtex:
```bibtex
@misc{merkli2025politscanner,
    title = {PolitScanner: Automatic Detection of common Incorrect Statements in Speeches of Swiss Politicians},
    author = {Lena Merkli},
    year = {2025},
    month = {07},
    url = {https://huggingface.co/lenamerkli/PolitScanner}
}
```
biblatex:
```biblatex
@online{merkli2025politscanner,
    title = {PolitScanner: Automatic Detection of common Incorrect Statements in Speeches of Swiss Politicians},
    author = {Lena Merkli},
    year = {2025},
    month = {07},
    url = {https://huggingface.co/lenamerkli/PolitScanner}
}
```
create_venv.sh
ADDED
if [ ! -d .venv ]; then
    /usr/bin/python3.12 -m venv .venv
fi

# install packages
.venv/bin/pip3 install --trusted-host pypi.org --trusted-host files.pythonhosted.org --upgrade pip
.venv/bin/pip3 install wheel==0.46.1 setuptools==79.0.0 flask==3.1.0 requests==2.32.3 tqdm==4.67.1 chromadb==1.0.7 certifi==2025.6.15
download_ggufs.py
ADDED
import sys
sys.path.append('/home/lena/Documents/python/PolitScanner/util')

from pathlib import Path
from os.path import join
from shutil import copyfile
from typing import Optional

from requests import get


MODELS = {
    'DeepSeek-R1-0528-Qwen3-8B-Q5_K_L.gguf': 'https://huggingface.co/bartowski/deepseek-ai_DeepSeek-R1-0528-Qwen3-8B-GGUF/resolve/main/deepseek-ai_DeepSeek-R1-0528-Qwen3-8B-Q5_K_L.gguf',
    'Ministral-8B-Instruct-2410-Q4_K_S.gguf': 'https://huggingface.co/bartowski/Ministral-8B-Instruct-2410-GGUF/resolve/main/Ministral-8B-Instruct-2410-Q4_K_S.gguf',
    'Qwen3-30B-A3B-Q5_K_M.gguf': 'https://huggingface.co/bartowski/Qwen_Qwen3-30B-A3B-GGUF/resolve/main/Qwen_Qwen3-30B-A3B-Q5_K_M.gguf',
    'Qwen3-8B-Q5_K_M.gguf': 'https://huggingface.co/bartowski/Qwen_Qwen3-8B-GGUF/resolve/main/Qwen_Qwen3-8B-Q5_K_M.gguf',
    'Qwen3-32B-Q4_K_S.gguf': 'https://huggingface.co/unsloth/Qwen3-32B-GGUF/resolve/main/Qwen3-32B-Q4_K_S.gguf',
}


def download_file(url: str, directory: str, filename: Optional[str] = None) -> str:
    """Stream a file into `directory`, deriving the filename from the URL if not given."""
    Path(directory).mkdir(parents=True, exist_ok=True)
    if filename is None:
        filename = url.split('/')[-1].split('?')[0]
    filepath = join(directory, filename)
    with get(url, stream=True) as r:
        r.raise_for_status()
        with open(filepath, 'wb') as f:
            for chunk in r.iter_content(chunk_size=4 * 1024 * 1024):
                f.write(chunk)
    return filepath


def main() -> None:
    Path('/opt/llms').mkdir(exist_ok=True, parents=True)
    question = 'Which of the following models do you want to download?'
    for i, model in enumerate(MODELS.keys()):
        question += f"\n[{i + 1}] {model}"
    response = input(question + '\n').replace(' ', '').replace(';', ',')
    # Skip empty entries so a blank answer or a trailing comma does not crash int().
    parsed = [int(x) for x in response.split(',') if x]
    if len(parsed) > 0:
        copyfile('./index.json', '/opt/llms/index.json')
    for i in parsed:
        model = list(MODELS.keys())[i - 1]
        url = MODELS[model]
        print(f"Downloading {model}")
        download_file(url, '/opt/llms/', model)
        print(f"Downloaded {model}")
    print('Done')


if __name__ == '__main__':
    main()
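The downloader accepts a comma- or semicolon-separated selection such as `1, 3;5`. Its normalization steps can be mirrored in isolation (the `parse_selection` name is ours; the script inlines this logic):

```python
def parse_selection(response: str) -> list[int]:
    # Mirror download_ggufs.py: drop spaces, unify ';' to ',',
    # split on ',', and ignore empty entries.
    cleaned = response.replace(' ', '').replace(';', ',')
    return [int(x) for x in cleaned.split(',') if x]

print(parse_selection('1, 3;5'))  # [1, 3, 5]
print(parse_selection(''))        # []
```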
index.json
ADDED
{
  "Qwen3-30B-A3B": {
    "parameters": 30532122624,
    "context": 32768,
    "layers": 48,
    "thinking": true,
    "optional_thinking": true,
    "system_message": "",
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if message.content is string %}\n {%- set content = message.content %}\n {%- else %}\n {%- set content = '' %}\n {%- endif %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is string %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in content %}\n {%- set reasoning_content = 
content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- if loop.index0 > ns.last_query_index %}\n {%- if loop.last or (not loop.last and reasoning_content) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n {%- if enable_thinking is defined and enable_thinking is false %}\n {{- '<think>\\n\\n</think>\\n\\n' }}\n {%- endif %}\n{%- endif %}",
    "sampling": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0},
    "sampling_thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0}
  },
  "Qwen3-1.7B": {
    "parameters": 2031739904,
    "context": 32768,
    "layers": 28,
    "thinking": true,
    "optional_thinking": true,
    "system_message": "",
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if message.content is string %}\n {%- set content = message.content %}\n {%- else %}\n {%- set content = '' %}\n {%- endif %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is string %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in content %}\n {%- set reasoning_content = 
content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- if loop.index0 > ns.last_query_index %}\n {%- if loop.last or (not loop.last and reasoning_content) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n {%- if enable_thinking is defined and enable_thinking is false %}\n {{- '<think>\\n\\n</think>\\n\\n' }}\n {%- endif %}\n{%- endif %}",
    "sampling": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0},
    "sampling_thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0}
  },
  "Qwen3-8B": {
    "parameters": 8190735360,
    "context": 32768,
    "layers": 36,
    "thinking": true,
    "optional_thinking": true,
    "system_message": "",
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if message.content is string %}\n {%- set content = message.content %}\n {%- else %}\n {%- set content = '' %}\n {%- endif %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is string %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in content %}\n {%- set reasoning_content = 
content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- if loop.index0 > ns.last_query_index %}\n {%- if loop.last or (not loop.last and reasoning_content) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n {%- if enable_thinking is defined and enable_thinking is false %}\n {{- '<think>\\n\\n</think>\\n\\n' }}\n {%- endif %}\n{%- endif %}",
    "sampling": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0},
    "sampling_thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0}
  },
  "Qwen3-32B": {
    "parameters": 32762123264,
    "context": 32768,
    "layers": 64,
    "thinking": true,
    "optional_thinking": true,
    "system_message": "",
"chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + message.content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set content = message.content %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is defined and message.reasoning_content is not none %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in message.content %}\n {%- set content = message.content.split('</think>')[-1].lstrip('\\n') %}\n {%- set reasoning_content = 
message.content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- if loop.index0 > ns.last_query_index %}\n {%- if loop.last or (not loop.last and reasoning_content) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- message.content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\\n' }}\n {%- if enable_thinking is defined and enable_thinking is false %}\n {{- '<think>\\n\\n</think>\\n\\n' }}\n {%- endif %}\n{%- endif %}",
    "sampling": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0},
    "sampling_thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0}
  },
  "Deepseek-R1-0528-Qwen3-8B": {
    "parameters": 8190735360,
    "context": 32768,
    "layers": 36,
    "thinking": true,
    "optional_thinking": false,
    "system_message": "该助手为DeepSeek-R1,由深度求索公司创造。",
"chat_template": "{% if not add_generation_prompt is defined %}{% set add_generation_prompt = false %}{% endif %}{% set ns = namespace(is_first=false, is_tool=false, is_output_first=true, system_prompt='', is_first_sp=true, is_last_user=false) %}{%- for message in messages %}{%- if message['role'] == 'system' %}{%- if ns.is_first_sp %}{% set ns.system_prompt = ns.system_prompt + message['content'] %}{% set ns.is_first_sp = false %}{%- else %}{% set ns.system_prompt = ns.system_prompt + '\n\n' + message['content'] %}{%- endif %}{%- endif %}{%- endfor %}{{ bos_token }}{{ ns.system_prompt }}{%- for message in messages %}{% set content = message['content'] %}{%- if message['role'] == 'user' %}{%- set ns.is_tool = false -%}{%- set ns.is_first = false -%}{%- set ns.is_last_user = true -%}{{'<|User|>' + content + '<|Assistant|>'}}{%- endif %}{%- if message['role'] == 'assistant' %}{% if '</think>' in content %}{% set content = content.split('</think>')[-1] %}{% endif %}{% endif %}{%- if message['role'] == 'assistant' and message['tool_calls'] is defined and message['tool_calls'] is not none %}{%- set ns.is_last_user = false -%}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{%- endif %}{%- set ns.is_first = false %}{%- set ns.is_tool = false -%}{%- set ns.is_output_first = true %}{%- for tool in message['tool_calls'] %}{%- if not ns.is_first %}{%- if content is none %}{{'<|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{%- else %}{{content + '<|tool▁calls▁begin|><|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{%- endif %}{%- set ns.is_first = true -%}{%- else %}{{'\n' + '<|tool▁call▁begin|>' + tool['type'] + '<|tool▁sep|>' + tool['function']['name'] + '\n' + '```json' + '\n' + 
tool['function']['arguments'] + '\n' + '```' + '<|tool▁call▁end|>'}}{%- endif %}{%- endfor %}{{'<|tool▁calls▁end|><|end▁of▁sentence|>'}}{%- endif %}{%- if message['role'] == 'assistant' and (message['tool_calls'] is not defined or message['tool_calls'] is none)%}{%- set ns.is_last_user = false -%}{%- if ns.is_tool %}{{'<|tool▁outputs▁end|>' + content + '<|end▁of▁sentence|>'}}{%- set ns.is_tool = false -%}{%- else %}{{content + '<|end▁of▁sentence|>'}}{%- endif %}{%- endif %}{%- if message['role'] == 'tool' %}{%- set ns.is_last_user = false -%}{%- set ns.is_tool = true -%}{%- if ns.is_output_first %}{{'<|tool▁outputs▁begin|><|tool▁output▁begin|>' + content + '<|tool▁output▁end|>'}}{%- set ns.is_output_first = false %}{%- else %}{{'\n<|tool▁output▁begin|>' + content + '<|tool▁output▁end|>'}}{%- endif %}{%- endif %}{%- endfor -%}{% if ns.is_tool %}{{'<|tool▁outputs▁end|>'}}{% endif %}{% if add_generation_prompt and not ns.is_last_user and not ns.is_tool %}{{'<|Assistant|>'}}{% endif %}",
    "sampling_thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0}
  },
  "Ministral-8B-Instruct-2410": {
    "parameters": 8019808256,
    "context": 32768,
    "layers": 26,
    "thinking": false,
    "optional_thinking": false,
    "system_message": "",
"chat_template": "{%- if messages[0][\"role\"] == \"system\" %}\n {%- set system_message = messages[0][\"content\"] %}\n {%- set loop_messages = messages[1:] %}\n{%- else %}\n {%- set loop_messages = messages %}\n{%- endif %}\n{%- if not tools is defined %}\n {%- set tools = none %}\n{%- endif %}\n{%- set user_messages = loop_messages | selectattr(\"role\", \"equalto\", \"user\") | list %}\n\n{#- This block checks for alternating user/assistant messages, skipping tool calling messages #}\n{%- set ns = namespace() %}\n{%- set ns.index = 0 %}\n{%- for message in loop_messages %}\n {%- if not (message.role == \"tool\" or message.role == \"tool_results\" or (message.tool_calls is defined and message.tool_calls is not none)) %}\n {%- if (message[\"role\"] == \"user\") != (ns.index % 2 == 0) %}\n {{- raise_exception(\"After the optional system message, conversation roles must alternate user/assistant/user/assistant/...\") }}\n {%- endif %}\n {%- set ns.index = ns.index + 1 %}\n {%- endif %}\n{%- endfor %}\n\n{{- bos_token }}\n{%- for message in loop_messages %}\n {%- if message[\"role\"] == \"user\" %}\n {%- if tools is not none and (message == user_messages[-1]) %}\n {{- \"[AVAILABLE_TOOLS][\" }}\n {%- for tool in tools %}\n {%- set tool = tool.function %}\n {{- '{\"type\": \"function\", \"function\": {' }}\n {%- for key, val in tool.items() if key != \"return\" %}\n {%- if val is string %}\n {{- '\"' + key + '\": \"' + val + '\"' }}\n {%- else %}\n {{- '\"' + key + '\": ' + val|tojson }}\n {%- endif %}\n {%- if not loop.last %}\n {{- \", \" }}\n {%- endif %}\n {%- endfor %}\n {{- \"}}\" }}\n {%- if not loop.last %}\n {{- \", \" }}\n {%- else %}\n {{- \"]\" }}\n {%- endif %}\n {%- endfor %}\n {{- \"[/AVAILABLE_TOOLS]\" }}\n {%- endif %}\n {%- if loop.last and system_message is defined %}\n {{- \"[INST]\" + system_message + \"\\n\\n\" + message[\"content\"] + \"[/INST]\" }}\n {%- else %}\n {{- \"[INST]\" + message[\"content\"] + \"[/INST]\" }}\n {%- endif %}\n {%- elif 
(message.tool_calls is defined and message.tool_calls is not none) %}\n {{- \"[TOOL_CALLS][\" }}\n {%- for tool_call in message.tool_calls %}\n {%- set out = tool_call.function|tojson %}\n {{- out[:-1] }}\n {%- if not tool_call.id is defined or tool_call.id|length != 9 %}\n {{- raise_exception(\"Tool call IDs should be alphanumeric strings with length 9!\") }}\n {%- endif %}\n {{- ', \"id\": \"' + tool_call.id + '\"}' }}\n {%- if not loop.last %}\n {{- \", \" }}\n {%- else %}\n {{- \"]\" + eos_token }}\n {%- endif %}\n {%- endfor %}\n {%- elif message[\"role\"] == \"assistant\" %}\n {{- message[\"content\"] + eos_token}}\n {%- elif message[\"role\"] == \"tool_results\" or message[\"role\"] == \"tool\" %}\n {%- if message.content is defined and message.content.content is defined %}\n {%- set content = message.content.content %}\n {%- else %}\n {%- set content = message.content %}\n {%- endif %}\n {{- '[TOOL_RESULTS]{\"content\": ' + content|string + \", \" }}\n {%- if not message.tool_call_id is defined or message.tool_call_id|length != 9 %}\n {{- raise_exception(\"Tool call IDs should be alphanumeric strings with length 9!\") }}\n {%- endif %}\n {{- '\"call_id\": \"' + message.tool_call_id + '\"}[/TOOL_RESULTS]' }}\n {%- else %}\n {{- raise_exception(\"Only user and assistant roles are supported, with the exception of an initial optional system message!\") }}\n {%- endif %}\n{%- endfor %}\n",
    "sampling": {
      "temperature": 0.5,
      "top_p": 0.9,
      "top_k": 20,
      "min_p": 0
    }
  }
}
input_example.txt
ADDED
@@ -0,0 +1 @@
Jetzt ist ja Klima ein grosses Thema. Sie haben sich bisher nicht gross geäussert. Jetzt ist eine Strategie in Zürich rausgekommen, man will die rot-grüne Klimapanik bekämpfen. Wie stehen Sie jetzt zu diesem Thema? Wie stehen Sie zu diesen Demonstrationen von all diesen Kindern und Schülern, die wöchentlich auf die Strasse gehen? Die, die auf die Strasse gehen und sagen, wir sind für ein gesundes Klima, da kann ich niemandem dagegen sein. Das ist ja schön, wenn sie das machen. Aber was dahinter steht, politisch bei den Grünen, das sind Sachen, die ganz verwerflich sind. Die wollen jetzt das lösen, mehr Eingriffe vom Staat, mehr Steuern, Abgaben, Gebühren, 20 ApB fürs Benzin, Heizkosten bis zu 1400 Franken pro Haushalt. Dann, was wir sollen essen und nicht sollen essen und wie wir sollen essen und wie wir sollen leben und wohnen und wie gross die Wohnungen sind, das hört nicht mehr auf. Gegen das sind wir massiv. Das ist ein Eingriff in die Freiheit und bringt dem Klima schlussendlich gar nichts. Was würde Ihrer Meinung nach etwas bringen, um den Klimawandel zu stoppen? Weiterfahren mit dem Programm, das wir jetzt haben. Wir müssen mal schauen, was wir schon alles haben. Wir haben die ganze Gewässerverschmutzung unter Kontrolle. Wir haben saubere Gewässer, sagen mir die Fischer, oder? Weil wir sogar das Meteowasser reinigen. Wir haben Rauchgasreinigung. Wir haben beim Auto Abgasvorschriften eingeführt. Wir gehen noch weiter mit dem. Muss ja technologisch möglich sein. Und dann die Innovation nicht vom Staat fördern, sondern schauen, dass die Privaten etwas machen. Wer hat ein Elektroauto? Wer hat ein Hybrid gemacht? Nicht der Staat und auch nicht die Grünen, sondern die Autoindustrie, die Wirtschaft hat das entwickelt. Kommt mit dem. Ich höre, sie arbeiten schon an den Flugzeugen. Sie wollen abgasfreie Flugzeuge. Auf der Lärmseite höre ich, das sind doch Massnahmen, die wir treffen müssen. Wir treffen sie nicht, weil das Klima sonst kaputt geht, sondern weil wir saubere Luft, reines Wasser, gesunden Boden wollen. Das war immer das Programm und bei dem müssen wir bleiben. Und dann wird es gut kommen. Kann man denn mit einer Klima-Offensive bei der SVP rechnen? Weil bisher hat man eher nur gehört, auch gerade von Herr Köppel, die Politik der anderen, die Klimapolitik der anderen, lehnen wir ab. Aber einen Vorschlag von ihnen haben wir noch nicht. Wir brauchen keine. Wir machen schon alles. Das habe ich gerade eben gesagt. Das müssen wir weiterführen. Das ist aber theoretisch nichts machen. Diese Strategie hat ja nicht wirklich ... Nichts falsches machen. Sie möchten gerne, dass wir die falschen Massnahmen machen. Jetzt wollen wir mehr Lenkungsabgaben. Jetzt wollen wir mehr für das Benzin. Das haben die in den Städten, die das Tram vor der Tür haben, die können das gut sagen. Die Leute der Agglomeration, die in die Stadt möchten, wo Züge und alles verstopft ist. Bei der Zuwanderung, die der Hauptgrund ist, machen die nichts. Da sind wir in der Offensive. Jetzt kommt ja die Begrenzungsinitiative. von Nau.ch
main.py
ADDED
@@ -0,0 +1,145 @@
import sys
from pathlib import Path
sys.path.append(str(Path(__file__).parent.absolute() / 'util'))
sys.path.append(str(Path(__file__).parent.absolute() / 'sentence_splitter'))

import chromadb
from util.llm import LLaMaCPP
from os.path import exists
from json import load as json_load
from time import sleep
from sentence_splitter import split  # noqa


MAX_DIFFERENCE = 1.3
MAX_DB_RESULTS = 10
with open('prompt.md', 'r', encoding='utf-8') as _f:
    PROMPT = _f.read()
GBNF_TEMPLATE = """
root ::= "```python\\n[" list "]\\n```"
list ::= %%
"""
GBNF_TEMPLATE_ITEM = '("\'%%\'")?'
GBNF_SEPARATOR = ' (", ")? '


def db_read(texts: list[str]):
    """
    Get results from ChromaDB based on vector similarity
    :param texts: a list of strings to search for
    :return: Query results directly from ChromaDB
    """
    client = chromadb.PersistentClient(path=str(Path(__file__).resolve().parent.parent) + '/data/database.chroma')
    collection = client.get_collection(name='PolitScanner')
    return collection.query(query_texts=texts, n_results=MAX_DB_RESULTS)


def process(sentences: list, llm: LLaMaCPP) -> list:
    """
    Check the given sentences for topics
    :param sentences: a list of sentences as strings
    :param llm: LLaMaCPP instance with a loaded model (PolitScanner fine-tune preferred)
    :return: a list of topics
    """
    db_results = db_read(sentences)
    print(db_results)
    if len(db_results['ids'][0]) == 0:
        return []
    topic_ids = []
    # keep only the results whose vector distance is below the threshold
    for i, result in enumerate(db_results['ids'][0]):
        if db_results['distances'][0][i] < MAX_DIFFERENCE:
            id_ = result.split('-')[0]
            if id_ not in topic_ids:
                topic_ids.append(id_)
    if len(topic_ids) == 0:
        return []
    # if there is only one topic, add 'menschengemachter Klimawandel' so that the prompt template makes sense
    if len(topic_ids) == 1 and topic_ids[0] != '0':
        topic_ids.append('0')
    topics = []
    titles = {}
    # load the information about the relevant topics
    for topic_id in topic_ids:
        with open(str(Path(__file__).resolve().parent.parent) + f"/data/parsed/{topic_id}.json", 'r') as f:
            topics.append(json_load(f))
        titles[topics[-1]['topic']] = len(topics) - 1
    formatted_topics = ''
    titles_list = list(titles.keys())
    titles_list.sort()
    items = []
    # create the GBNF grammar on the fly
    for title in titles_list:
        items.append(GBNF_TEMPLATE_ITEM.replace('%%', title))
    grammar = GBNF_TEMPLATE.replace('%%', GBNF_SEPARATOR.join(items))
    topics.sort(key=lambda x: x['topic'])
    for topic in topics:
        if len(formatted_topics) > 0:
            formatted_topics += '\n'
        formatted_topics += f"'{topic['topic']}'"
    # create the prompt
    prompt = PROMPT.replace('{TOPICS}', formatted_topics)
    for i, sentence in enumerate(sentences):
        prompt = prompt.replace('{' + f'SENTENCE_{i + 1}' + '}', sentence)
    # conversation template for Qwen3
    prompt = f"<|im_start|>user\n{prompt}\n/no_think\n<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n"
    print(prompt)
    output = llm.generate(prompt, enable_thinking=False, grammar=grammar, temperature=0.0)
    print(output)
    # extract the results
    output = output.split('[')[-1].split(']')[0]
    truths = []
    for title in titles_list:
        if title in output:
            truths.append(topics[titles[title]]['fact'])  # noqa
    return truths


def main() -> None:
    """
    Check the `input.txt` file for topics and write the results to `output.txt`
    :return: None
    """
    if not exists('input.txt'):
        raise FileNotFoundError('input.txt not found')
    with open('input.txt', 'r') as f:
        text = f.read()
    # select the Large Language Model
    llm = LLaMaCPP()
    if exists('/opt/llms/Qwen3-1.7B-PolitScanner-Q5_K_S.gguf'):
        llm.set_model('Qwen3-1.7B-PolitScanner-Q5_K_S.gguf')
    else:
        llm.set_model('Qwen3-30B-A3B-Q5_K_M.gguf')
    # split the file into sentences
    sentences = split(text)
    print(f"{len(sentences)=}")
    chunked_sentences = []
    # create overlapping chunks of 3 sentences (plus one context sentence on each side)
    for i in range(0, len(sentences), 3):
        if i == 0:
            chunk2 = ['EMPTY'] + sentences[:4]
        elif i + 3 >= len(sentences):
            chunk2 = sentences[-5:-1] + ['EMPTY']
        else:
            chunk2 = sentences[i - 1:i + 4]
        chunked_sentences.append(chunk2)
    print(f"{len(chunked_sentences)=}")
    llm.load_model(print_log=True, threads=16, kv_cache_type='q8_0', context=8192)
    while llm.is_loading() or not llm.is_running():
        sleep(1)
    with open('output.txt', 'w', encoding='utf-8') as f:
        # process the chunks
        for chunked_sentences2 in chunked_sentences:
            truths = process(chunked_sentences2, llm)
            for truth in truths:
                f.write(f" # Hinweis: {truth}\n")
            for i, sentence in enumerate(chunked_sentences2):
                if i in range(1, 4):
                    f.write(f"{sentence}\n")
            f.write('\n')
    print('REACHED `llm.stop()`')
    llm.stop()


if __name__ == '__main__':
    main()
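The overlapping-window step in `main()` can be exercised in isolation. The sketch below reproduces the same stepping scheme (stride of 3 sentences, one context sentence on each side, `'EMPTY'` padding at the text boundaries) as a standalone function; it is illustrative, not the repository's tested code.

```python
def make_chunks(sentences: list[str]) -> list[list[str]]:
    """Build overlapping 5-sentence windows: 3 core sentences plus one context sentence per side."""
    chunks = []
    for i in range(0, len(sentences), 3):
        if i == 0:
            # no predecessor exists, pad the leading context slot
            chunks.append(['EMPTY'] + sentences[:4])
        elif i + 3 >= len(sentences):
            # last window, pad the trailing context slot
            chunks.append(sentences[-5:-1] + ['EMPTY'])
        else:
            chunks.append(sentences[i - 1:i + 4])
    return chunks


sents = [f"s{n}" for n in range(9)]
for chunk in make_chunks(sents):
    print(chunk)
```

Only indices 1 through 3 of each window are written to `output.txt`; the first and last elements serve purely as context for the model.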
politscanner.pdf
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:2f65020224511c29aa93fae4e67d9975073af8e359c5d6a5db213ecc34005efc
size 239194
prompt.md
ADDED
@@ -0,0 +1,13 @@
Hier ist eine Liste von Themen.
```text
{TOPICS}
```
Hier sind die 5 Sätze:
```text
1: {SENTENCE_1}
2: {SENTENCE_2}
3: {SENTENCE_3}
4: {SENTENCE_4}
5: {SENTENCE_5}
```
Erstelle eine python-Liste mit den Themen, die in den 5 Sätzen vorkommen. Häufig ist die leere Liste `[]` die richtige Antwort.
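The `{TOPICS}` and `{SENTENCE_n}` placeholders in this template are filled by `process()` in `main.py`. The substitution step alone looks roughly like this sketch; `TEMPLATE` is trimmed to the placeholder lines for brevity and the names are illustrative, not from the repository.

```python
# Shortened stand-in for the prompt.md template above (illustrative only)
TEMPLATE = "Themen:\n{TOPICS}\n1: {SENTENCE_1}\n2: {SENTENCE_2}"


def fill(template: str, topics: list[str], sentences: list[str]) -> str:
    """Substitute topic titles and sentences into the prompt template."""
    # topics are listed one per line, each wrapped in single quotes,
    # matching the formatted_topics string built in process()
    prompt = template.replace('{TOPICS}', '\n'.join(f"'{t}'" for t in topics))
    for i, sentence in enumerate(sentences):
        prompt = prompt.replace('{' + f'SENTENCE_{i + 1}' + '}', sentence)
    return prompt


print(fill(TEMPLATE, ['Klimawandel'], ['Satz eins.', 'Satz zwei.']))
```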