This is a [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash) model which has been uncensored using the [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration) methodology, similar to other models from the 'derestricted' family.
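The linked blog post describes the full method; as a loose illustration only (not the author's exact procedure, and omitting the biprojection step), the core directional-ablation idea with row-norm restoration might be sketched like this, where `W` stands in for a weight matrix and `r` for an extracted refusal direction:

```python
import numpy as np

def ablate_norm_preserving(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Hypothetical sketch: remove the component of each row of W along the
    refusal direction r, then rescale each row back to its original norm."""
    r = r / np.linalg.norm(r)                       # unit refusal direction
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_proj = W - np.outer(W @ r, r)                 # project r out of each row
    new_norms = np.linalg.norm(W_proj, axis=1, keepdims=True)
    return W_proj * (orig_norms / np.maximum(new_norms, 1e-12))
```

The norm restoration is what distinguishes this from plain directional ablation: rows stay orthogonal to `r`, but their magnitudes (and thus the layer's overall scale) are preserved.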
## Benchmark results

All benchmarks were measured using a local vLLM instance and [inspect_evals](https://github.com/UKGovernmentBEIS/inspect_evals).

### MMLU-Pro (subset of 200 samples picked at random)

- GLM-4.7-Flash: 0.715, 694606 output tokens
- GLM-4.7-Flash-Derestricted: 0.755, 632992 output tokens

Measured with:

```
LOCAL_API_KEY="dummy" LOCAL_BASE_URL="http://127.0.0.1:9001/v1" uv run inspect eval inspect_evals/mmlu_pro --model "openai-api/local/glm-4.7-flash-derestricted" --seed 123456 --reasoning-history all --log-dir eval-logs-glm-4.7-flash-derestricted-mmlu-pro --frequency-penalty 0 --presence-penalty 0 --temperature 0.7 --top-p 0.95 --max-tokens 8192 --max-connections 200 --sample-shuffle 6375934876 --limit 200
```
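On this subset the derestricted model scores higher while emitting fewer reasoning tokens; a quick check of the deltas from the numbers above:

```python
# MMLU-Pro results copied from the list above (200-sample subset).
base_acc, base_tok = 0.715, 694606
derestricted_acc, derestricted_tok = 0.755, 632992

acc_gain = derestricted_acc - base_acc
token_reduction = 1 - derestricted_tok / base_tok

print(f"accuracy gain: {acc_gain:+.3f}")              # accuracy gain: +0.040
print(f"output tokens saved: {token_reduction:.1%}")  # output tokens saved: 8.9%
```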