lthn committed
Commit 0672274 · verified · 1 Parent(s): a85d038

docs: remove preliminary benchmarks pending official eval

Files changed (1)
  1. README.md +0 -46
README.md CHANGED
@@ -31,52 +31,6 @@ A [Gemma 4 E2B](https://huggingface.co/google/gemma-4-E2B-it) finetune by [lthn.
  **HF Transformers**: on main (4-bit NF4 + bf16 in hf-bf16/)
 
 
- ## Benchmarks
-
- ### Eval: MMLU Pro
-
- Columns: **(Think, Temperature)** — `G4` = Stock Gemma 4 E2B, `Lemer` = LEK-activated
-
- | | G4(1,0) | G4(1,1) | G4(0,0) | G4(0,1) | Lemer(1,0) | Lemer(1,1) | Lemer(0,0) | Lemer(0,1) |
- | :---- | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
- | Biology | 40.0% | TBC | TBC | TBC | **60.0%** | [**65.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "65.0") | TBC | TBC |
- | Math | 10.0% | 30.0% | 15.0% | 10.0% | **55.0%** | [**80.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "80.0") | **25.0%** | **25.0%** |
- | Business | TBC | TBC | TBC | 40.0% | TBC | [**50.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "50.0") | TBC | TBC |
- | Chemistry | TBC | TBC | TBC | 15.0% | TBC | [**35.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "35.0") | TBC | TBC |
- | Computer Science | TBC | TBC | TBC | 65.0% | TBC | [**65.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "65.0") | TBC | TBC |
- | Economics | TBC | TBC | TBC | 25.0% | TBC | [**45.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "45.0") | TBC | TBC |
- | Engineering | TBC | TBC | TBC | 50.0% | TBC | [**55.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "55.0") | TBC | TBC |
- | Health | TBC | TBC | TBC | 25.0% | TBC | [**35.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "35.0") | TBC | TBC |
- | History | TBC | TBC | TBC | 5.0% | TBC | [**15.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "15.0") | TBC | TBC |
- | Law | TBC | TBC | TBC | 15.0% | TBC | [**25.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "25.0") | TBC | TBC |
- | Other | TBC | TBC | TBC | 45.0% | TBC | [**55.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "55.0") | TBC | TBC |
- | Philosophy | TBC | TBC | TBC | 25.0% | TBC | [**30.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "30.0") | TBC | TBC |
- | Physics | TBC | TBC | TBC | 45.0% | TBC | [**50.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "50.0") | TBC | TBC |
- | Psychology | TBC | TBC | TBC | 20.0% | TBC | [**45.0%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "45.0") | TBC | TBC |
- | **Average** | TBC | TBC | TBC | **33.6%** | TBC | [**46.4%**](https://huggingface.co/lthn/lemer/blob/bf16/README.md#mmlu-pro-bf16-best-per-category "46.4") | TBC | TBC |
-
-
- ### Lemer Quantisation Benchmarks (MMLU-Pro, all categories)
-
- | | [bf16](https://huggingface.co/lthn/lemer/tree/bf16) | [8bit](https://huggingface.co/lthn/lemer/tree/8bit) | [6bit](https://huggingface.co/lthn/lemer/tree/6bit) | [5bit](https://huggingface.co/lthn/lemer/tree/5bit) | [4bit](https://huggingface.co/lthn/lemer/tree/4bit) | [mxfp8](https://huggingface.co/lthn/lemer/tree/mxfp8) | [mxfp4](https://huggingface.co/lthn/lemer/tree/mxfp4) | [nvfp4](https://huggingface.co/lthn/lemer/tree/nvfp4) |
- | :---- | :----: | :----: | :----: | :----: | :----: | :----: | :----: | :----: |
- | Biology | 65.0% | TBC | TBC | TBC | 45.0% | TBC | TBC | TBC |
- | Math | 60.0% | TBC | TBC | TBC | 50.0% | TBC | TBC | TBC |
- | Business | 45.0% | TBC | TBC | TBC | 40.0% | TBC | TBC | TBC |
- | Chemistry | 20.0% | TBC | TBC | TBC | 15.0% | TBC | TBC | TBC |
- | Computer Science | 50.0% | TBC | TBC | TBC | 65.0% | TBC | TBC | TBC |
- | Economics | 25.0% | TBC | TBC | TBC | 25.0% | TBC | TBC | TBC |
- | Engineering | 50.0% | TBC | TBC | TBC | 50.0% | TBC | TBC | TBC |
- | Health | 25.0% | TBC | TBC | TBC | 25.0% | TBC | TBC | TBC |
- | History | 10.0% | TBC | TBC | TBC | 5.0% | TBC | TBC | TBC |
- | Law | 25.0% | TBC | TBC | TBC | 15.0% | TBC | TBC | TBC |
- | Other | 45.0% | TBC | TBC | TBC | 45.0% | TBC | TBC | TBC |
- | Philosophy | 20.0% | TBC | TBC | TBC | 25.0% | TBC | TBC | TBC |
- | Physics | 50.0% | TBC | TBC | TBC | 45.0% | TBC | TBC | TBC |
- | Psychology | 25.0% | TBC | TBC | TBC | 20.0% | TBC | TBC | TBC |
- | **Average** | **36.8%** | TBC | TBC | TBC | **33.6%** | TBC | TBC | TBC |
-
-
  ## Base
 
  [google/gemma-4-E2B-it](https://huggingface.co/google/gemma-4-E2B-it)
 