Update README.md

README.md (CHANGED):
````diff
@@ -9,7 +9,7 @@ language:
   - en
 pipeline_tag: text-generation
 model-index:
-- name: OpenKai-0.35B-Instruct
+- name: Kai-0.35B-Instruct
   results:
   - task:
       type: multiple-choice
@@ -57,8 +57,7 @@ model-index:
           value: 22.20
           name: pass@1
 ---
-
-# OpenKai-0.35B-Instruct
+# Kai-0.35B-Instruct
 
 A compact 0.35B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks.
 
@@ -66,7 +65,7 @@ A compact 0.35B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks.
 
 | | |
 |---|---|
-| **Model** | OpenKai-0.35B-Instruct |
+| **Model** | Kai-0.35B-Instruct |
 | **Architecture** | LlamaForCausalLM |
 | **Parameters** | 360M |
 | **Hidden size** | 960 |
@@ -78,7 +77,7 @@ A compact 0.35B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks.
 
 ## Benchmark Results (5-shot, log-likelihood)
 
-| Benchmark | OpenKai-0.35B-Instruct | Mamba (370M) | TinyLlama (1.1B) | Llama-3.2 (1B) |
+| Benchmark | Kai-0.35B-Instruct | Mamba (370M) | TinyLlama (1.1B) | Llama-3.2 (1B) |
 |---|:---:|:---:|:---:|:---:|
 | **ARC-Challenge** (science reasoning) | **37.80%** | ~29.1% | ~30.1% | ~44.5% |
 | **HellaSwag** (sentence completion) | 55.88% | ~53.8% | ~59.2% | ~61.1% |
@@ -90,15 +89,15 @@ A compact 0.35B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks.
 |---|:---:|:---:|
 | Mamba / Mamba-2 | 370M | <10.0% |
 | TinyLlama | 1.1B | ~19.91% |
-| **OpenKai-0.35B-Instruct** | **360M** | **22.20%** |
+| **Kai-0.35B-Instruct** | **360M** | **22.20%** |
 | Llama-3.2-1B (Base) | 1.0B | ~25-30% |
 | Llama-3.2-1B-Instruct | 1.0B | ~49.0% |
 
 ### Key Observations
 
-1. **ARC-Challenge**: OpenKai-0.35B scores **37.80%** (5-shot), significantly outperforming both Mamba-370M (+8.7pp) and TinyLlama-1.1B (+7.7pp) — a model 3x its size.
+1. **ARC-Challenge**: Kai-0.35B scores **37.80%** (5-shot), significantly outperforming both Mamba-370M (+8.7pp) and TinyLlama-1.1B (+7.7pp) — a model 3x its size.
 
-2. **PIQA**: At **71.82%**, OpenKai-0.35B nearly matches TinyLlama-1.1B (73.0%) with only 1/3 the parameters, and trails the 1B-class Llama-3.2 by less than 3pp.
+2. **PIQA**: At **71.82%**, Kai-0.35B nearly matches TinyLlama-1.1B (73.0%) with only 1/3 the parameters, and trails the 1B-class Llama-3.2 by less than 3pp.
 
 3. **MBPP**: At **22.20%** pass@1, OpenKai-0.35B surpasses TinyLlama-1.1B (~19.91%) in code generation despite being 3x smaller.
 
@@ -107,44 +106,29 @@ A compact 0.35B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks.
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
 import torch
-
 model = AutoModelForCausalLM.from_pretrained(
-    "NoesisLab/OpenKai-0.35B-Instruct",
+    "NoesisLab/Kai-0.35B-Instruct",
     torch_dtype=torch.bfloat16,
 )
 tokenizer = AutoTokenizer.from_pretrained("NoesisLab/OpenKai-0.35B-Instruct")
-
 messages = [{"role": "user", "content": "What is 25 * 4?"}]
 input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt")
 output = model.generate(input_ids, max_new_tokens=256)
 print(tokenizer.decode(output[0], skip_special_tokens=True))
 ```
 
-## Evaluation
-
-Benchmarks were run using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness):
-
-```bash
-lm_eval --model hf \
-  --model_args pretrained=NoesisLab/OpenKai-0.35B-Instruct,dtype=bfloat16 \
-  --tasks arc_challenge,hellaswag,piqa \
-  --num_fewshot 5 \
-  --batch_size auto \
-  --output_path ./lmeval_results \
-  --log_samples
-```
 
 ## Citation
 
 ```bibtex
 @misc{noesislab2026openkai,
-  title={OpenKai-0.35B-Instruct},
+  title={Kai-0.35B-Instruct},
   author={NoesisLab},
   year={2026},
-  url={https://huggingface.co/NoesisLab/OpenKai-0.35B-Instruct}
+  url={https://huggingface.co/NoesisLab/Kai-0.35B-Instruct}
 }
 ```
 
 ## License
 
-Apache 2.0
+Apache 2.0
````
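The percentage-point gaps quoted in the Key Observations ("+8.7pp", "+7.7pp") are plain differences of the benchmark-table entries; a minimal sketch that recomputes them (the dict keys are just labels for the rows above, not API identifiers):

```python
# Recompute the percentage-point (pp) gaps cited in "Key Observations"
# from the ARC-Challenge and MBPP pass@1 table entries.
arc = {"Kai-0.35B-Instruct": 37.80, "Mamba-370M": 29.1, "TinyLlama-1.1B": 30.1}
mbpp = {"Kai-0.35B-Instruct": 22.20, "TinyLlama-1.1B": 19.91}

def gap_pp(scores, model, baseline):
    """Difference in percentage points, rounded to one decimal as in the text."""
    return round(scores[model] - scores[baseline], 1)

print(gap_pp(arc, "Kai-0.35B-Instruct", "Mamba-370M"))      # 8.7
print(gap_pp(arc, "Kai-0.35B-Instruct", "TinyLlama-1.1B"))  # 7.7
print(gap_pp(mbpp, "Kai-0.35B-Instruct", "TinyLlama-1.1B")) # 2.3
```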