---
library_name: transformers
license: apache-2.0
tags:
  - math
  - reasoning
  - text-generation
language:
  - en
pipeline_tag: text-generation
model-index:
  - name: Kai-0.35B-Instruct
    results:
      - task:
          type: multiple-choice
          name: ARC-Challenge
        dataset:
          name: ARC-Challenge
          type: allenai/ai2_arc
          config: ARC-Challenge
          split: test
        metrics:
          - type: acc_norm
            value: 37.80
            name: Accuracy (normalized)
      - task:
          type: multiple-choice
          name: HellaSwag
        dataset:
          name: HellaSwag
          type: Rowan/hellaswag
          split: validation
        metrics:
          - type: acc_norm
            value: 55.88
            name: Accuracy (normalized)
      - task:
          type: multiple-choice
          name: PIQA
        dataset:
          name: PIQA
          type: piqa
          split: validation
        metrics:
          - type: acc_norm
            value: 71.82
            name: Accuracy (normalized)
      - task:
          type: text-generation
          name: MBPP
        dataset:
          name: MBPP
          type: google-research-datasets/mbpp
          split: test
        metrics:
          - type: pass_at_1
            value: 22.20
            name: pass@1
---
# Kai-0.35B-Instruct

A compact 0.35B-parameter instruction-tuned language model optimized for reasoning, math, and code generation tasks.

## Model Details

| | |
|---|---|
| **Model** | Kai-0.35B-Instruct |
| **Architecture** | LlamaForCausalLM |
| **Parameters** | 360M |
| **Hidden size** | 960 |
| **Layers** | 32 |
| **Attention heads** | 15 (5 KV heads, GQA) |
| **Context length** | 8192 |
| **Precision** | bfloat16 |
| **Vocab size** | 49,152 |

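The head counts above imply a grouped-query attention (GQA) layout. A quick sanity check of the derived sizes, using the numbers from the table (field names mirror the standard `LlamaConfig` convention):

```python
# Sanity-check the attention geometry implied by the Model Details table.
hidden_size = 960
num_attention_heads = 15
num_key_value_heads = 5

head_dim = hidden_size // num_attention_heads           # per-head dimension
kv_groups = num_attention_heads // num_key_value_heads  # query heads per KV head

print(head_dim)   # 64
print(kv_groups)  # 3: each KV head is shared by 3 query heads
```

With 5 KV heads instead of 15, the KV cache is one third the size of a full multi-head layout at the same hidden size.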
## Benchmark Results (5-shot, log-likelihood)

| Benchmark | Kai-0.35B-Instruct | Mamba (370M) | TinyLlama (1.1B) | Llama-3.2 (1B) |
|---|:---:|:---:|:---:|:---:|
| **ARC-Challenge** (science reasoning) | **37.80%** | ~29.1% | ~30.1% | ~44.5% |
| **HellaSwag** (sentence completion) | 55.88% | ~53.8% | ~59.2% | ~61.1% |
| **PIQA** (physical commonsense) | **71.82%** | ~69.6% | ~73.0% | ~74.5% |
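
The `acc_norm` metric reported above scores each answer choice by its length-normalized log-likelihood (as in lm-evaluation-harness, which normalizes by the byte length of each choice). A minimal sketch of that selection rule, with made-up illustration values:

```python
def acc_norm_pick(loglikelihoods, choice_lengths):
    """Return the index of the choice with the highest
    length-normalized log-likelihood (the acc_norm rule)."""
    scores = [ll / n for ll, n in zip(loglikelihoods, choice_lengths)]
    return max(range(len(scores)), key=scores.__getitem__)

# Illustration: raw log-likelihood favors choice 0, but per-byte
# normalization favors the longer choice 1 (values are hypothetical).
lls = [-10.0, -12.0]   # total log-likelihood per choice
lens = [20, 40]        # byte length of each choice
print(acc_norm_pick(lls, lens))  # 1
```

Normalization matters because longer choices accumulate more negative log-likelihood simply by having more tokens; without it, short answers are systematically favored.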

### Code Generation — MBPP (3-shot, pass@1)

| Model | Params | MBPP pass@1 |
|---|:---:|:---:|
| Mamba / Mamba-2 | 370M | <10.0% |
| TinyLlama | 1.1B | ~19.91% |
| **Kai-0.35B-Instruct** | **360M** | **22.20%** |
| Llama-3.2-1B (Base) | 1.0B | ~25-30% |
| Llama-3.2-1B-Instruct | 1.0B | ~49.0% |
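
pass@1 above is the fraction of MBPP problems solved by a generated sample. When several samples are drawn per problem, pass@k is usually computed with the unbiased estimator of Chen et al. (2021); a minimal sketch:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k: probability that at least one of k samples
    drawn (without replacement) from n generations, of which c are
    correct, passes."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 10 samples per problem, 2 pass -> pass@1 = 2/10
print(round(pass_at_k(10, 2, 1), 3))  # 0.2
```

The per-benchmark score is then this quantity averaged over all problems in the test split.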

### Key Observations

1. **ARC-Challenge**: Kai-0.35B scores **37.80%** (5-shot), significantly outperforming both Mamba-370M (+8.7pp) and TinyLlama-1.1B (+7.7pp) — a model 3x its size.

2. **PIQA**: At **71.82%**, Kai-0.35B nearly matches TinyLlama-1.1B (73.0%) with only 1/3 the parameters, and trails the 1B-class Llama-3.2 by less than 3pp.

3. **MBPP**: At **22.20%** pass@1, Kai-0.35B surpasses TinyLlama-1.1B (~19.91%) in code generation despite being 3x smaller.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "NoesisLab/Kai-0.35B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("NoesisLab/Kai-0.35B-Instruct")

messages = [{"role": "user", "content": "What is 25 * 4?"}]
# add_generation_prompt=True appends the assistant turn header so the
# model starts generating a reply rather than continuing the user turn.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```


## Citation

```bibtex
@misc{noesislab2026nkai,
  title={Kai-0.35B-Instruct},
  author={NoesisLab},
  year={2026},
  url={https://huggingface.co/NoesisLab/Kai-0.35B-Instruct}
}
```

## License

Apache 2.0