Update README.md
#3 by jan-ai - opened
README.md CHANGED
@@ -23,6 +23,16 @@ tags:
 
 Building on this base, **Jan-Code**, a code-tuned variant, **will be released soon.**
 
+## Model Overview
+
+This repo contains the BF16 version of **Jan-v3-4B-base-instruct**, which has the following features:
+- Type: Causal Language Models
+- Training Stage: Pretraining & Post-training
+- Number of Parameters: 4B in total
+- Number of Layers: 36
+- Number of Attention Heads (GQA): 32 for Q and 8 for KV
+- Context Length: **262,144 natively**.
+
 **Intended Use**
 
 * A better small base for downstream work: improved instruction following out of the box, strong starting point for fine-tuning, and effective lightweight coding assistance.
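
The attention figures in the added Model Overview (36 layers, 32 query heads, 8 KV heads, a 262,144-token context, BF16 weights) determine how large the key/value cache can grow during inference. The Python sketch below is a rough back-of-the-envelope estimate, not a measurement. It assumes a head dimension of 128, which the README does not state, plus a single sequence and an unquantized BF16 cache. The names `AttentionSpec`, `q_heads_per_kv_head`, and `kv_cache_bytes` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionSpec:
    num_layers: int = 36            # from the Model Overview
    num_q_heads: int = 32           # from the Model Overview
    num_kv_heads: int = 8           # from the Model Overview
    context_length: int = 262_144   # from the Model Overview
    head_dim: int = 128             # ASSUMPTION: not stated in the README
    bytes_per_value: int = 2        # BF16 stores each value in 2 bytes


def q_heads_per_kv_head(spec: AttentionSpec) -> int:
    """Number of query heads that share one key/value head under GQA."""
    if spec.num_q_heads % spec.num_kv_heads != 0:
        raise ValueError("query heads must divide evenly across KV heads")
    return spec.num_q_heads // spec.num_kv_heads


def kv_cache_bytes(spec: AttentionSpec, num_tokens: int) -> int:
    """Estimated KV-cache size for one sequence of `num_tokens` tokens."""
    # Each layer caches one K and one V tensor, hence the factor of 2.
    per_token = (
        2 * spec.num_layers * spec.num_kv_heads
        * spec.head_dim * spec.bytes_per_value
    )
    return per_token * num_tokens


spec = AttentionSpec()
group_size = q_heads_per_kv_head(spec)
per_token_bytes = kv_cache_bytes(spec, 1)
full_context_bytes = kv_cache_bytes(spec, spec.context_length)

print(f"query heads per KV head: {group_size}")
print(f"KV cache per token: {per_token_bytes / 1024:.0f} KiB")
print(f"KV cache at full context: {full_context_bytes / 2**30:.0f} GiB")
```

Under these assumptions each KV head serves 4 query heads, and a single sequence at the full native context would need about 36 GiB of cache on top of the weights. With a different head dimension or a quantized cache the totals scale proportionally.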