janhq/Jan-v3-4B-base-instruct

Update README.md

#3

by jan-ai - opened Jan 28

base: refs/heads/main ← from: refs/pr/3

README.md +10 -0

@@ -23,6 +23,16 @@ tags:
 Building on this base, **Jan-Code**, a code-tuned variant, **will be released soon.**
+## Model Overview
+This repo contains the BF16 version of **Jan-v3-4B-base-instruct**, which has the following features:
+- Type: Causal Language Model
+- Training Stage: Pretraining & Post-training
+- Number of Parameters: 4B in total
+- Number of Layers: 36
+- Number of Attention Heads (GQA): 32 for Q and 8 for KV
+- Context Length: **262,144 tokens natively**.
 **Intended Use**
 * A better small base for downstream work: improved instruction following out of the box, strong starting point for fine-tuning, and effective lightweight coding assistance.
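
The layer count, GQA head counts, and context length listed in the added Model Overview section are enough for a rough estimate of the BF16 key/value cache needed at full context. The following is a minimal sketch, not something taken from the repo: the helper `kv_cache_bytes` is hypothetical, and `HEAD_DIM = 128` is an assumption, since the per-head dimension is not stated in this diff and should be checked against the model's `config.json`.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem


# Values taken from the Model Overview list.
NUM_LAYERS = 36
NUM_Q_HEADS = 32
NUM_KV_HEADS = 8
CONTEXT_LEN = 262_144

# ASSUMPTION: head dimension is not given in the diff; verify in config.json.
HEAD_DIM = 128

# BF16 uses 2 bytes per element (the default for bytes_per_elem).
gqa_bytes = kv_cache_bytes(NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, CONTEXT_LEN)

# Hypothetical comparison: the same model caching one KV head per query head.
mha_bytes = kv_cache_bytes(NUM_LAYERS, NUM_Q_HEADS, HEAD_DIM, CONTEXT_LEN)

print(f"GQA KV cache at full context: {gqa_bytes / 2**30:.1f} GiB")
print(f"Without GQA (32 KV heads):    {mha_bytes / 2**30:.1f} GiB")
print(f"Reduction factor:             {mha_bytes // gqa_bytes}x")
```

Under the assumed head dimension, the cache for a single 262,144-token sequence comes to 36 GiB, a quarter of what 32 KV heads would require. The 4x ratio follows directly from the 32:8 head counts and does not depend on the assumed `HEAD_DIM`.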