Theoistic committed on
Commit b966053 · verified · 1 Parent(s): 96cafaf

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -107,7 +107,7 @@ Users should implement output filtering, content moderation, and continuous eval
 
 ## Model Capabilities
 
-Bantam demonstrates strong **multilingual competence** across 54 languages and is capable of generating **informative, coherent, and contextually aware text** in each of them.
+Bantam demonstrates strong **multilingual competence** across 55 languages and is capable of generating **informative, coherent, and contextually aware text** in each of them.
 
 The model was designed to leverage **many small attention heads in early layers** to capture linguistic and grammatical structures, transitioning to **larger, more abstract reasoning** in later layers. This design improves logical coherence and narrative flow across diverse languages despite the model’s compact size.
 
@@ -201,7 +201,7 @@ or any low resource language impact happened due to catastrophic forgetting.
 
 Bantam is a **pretrained base model**, not fine-tuned or benchmarked with external metrics. Qualitatively, it exhibits:
 
-* Strong multilingual understanding and generation across 54 languages.
+* Strong multilingual understanding and generation across 55 languages.
 * Coherent reasoning and informative responses.
 * Expected hallucinations due to small model size.
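The README's "Model Capabilities" section describes a head layout that varies by depth: many small attention heads in early layers, fewer and larger heads later. A minimal sketch of what such a per-layer configuration could look like is below; the function name, head counts, and dimensions are illustrative assumptions, not Bantam's actual hyperparameters.

```python
def head_layout(n_layers: int, d_model: int,
                early_heads: int = 16, late_heads: int = 4) -> list[dict]:
    """Hypothetical sketch: interpolate the head count from many small
    heads (early layers) down to a few large heads (late layers),
    keeping n_heads * head_dim == d_model at every layer.
    All numbers here are illustrative, not Bantam's real config."""
    layout = []
    for layer in range(n_layers):
        frac = layer / max(n_layers - 1, 1)
        n_heads = round(early_heads + frac * (late_heads - early_heads))
        # Snap down to a divisor of d_model so head_dim stays integral.
        while d_model % n_heads != 0:
            n_heads -= 1
        layout.append({"layer": layer,
                       "n_heads": n_heads,
                       "head_dim": d_model // n_heads})
    return layout

layout = head_layout(n_layers=12, d_model=512)
print(layout[0])   # early layer: many small heads
print(layout[-1])  # late layer: fewer, larger heads
```

The trade-off such a layout targets, per the README's description, is fine-grained linguistic pattern capture early on versus broader abstract reasoning in later layers, at constant per-layer parameter budget.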