Theoistic committed on
Commit b966053 · verified · 1 Parent(s): 96cafaf

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -107,7 +107,7 @@ Users should implement output filtering, content moderation, and continuous eval
 
 ## Model Capabilities
 
-Bantam demonstrates strong **multilingual competence** across 54 languages and is capable of generating **informative, coherent, and contextually aware text** in each of them.
+Bantam demonstrates strong **multilingual competence** across 55 languages and is capable of generating **informative, coherent, and contextually aware text** in each of them.
 
 The model was designed to leverage **many small attention heads in early layers** to capture linguistic and grammatical structures, transitioning to **larger, more abstract reasoning** in later layers. This design improves logical coherence and narrative flow across diverse languages despite the model’s compact size.
 
@@ -201,7 +201,7 @@ or any low resource language impact happened due to catastrophic forgetting.
 
 Bantam is a **pretrained base model**, not fine-tuned or benchmarked with external metrics. Qualitatively, it exhibits:
 
-* Strong multilingual understanding and generation across 54 languages.
+* Strong multilingual understanding and generation across 55 languages.
 * Coherent reasoning and informative responses.
 * Expected hallucinations due to small model size.
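The README's "Model Capabilities" section describes a head layout that varies by depth: many small attention heads in early layers, fewer and larger heads later. A minimal sketch of what such a per-layer configuration could look like is below; the function name, head counts, and dimensions are illustrative assumptions, not Bantam's actual hyperparameters.

```python
def head_layout(n_layers: int, d_model: int,
                early_heads: int = 16, late_heads: int = 4) -> list[dict]:
    """Hypothetical sketch: interpolate the head count from many small
    heads (early layers) down to a few large heads (late layers),
    keeping n_heads * head_dim == d_model at every layer.
    All numbers here are illustrative, not Bantam's real config."""
    layout = []
    for layer in range(n_layers):
        frac = layer / max(n_layers - 1, 1)
        n_heads = round(early_heads + frac * (late_heads - early_heads))
        # Snap down to a divisor of d_model so head_dim stays integral.
        while d_model % n_heads != 0:
            n_heads -= 1
        layout.append({"layer": layer,
                       "n_heads": n_heads,
                       "head_dim": d_model // n_heads})
    return layout

layout = head_layout(n_layers=12, d_model=512)
print(layout[0])   # early layer: many small heads
print(layout[-1])  # late layer: fewer, larger heads
```

The trade-off such a layout targets, per the README's description, is fine-grained linguistic pattern capture early on versus broader abstract reasoning in later layers, at constant per-layer parameter budget.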