Update README.md
README.md CHANGED
@@ -12,7 +12,7 @@ This ***89.6M*** parameters model is based on a custom RNN architecture loosely
It is named **eMG-RNN** in reference to the closest computational implementation of the core Minimalist Grammar: [expectation-based Minimalist Grammar](https://github.com/cristianochesi/e-MGs).

The model implements two pathways, similar to those in an ***LSTM***: one to manage “continuations” (the **Merge** gate) and another for “holding” (the **Move** gate). The specific “forget gating” system, inspired by ***GRUs***, is designed to bias information flow in a way that may mimic C-command.

-The base model **eMG-RNN-base**
+The base model **eMG-RNN-base** uses 650 units for both the embedding and the hidden layer (Gulordava et al., 2018). Only one hidden layer is adopted in this base model so that the effect of each gating system can be assessed in isolation.

It employs a ***BPE*** tokenizer with `min_freq=3`, producing a lexicon of 67,572 tokens using the [BabyLM 2024 10M dataset](https://osf.io/5mk3x) (***Small-strict*** track) as the training corpus.
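For readers skimming the diff, here is a minimal PyTorch sketch of what a single-layer, 650-unit recurrent cell with LSTM-style **Merge**/**Move** gates and a GRU-style forget gate could look like. The commit does not publish the model's equations, so the class name, gate wiring, and update rule below are illustrative assumptions, not the released eMG-RNN implementation.

```python
# Illustrative sketch only: the gate equations are NOT given in the README,
# so this cell simply combines LSTM-style gating (a hypothetical "Merge" gate
# for continuations, a "Move" gate for held material) with a GRU-style forget
# gate, at the stated base-model size (650-d embeddings, one 650-d hidden layer).
import torch
import torch.nn as nn


class EMGRNNCellSketch(nn.Module):
    """Hypothetical cell: Merge/Move gates plus a GRU-inspired forget gate."""

    def __init__(self, emb_dim: int = 650, hid_dim: int = 650):
        super().__init__()
        # One linear map per gate over the concatenated [input, hidden] vector.
        self.merge_gate = nn.Linear(emb_dim + hid_dim, hid_dim)   # "continuations"
        self.move_gate = nn.Linear(emb_dim + hid_dim, hid_dim)    # "holding"
        self.forget_gate = nn.Linear(emb_dim + hid_dim, hid_dim)  # GRU-inspired
        self.candidate = nn.Linear(emb_dim + hid_dim, hid_dim)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        xh = torch.cat([x, h], dim=-1)
        m = torch.sigmoid(self.merge_gate(xh))   # how much new material to merge in
        v = torch.sigmoid(self.move_gate(xh))    # how much held material to carry over
        f = torch.sigmoid(self.forget_gate(xh))  # GRU-style reset biasing the history
        cand = torch.tanh(self.candidate(torch.cat([x, f * h], dim=-1)))
        return v * h + m * cand


# Single hidden layer, 650-d embeddings, vocabulary size from the README.
emb = nn.Embedding(67_572, 650)
cell = EMGRNNCellSketch()
h = torch.zeros(1, 650)
for tok in torch.tensor([[5, 42, 7]]).unbind(dim=1):  # toy token ids
    h = cell(emb(tok), h)
```

Using a single hidden layer, as the added line notes, means any change in behaviour can be attributed to a specific gate rather than to depth.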
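Similarly, a short sketch of how a BPE tokenizer with the stated `min_freq=3` could be trained on the BabyLM corpus. The README does not name the tooling, so the use of the Hugging Face `tokenizers` library, the file path, and the special tokens below are all assumptions.

```python
# Sketch under assumptions: library choice, corpus path, and special tokens
# are placeholders; only BPE and min_freq=3 come from the README.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

tokenizer = Tokenizer(BPE(unk_token="<unk>"))
tokenizer.pre_tokenizer = Whitespace()

# min_frequency=3 mirrors the README's `min_freq=3`: merges seen fewer than
# three times in the 10M-word corpus are not added to the vocabulary.
trainer = BpeTrainer(min_frequency=3, special_tokens=["<unk>", "<pad>", "<eos>"])
tokenizer.train(files=["babylm_10M/train.txt"], trainer=trainer)  # hypothetical path

print(tokenizer.get_vocab_size())  # the README reports 67,572 tokens
```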