NeTS-lab committed on
Commit 1abd991 · verified · 1 Parent(s): fe5f43a

Update README.md

Files changed (1)
  1. README.md +1 -3
README.md CHANGED
@@ -1,7 +1,5 @@
  ---
  license: cc-by-sa-4.0
- datasets:
- - cambridge-climb/BabyLM
  language:
  - en
  metrics:
@@ -14,6 +12,6 @@ This model is based on a custom RNN architecture loosely inspired by an unorthod
  It is named **eMG-RNN** in reference to the closest computational implementation of the core Minimalist Grammar: [expectation-based Minimalist Grammar](https://github.com/cristianochesi/e-MGs).

  The model implements two pathways, similar to those in an ***LSTM***: one to manage “continuations” (the **Merge** gate) and another for “holding” (the **Move** gate). The specific “forget gating” system, inspired by ***GRUs***, is designed to bias information flow in a way that may mimic C-command.
- The base model **eMG-RNN-base** uses 650 units for both the embedding and hidden layers (Gulordava et al., 2018). It employs a ***BPE*** tokenizer with `min_freq=3`, producing a lexicon of 67,572 tokens using the **BabyLM 2024 10M dataset** (***Small-strict*** track) as the training corpus.
+ The base model **eMG-RNN-base** uses 650 units for both the embedding and hidden layers (Gulordava et al., 2018). It employs a ***BPE*** tokenizer with `min_freq=3`, producing a lexicon of 67,572 tokens using the [BabyLM 2024 10M dataset](https://osf.io/5mk3x) (***Small-strict*** track) as the training corpus.

  The model’s architecture, preprocessing routines, lm-eval modules for evaluation, and an alternative (unused here for English) tokenization procedure (***MorPiece***) are all available on GitHub at: [cristianochesi/babylm-2024](https://github.com/cristianochesi/babylm-2024)
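The two-pathway gating described in the README can be sketched as a toy recurrent cell. This is an illustrative sketch only, assuming a GRU-style interpolation between a held state and a candidate continuation; the gate names `merge` and `move`, the class `TwoGateCell`, and all dimensions are hypothetical, and the actual implementation is in the linked repository.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TwoGateCell:
    """Illustrative GRU-style cell with two gates, loosely mirroring the
    description above: a 'merge' gate for continuations and a 'move' gate
    for holding material in memory. NOT the actual eMG-RNN code."""

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        scale = 0.1
        # One weight matrix per gate, plus one for the candidate state.
        self.W_merge = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.W_move = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.W_cand = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.hidden_size = hidden_size

    def step(self, x, h):
        xh = np.concatenate([x, h])
        merge = sigmoid(self.W_merge @ xh)  # how much new material to integrate
        move = sigmoid(self.W_move @ xh)    # how much held material to retain
        cand = np.tanh(self.W_cand @ xh)    # candidate continuation
        # GRU-like interpolation: 'move' holds the old state, 'merge' blends in the new.
        return move * h + merge * cand

# Run a few steps over a constant input to show the state update.
cell = TwoGateCell(input_size=4, hidden_size=3)
h = np.zeros(3)
for _ in range(5):
    h = cell.step(np.ones(4), h)
print(h.shape)
```

The interpolation is only GRU-*inspired*: unlike a true GRU, the two gates here are independent rather than tied as `z` and `1 - z`.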
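The `min_freq=3` cutoff mentioned for the BPE tokenizer can be illustrated with a toy trainer: merging stops once the most frequent adjacent pair falls below the threshold. A minimal sketch only, not the tokenizer used for **eMG-RNN-base**; the `bpe_train` helper and the tiny corpus are invented for illustration.

```python
from collections import Counter

def bpe_train(corpus_words, num_merges, min_freq=3):
    """Toy BPE trainer: repeatedly merge the most frequent adjacent symbol
    pair, but only while it occurs at least `min_freq` times."""
    # Each word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(w) for w in corpus_words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best, best_freq = pairs.most_common(1)[0]
        if best_freq < min_freq:  # frequency cutoff: stop merging rare pairs
            break
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        merged = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] += freq
        vocab = merged
    return merges

corpus = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
print(bpe_train(corpus, num_merges=10)[:3])
```

On a 10M-word corpus the same cutoff simply prunes rare merges from the final vocabulary, which is how the lexicon lands at a fixed size such as the 67,572 tokens reported above.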