NeTS-lab committed on
Commit 1abd991 · verified · 1 Parent(s): fe5f43a

Update README.md

Files changed (1)
  1. README.md +1 -3
README.md CHANGED
@@ -1,7 +1,5 @@
  ---
  license: cc-by-sa-4.0
- datasets:
- - cambridge-climb/BabyLM
  language:
  - en
  metrics:
@@ -14,6 +12,6 @@ This model is based on a custom RNN architecture loosely inspired by an unorthod
  It is named **eMG-RNN** in reference to the closest computational implementation of the core Minimalist Grammar: [expectation-based Minimalist Grammar](https://github.com/cristianochesi/e-MGs).

  The model implements two pathways, similar to those in an ***LSTM***: one to manage “continuations” (the **Merge** gate) and another for “holding” (the **Move** gate). The specific “forget gating” system, inspired by ***GRUs***, is designed to bias information flow in a way that may mimic C-command.
- The base model **eMG-RNN-base** uses 650 units for both the embedding and hidden layers (Gulordava et al., 2018). It employs a ***BPE*** tokenizer with `min_freq=3`, producing a lexicon of 67,572 tokens using the **BabyLM 2024 10M dataset** (***Small-strict*** track) as the training corpus.
+ The base model **eMG-RNN-base** uses 650 units for both the embedding and hidden layers (Gulordava et al., 2018). It employs a ***BPE*** tokenizer with `min_freq=3`, producing a lexicon of 67,572 tokens using the [BabyLM 2024 10M dataset](https://osf.io/5mk3x) (***Small-strict*** track) as the training corpus.

  The model’s architecture, preprocessing routines, lm-eval modules for evaluation, and an alternative (unused here for English) tokenization procedure (***MorPiece***) are all available on GitHub at: [cristianochesi/babylm-2024](https://github.com/cristianochesi/babylm-2024)
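The two-pathway gating described in the README can be sketched as a toy recurrent cell. This is an illustrative sketch only, assuming a GRU-style interpolation between a held state and a candidate continuation; the gate names `merge` and `move`, the class `TwoGateCell`, and all dimensions are hypothetical, and the actual implementation is in the linked repository.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TwoGateCell:
    """Illustrative GRU-style cell with two gates, loosely mirroring the
    description above: a 'merge' gate for continuations and a 'move' gate
    for holding material in memory. NOT the actual eMG-RNN code."""

    def __init__(self, input_size, hidden_size, rng=None):
        rng = rng or np.random.default_rng(0)
        scale = 0.1
        # One weight matrix per gate, plus one for the candidate state.
        self.W_merge = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.W_move = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.W_cand = rng.normal(0, scale, (hidden_size, input_size + hidden_size))
        self.hidden_size = hidden_size

    def step(self, x, h):
        xh = np.concatenate([x, h])
        merge = sigmoid(self.W_merge @ xh)  # how much new material to integrate
        move = sigmoid(self.W_move @ xh)    # how much held material to retain
        cand = np.tanh(self.W_cand @ xh)    # candidate continuation
        # GRU-like interpolation: 'move' holds the old state, 'merge' blends in the new.
        return move * h + merge * cand

# Run a few steps over a constant input to show the state update.
cell = TwoGateCell(input_size=4, hidden_size=3)
h = np.zeros(3)
for _ in range(5):
    h = cell.step(np.ones(4), h)
print(h.shape)
```

The interpolation is only GRU-*inspired*: unlike a true GRU, the two gates here are independent rather than tied as `z` and `1 - z`.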
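The `min_freq=3` cutoff mentioned for the BPE tokenizer can be illustrated with a toy trainer: merging stops once the most frequent adjacent pair falls below the threshold. A minimal sketch only, not the tokenizer used for **eMG-RNN-base**; the `bpe_train` helper and the tiny corpus are invented for illustration.

```python
from collections import Counter

def bpe_train(corpus_words, num_merges, min_freq=3):
    """Toy BPE trainer: repeatedly merge the most frequent adjacent symbol
    pair, but only while it occurs at least `min_freq` times."""
    # Each word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(w) for w in corpus_words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best, best_freq = pairs.most_common(1)[0]
        if best_freq < min_freq:  # frequency cutoff: stop merging rare pairs
            break
        merges.append(best)
        # Rewrite every word with the chosen pair fused into one symbol.
        merged = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] += freq
        vocab = merged
    return merges

corpus = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
print(bpe_train(corpus, num_merges=10)[:3])
```

On a 10M-word corpus the same cutoff simply prunes rare merges from the final vocabulary, which is how the lexicon lands at a fixed size such as the 67,572 tokens reported above.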