antalvdb committed on
Commit
53cc5cc
·
verified ·
1 Parent(s): 048c2df

Update README.md

  1. README.md +4 -2
README.md CHANGED
@@ -7,7 +7,9 @@ tags:
  - less-is-better
  - supra-word
  - cognitively-inspired
- license: apache-2.0
+ license: gpl-3.0
+ datasets:
+ - MLZoo/edu-fineweb-10B
  ---
 
  # LiB Tokenizer
@@ -99,4 +101,4 @@ the original work:
  ## Links
 
  - [tokenizers fork (Rust implementation)](https://github.com/antalvdb/tokenizers/tree/lib-model)
- - [LiB repository (training scripts)](https://github.com/antalvdb/LiB/tree/feature/hf-compatible-tokenizer)
+ - [LiB repository (training scripts)](https://github.com/antalvdb/LiB/tree/feature/hf-compatible-tokenizer)