dalat5 / src / train_tokeniser.py

Commit History

Pre-v5 update for the tokeniser (training date pushed to the 25th)
794cf97

crossroderick committed on

Removed unnecessary imports
8dc2b55

crossroderick committed on

Removed NFD and StripAccents from the tokeniser training process
f93a822

crossroderick committed on

Addition of a new tokeniser (pre-v5)
178501c

crossroderick committed on