1 contributor

History: 302 commits

Latest commit: Ouaill, "Upload results/external_datasets_eval.json with huggingface_hub", 765c3cc (verified), 5 days ago

| Name | Size | Last commit | Updated |
| --- | --- | --- | --- |
| figures | | Upload figures/dataset_comparison.png with huggingface_hub | 5 days ago |
| plots | | Upload plots/external_comparison.png with huggingface_hub | 12 days ago |
| results | | Upload results/external_datasets_eval.json with huggingface_hub | 5 days ago |
| tokenizers | | Upload tokenizers/concat_az_bbpe_55000.json with huggingface_hub | 13 days ago |
| transformers_tokenizers | | Upload transformers_tokenizers/concat_bbpe_32000_tokenizer_az/tokenizer.json with huggingface_hub | 15 days ago |
| .gitattributes | 4.34 kB | Upload results/plots/dataset_comparison.png with huggingface_hub | 5 days ago |
| README.md | 9.58 kB | Upload README.md with huggingface_hub | 6 days ago |
| benchmark_report.md | 8.47 kB | Upload benchmark_report.md with huggingface_hub | 15 days ago |
| bootstrap_ci.csv | 3.23 kB | Upload bootstrap_ci.csv with huggingface_hub | 15 days ago |
| bootstrap_ci_test_set.csv | 3.1 kB | Upload bootstrap_ci_test_set.csv with huggingface_hub | 14 days ago |
| bootstrap_test_set.py | 6.66 kB | Add bootstrap_test_set.py (test-set-only consistent eval) | 14 days ago |
| code.md | 13.6 kB | Upload code.md with huggingface_hub | 12 days ago |
| codeswitch_results.csv | 1.43 kB | Upload codeswitch_results.csv with huggingface_hub | 14 days ago |
| compare_with_external.py | 10.5 kB | Upload compare_with_external.py with huggingface_hub | 15 days ago |
| dataset_stats.py | 9.51 kB | Upload dataset_stats.py with huggingface_hub | 5 days ago |
| doda_independent_results.csv | 1.31 kB | Upload doda_independent_results.csv with huggingface_hub | 14 days ago |
| eval_all_externals.py | 11.9 kB | Add eval_all_externals.py (12 tokenizer comparison) | 15 days ago |
| eval_and_compare.py | 11.5 kB | Upload eval_and_compare.py with huggingface_hub | 15 days ago |
| eval_codeswitch_and_new_baselines.py | 15.1 kB | Upload eval_codeswitch_and_new_baselines.py with huggingface_hub | 6 days ago |
| eval_darijabert_mix.py | 4.69 kB | Upload eval_darijabert_mix.py with huggingface_hub | 6 days ago |
| eval_doda_independent.py | 7.72 kB | Upload eval_doda_independent.py with huggingface_hub | 6 days ago |
| eval_external_datasets.py | 12.8 kB | Upload eval_external_datasets.py with huggingface_hub | 5 days ago |
| eval_morph_large.py | 11.6 kB | Upload eval_morph_large.py with huggingface_hub | 12 days ago |
| eval_test_set.py | 8.15 kB | Add eval_test_set.py (test-set-only consistent eval) | 14 days ago |
| external_comparison.csv | 2.53 kB | Upload external_comparison.csv with huggingface_hub | 13 days ago |
| external_comparison.json | 5.47 kB | Update external_comparison.json with all 9 external tokenizers | 15 days ago |
| gen_dataset_figure.py | 4.05 kB | Upload gen_dataset_figure.py with huggingface_hub | 5 days ago |
| latest_main.pdf | 2.2 MB (xet) | Upload latest_main.pdf with huggingface_hub | 6 days ago |
| latest_main.tex | 49.3 kB | Upload latest_main.tex with huggingface_hub | 6 days ago |
| morph_large_vocab_results.csv | 1.29 kB | Upload morph_large_vocab_results.csv with huggingface_hub | 12 days ago |
| script.py | 81 kB | Upload script.py with huggingface_hub | 15 days ago |
| test_set_results.csv | 11.1 kB | Upload test_set_results.csv with huggingface_hub | 13 days ago |
| test_set_results.json | 15 kB | Add test_set_results.json (single source of truth for all tables) | 14 days ago |
| tokenizer_results.csv | 9.19 kB | Upload tokenizer_results.csv with huggingface_hub | 14 days ago |
| tokenizer_results.json | 20.8 kB | Upload tokenizer_results.json with huggingface_hub | 15 days ago |