crossroderick/aramt5
Text Generation
Safetensors
Classical Syriac
t5
text2text-generation
transliteration
syriac
low-resource
cultural-nlp
Eval Results (legacy)
License:
mit
aramt5/src/data (branch main, 174 kB)
1 contributor
History:
10 commits
crossroderick
v3.2 hotfix with some corrections and new files
ace5912
about 1 month ago
augment_atomic_tokens.py
Safe
6.1 kB
Data augmentation and balancing updates for a re-run of v3
about 2 months ago
balance_corpus.py
Safe
10.5 kB
Data augmentation and balancing updates for a re-run of v3
about 2 months ago
build_corpus_vocabulary.py
Safe
5.88 kB
v3.1 update
about 2 months ago
correction_dataset.jsonl
43.1 kB
xet
v3.2 hotfix with some corrections and new files
about 1 month ago
fetch_sedra_vocalised.py
Safe
12.3 kB
Data augmentation and balancing updates for a re-run of v3
about 2 months ago
fetch_syriac_corpus.py
Safe
3.28 kB
Initial commit
about 2 months ago
generate_clean_corpus.sh
Safe
952 Bytes
Initial commit
about 2 months ago
generate_syr_lat_pairs.py
Safe
37.8 kB
v3.1 update
about 2 months ago
get_data.sh
Safe
1.09 kB
Initial commit
about 2 months ago
manual_vocabulary.jsonl
34.4 kB
xet
Manual vocabulary updates for upcoming v3.2 update
about 2 months ago
sedra_lookup.py
Safe
18.9 kB
Manual vocabulary updates for upcoming v3.2 update
about 2 months ago