crossroderick
/
aramt5

Text Generation
Safetensors
Classical Syriac
t5
text2text-generation
transliteration
syriac
low-resource
cultural-nlp
Eval Results (legacy)
aramt5 / src / data
174 kB
  • 1 contributor
History: 10 commits
crossroderick
v3.2 hotfix with some corrections and new files
ace5912 about 1 month ago
  • augment_atomic_tokens.py
    6.1 kB
    Data augmentation and balancing updates for a re-run of v3 about 2 months ago
  • balance_corpus.py
    10.5 kB
    Data augmentation and balancing updates for a re-run of v3 about 2 months ago
  • build_corpus_vocabulary.py
    5.88 kB
    v3.1 update about 2 months ago
  • correction_dataset.jsonl
    43.1 kB
    v3.2 hotfix with some corrections and new files about 1 month ago
  • fetch_sedra_vocalised.py
    12.3 kB
    Data augmentation and balancing updates for a re-run of v3 about 2 months ago
  • fetch_syriac_corpus.py
    3.28 kB
    Initial commit about 2 months ago
  • generate_clean_corpus.sh
    952 Bytes
    Initial commit about 2 months ago
  • generate_syr_lat_pairs.py
    37.8 kB
    v3.1 update about 2 months ago
  • get_data.sh
    1.09 kB
    Initial commit about 2 months ago
  • manual_vocabulary.jsonl
    34.4 kB
    Manual vocabulary updates for upcoming v3.2 update about 2 months ago
  • sedra_lookup.py
    18.9 kB
    Manual vocabulary updates for upcoming v3.2 update about 2 months ago