carolisteia committed on
Commit 43ac739 · verified · 1 Parent(s): 28fe309

Update README.md

Files changed (1)
  1. README.md +25 -5
README.md CHANGED
@@ -1,12 +1,32 @@
  ---
  title: README
- emoji: 🍃
- colorFrom: yellow
- colorTo: gray
+ emoji: 📚
+ colorFrom: blue
+ colorTo: indigo
  sdk: static
  pinned: false
+ short_description: Phrase-level segmentation and alignment for medieval texts.
  ---
  
- Centre for PROcessing MEDieval TEXTs (ProMeText): Medieval Corpora & Alignment Tools.
+ # ProMeTEXT
  
- We provide data and models for processing and aligning medieval romance corpora.
+ **ProMeTEXT**, the **Centre for PROcessing MEdieval TEXTs**, develops datasets, models and tools for the computational study of medieval and historical texts.
+
+ Our work focuses on **phrase-level segmentation**, **multilingual alignment**, and the processing of medieval textual traditions across Romance languages, Latin, and Middle English.
+
+ ## Resources
+
+ - **Aquilign**: a multilingual aligner for historical and philological corpora.
+ - **Aquilign Multilingual Segmenter**: a Hugging Face model for phrase-level segmentation of historical texts.
+ - **Aquilign Explorer**: a demo app for demonstrating multilingual alignment workflows.
+ - **Multilingual Segmentation Dataset**: gold-standard segmentation data for medieval prose.
+ - **Parallel Alignment Corpora**: multilingual aligned corpora used for fine-tuning LaBSE and evaluating multilingual alignment across historical textual traditions.
+
+ ## Links
+
+ - [GitHub organization](https://github.com/ProMeText)
+ - [Alignment tool: Aquilign](https://github.com/ProMeText/Aquilign)
+ - [Demo app: Aquilign Explorer](https://huggingface.co/spaces/ProMeText/aquilign-explorer)
+ - [Segmentation model: Aquilign Multilingual Segmenter](https://huggingface.co/ProMeText/aquilign-multilingual-segmenter)
+ - [Segmentation dataset](https://github.com/ProMeText/multilingual-segmentation-dataset)
+ - [Parallel corpora](https://github.com/ProMeText/parallelium-scriptures-alignment-dataset)