---
language:
- he
- el
license: mit
tags:
- biblical-hebrew
- biblical-greek
- morphology
- parsing
- mt5
- seq2seq
datasets:
- LoveJesus/biblical-tutor-dataset-chirho
pipeline_tag: text2text-generation
model-index:
- name: biblical-parser-chirho
  results:
  - task:
      type: text2text-generation
      name: Morphological Parsing
    dataset:
      type: LoveJesus/biblical-tutor-dataset-chirho
      name: Biblical Tutor Dataset (Chirho)
    metrics:
    - type: exact_match
      value: 0.525
      name: Exact Match
    - type: f1
      value: 0.886
      name: Average Tag F1
---

# Biblical Morphological Parser (mT5-small)

*For God so loved the world that he gave his only begotten Son, that whoever believes in him should not perish but have eternal life. - John 3:16*

## What This Does

This model parses biblical Hebrew and Greek words into their morphological components: part of speech, stem, lemma, tense, person, gender, number, and English gloss.

## Usage

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("LoveJesus/biblical-parser-chirho")
model = AutoModelForSeq2SeqLM.from_pretrained("LoveJesus/biblical-parser-chirho")

# Parse a Hebrew word
input_text = 'parse [hebrew]: בָּרָא [GEN 1:1] context: בְּרֵאשִׁית אֱלֹהִים'
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected: "class:verb | stem:qal | lemma:ברא | morph:... | person:3 | gender:m | number:s | gloss:he created"

# Parse a Greek word
input_text = 'parse [greek]: λόγος [JHN 1:1] context: ἐν ἀρχῇ ἦν'
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Input Format

```
parse [{language}]: {word} [{verse_ref}] context: {surrounding_words}
```

- `{language}`: `hebrew` or `greek`
- `{word}`: the biblical word in its original script
- `{verse_ref}`: book chapter:verse reference (e.g. `GEN 1:1`)
- `{surrounding_words}`: two words before and after, for disambiguation
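
Assembling this format by hand is error-prone, so a small helper can be useful. This is an illustrative sketch; `build_input` is not part of the released code.

```python
def build_input(language, word, verse_ref, context_words):
    """Assemble a model input string in the documented format.

    context_words: the surrounding words (two before and two after
    the target word, when available).
    """
    context = " ".join(context_words)
    return f"parse [{language}]: {word} [{verse_ref}] context: {context}"

# Example: the Hebrew verb from Genesis 1:1
text = build_input("hebrew", "בָּרָא", "GEN 1:1", ["בְּרֵאשִׁית", "אֱלֹהִים"])
print(text)
```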

## Output Format

Pipe-separated morphological tags:

```
class:{pos} | stem:{stem} | lemma:{lemma} | morph:{code} | person:{p} | gender:{g} | number:{n} | gloss:{english}
```
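
Downstream code usually wants these tags as a dictionary rather than a raw string. A minimal sketch of such a parser (`parse_output` is a hypothetical helper, not part of the released code):

```python
def parse_output(text):
    """Split the model's pipe-separated output into a {tag: value} dict."""
    tags = {}
    for field in text.split("|"):
        field = field.strip()
        if ":" in field:
            # Split on the first colon only, so values may contain colons.
            key, value = field.split(":", 1)
            tags[key.strip()] = value.strip()
    return tags

example = "class:verb | stem:qal | lemma:ברא | person:3 | gender:m | number:s | gloss:he created"
print(parse_output(example)["gloss"])  # he created
```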

## Training Data

- **Macula Hebrew** (Clear-Bible): ~425K Old Testament words with morphology and glosses
- **Macula Greek SBLGNT** (Clear-Bible): ~138K New Testament words with morphology and glosses
- Subsampled to ~200K words (~100K per language), stratified by book
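
Stratified subsampling by book could look roughly like the following. This is an illustrative sketch only; the field names and sampling details are assumptions, not the actual preprocessing code.

```python
import random

def stratified_subsample(words, target, seed=42):
    """Sample ~target words, allocating quota to each book in
    proportion to its share of the corpus (stratified by book)."""
    rng = random.Random(seed)
    by_book = {}
    for w in words:
        by_book.setdefault(w["book"], []).append(w)
    total = len(words)
    sample = []
    for book, items in by_book.items():
        # Proportional quota, capped at the book's actual size.
        k = min(len(items), round(target * len(items) / total))
        sample.extend(rng.sample(items, k))
    return sample
```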

## Model Details

| Property | Value |
|----------|-------|
| Base model | google/mt5-small (300M params) |
| Architecture | Encoder-decoder (seq2seq) |
| Languages | Biblical Hebrew, Koine Greek |
| Training | 5 epochs, lr = 3e-4, batch size 32 |
| Hardware | NVIDIA A100/H200 GPU |

## Limitations

- Trained on Macula morphological annotations, which may not match all scholarly traditions
- Parses individual words; it does not perform full syntactic analysis
- Performance may vary on words under-represented in the training data

## Evaluation Results

Evaluated on a held-out test set of ~20K word-level parsing examples.

### Overall Metrics

| Metric | Score |
|--------|-------|
| **Exact Match** (all tags correct) | **0.525** |
| **Average Tag F1** (across all tags) | **0.886** |

### Per-Tag F1

| Tag | F1 |
|-----|-----|
| class (POS) | 0.963 |
| number | 0.966 |
| POS | 0.958 |
| lemma | 0.935 |
| person | 0.933 |
| gender | 0.928 |
| type | 0.900 |
| morph | 0.890 |
| state | 0.878 |
| stem | 0.859 |
| gloss | 0.539 |

### Per-Language Exact Match

| Language | Exact Match |
|----------|-------------|
| Hebrew | 0.514 |
| Greek | 0.559 |

> The `gloss` tag (English translation) is the hardest to predict exactly, pulling down the overall exact-match rate. The model achieves strong F1 on structural/morphological tags (class, number, POS, person, and gender are all above 0.92).

---

Built with love for Jesus. Published by [LoveJesus](https://huggingface.co/LoveJesus).
Part of the [bible.systems](https://bible.systems) project.