	---
	# For God so loved the world that he gave his only begotten Son,
	# that whoever believes in him should not perish but have eternal life. - John 3:16
	language: en
	license: mit
	tags:
	- bible
	- chirho
	- intertextual
	- cross-reference
	- classification
	- roberta
	- bible-ml
	datasets:
	- LoveJesus/intertextual-dataset-chirho
	base_model: roberta-base
	metrics:
	- f1
	pipeline_tag: text-classification
	---

	# Intertextual Classifier (Chirho)

	RoBERTa-base fine-tuned for classifying biblical cross-reference connection types.

	> "For God so loved the world that he gave his only begotten Son, that whoever believes in him should not perish but have eternal life." - John 3:16

	## Model Description

	Given two Bible passages that are cross-referenced, this model classifies the type of intertextual connection between them into one of 7 categories:

	| Label | Description |
	|-------|-------------|
	| `thematic_parallel` | Passages share the same theme or topic |
	| `direct_quote` | One passage directly quotes another |
	| `prophetic_fulfillment` | OT prophecy fulfilled in NT |
	| `typological` | OT type foreshadowing NT antitype |
	| `contrast` | Passages present contrasting ideas |
	| `historical_narrative` | Shared historical events or figures |
	| `theological_expansion` | Later passage expands on earlier theology |

	## Training Details

	- Base model: `roberta-base` (125M params)
	- Training data: 19,164 balanced examples, labeled by Grok from Treasury of Scripture Knowledge (TSK) cross-references
	- Class balancing: WeightedTrainer (CrossEntropyLoss with inverse-frequency class weights) plus majority-class capping
	- Epochs: 8
	- Best epoch: 8 (by eval loss)
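
The inverse-frequency weighting mentioned above can be sketched in plain Python. This is a minimal illustration, not the training code used for this model: the formula `N / (K * count)` is the common "balanced" heuristic and is an assumption here, and the label counts are a toy distribution.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return {label: weight} with weight = N / (K * count).

    N is the number of examples and K the number of classes, so rarer
    classes get larger weights. (Assumed formula, see the note above.)
    """
    counts = Counter(labels)
    total = len(labels)
    num_classes = len(counts)
    return {label: total / (num_classes * count) for label, count in counts.items()}


# Toy label distribution (hypothetical, not the real training set).
toy_labels = ["thematic_parallel"] * 6 + ["direct_quote"] * 3 + ["typological"] * 1
weights = inverse_frequency_weights(toy_labels)

for label, weight in sorted(weights.items()):
    print(f"{label}: {weight:.3f}")
```

Weights like these, ordered by label id, are what one would typically pass as the `weight` argument of `torch.nn.CrossEntropyLoss` inside a custom trainer.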

	## Metrics (v2 - Retrained Feb 2026)

	| Metric | Value |
	|--------|-------|
	| Macro F1 | 0.761 |
	| Micro F1 | 0.853 |
	| Precision | 0.665 |
	| Recall | 0.939 |
	| Eval Loss | 0.501 |
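
For readers unfamiliar with the two F1 variants, here is a small sketch of how macro and micro F1 differ on imbalanced data. The labels and predictions are toy values chosen for illustration; they are not taken from this model's evaluation set.

```python
def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    per_class_f1 = []
    total_tp = total_fp = total_fn = 0
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        total_tp += tp
        total_fp += fp
        total_fn += fn
    # Macro: unweighted mean of per-class F1, so every class counts equally.
    macro = sum(per_class_f1) / len(per_class_f1)
    # Micro: F1 over pooled counts, so frequent classes dominate.
    micro_denom = 2 * total_tp + total_fp + total_fn
    micro = 2 * total_tp / micro_denom if micro_denom else 0.0
    return macro, micro


# Toy case: a classifier that always predicts the majority class.
y_true = ["thematic_parallel"] * 8 + ["contrast"] * 2
y_pred = ["thematic_parallel"] * 10

macro_f1, micro_f1 = f1_scores(y_true, y_pred)
print(f"macro={macro_f1:.3f} micro={micro_f1:.3f}")
# macro=0.444 micro=0.800
```

Macro F1 is the stricter number for this task because a model cannot score well on it while ignoring the rare connection types.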

	### Improvement over v1

	| Metric | v1 (Original) | v2 (Retrained) | Change |
	|--------|---------------|----------------|--------|
	| Macro F1 | 0.42 | 0.761 | +81% |
	| Micro F1 | 0.72 | 0.853 | +18% |
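
The "Change" column is a relative improvement over v1, not a difference in absolute F1 points. A quick check of the arithmetic, using only the values in the table:

```python
def relative_change_percent(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


macro_change = relative_change_percent(0.42, 0.761)
micro_change = relative_change_percent(0.72, 0.853)

print(f"Macro F1: +{macro_change:.0f}%")  # Macro F1: +81%
print(f"Micro F1: +{micro_change:.0f}%")  # Micro F1: +18%
```

In absolute terms the gains are 0.341 (macro) and 0.133 (micro) F1 points.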

	Root cause of v1 weakness: severe class imbalance (76%, dominated by `thematic_parallel`). Fixed with:
	1. Balanced dataset (cap majority class, keep all minority examples)
	2. WeightedTrainer with inverse-frequency class weights
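
The first fix, capping the majority class while keeping every minority example, might look like the sketch below. The function name, the `label` field, the cap value and the toy counts are all hypothetical; the actual dataset preparation script is not shown in this card.

```python
import random
from collections import Counter, defaultdict


def cap_majority_classes(examples, cap, seed=0):
    """Keep at most `cap` examples per label; smaller classes are kept whole."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example["label"]].append(example)

    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    balanced = []
    for label, items in by_label.items():
        if len(items) > cap:
            items = rng.sample(items, cap)  # downsample only oversized classes
        balanced.extend(items)
    rng.shuffle(balanced)
    return balanced


# Toy dataset (hypothetical counts): one dominant class, two small ones.
toy = (
    [{"label": "thematic_parallel"}] * 80
    + [{"label": "direct_quote"}] * 12
    + [{"label": "typological"}] * 8
)
balanced = cap_majority_classes(toy, cap=20)
balanced_counts = Counter(example["label"] for example in balanced)
print(balanced_counts)
```

Capping reduces the imbalance but does not remove it, which is presumably why it is combined with class weights in the loss.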

	## Usage

	```python
	from transformers import pipeline

	# top_k=None returns a score for every label, not just the best one.
	classifier = pipeline(
	    "text-classification",
	    model="LoveJesus/intertextual-classifier-chirho",
	    top_k=None,
	)

	# Both passages go into one string: reference plus verse text, separated by [SEP].
	text = "[CLS] Genesis 3:15 And I will put enmity between thee and the woman, and between thy seed and her seed; it shall bruise thy head, and thou shalt bruise his heel. [SEP] Galatians 4:4 But when the fulness of the time was come, God sent forth his Son, made of a woman, made under the law [SEP]"

	result = classifier(text)
	print(result)
	# Example output: [{'label': 'prophetic_fulfillment', 'score': 0.95}, ...]
	```
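
To avoid hand-writing the input string, the format used in the example above can be wrapped in small helpers. This is a sketch: `format_pair` and `top_label` are hypothetical names, the input layout is inferred from the single example in this card, and the scores in `mock_result` are made up so the snippet runs without downloading the model.

```python
def format_pair(ref_a, text_a, ref_b, text_b):
    """Build the single-string input in the layout shown in the usage example."""
    return f"[CLS] {ref_a} {text_a} [SEP] {ref_b} {text_b} [SEP]"


def top_label(result):
    """Return (label, score) of the best-scoring entry in a pipeline result.

    Accepts either a flat list of {'label', 'score'} dicts or the same list
    wrapped in an outer list, since the nesting can vary by transformers version.
    """
    scores = result[0] if result and isinstance(result[0], list) else result
    best = max(scores, key=lambda item: item["score"])
    return best["label"], best["score"]


text = format_pair(
    "Genesis 3:15",
    "And I will put enmity between thee and the woman",
    "Galatians 4:4",
    "But when the fulness of the time was come, God sent forth his Son",
)

# Mock output with made-up scores, standing in for classifier(text).
mock_result = [
    {"label": "thematic_parallel", "score": 0.03},
    {"label": "prophetic_fulfillment", "score": 0.95},
    {"label": "typological", "score": 0.02},
]
label, score = top_label(mock_result)
print(label, score)
```

With the real model, `top_label(classifier(text))` would replace the mock call.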

	## Part of Bible ML Pipeline

	This model is part of the [Intertextual Reference Network](https://huggingface.co/spaces/LoveJesus/intertextual-reference-network-chirho) pipeline:

	1. Embedder ([LoveJesus/intertextual-embedder-chirho](https://huggingface.co/LoveJesus/intertextual-embedder-chirho)): Finds similar passages
	2. Classifier (this model): Classifies the connection type

	Dataset: [LoveJesus/intertextual-dataset-chirho](https://huggingface.co/datasets/LoveJesus/intertextual-dataset-chirho)