Paterikon-3B
Released on the Feast of the Triumph of Orthodoxy, First Sunday of Great Lent, 2026.
Overview
Paterikon-3B is a domain-adapted language model for Orthodox Christian theology, produced by continued pre-training (CPT) of Qwen2.5-3B-Instruct on a 116M-token corpus of Church Father writings, lives of saints, and theological texts drawn primarily from the Russian Orthodox tradition.
The model has absorbed the voice and vocabulary of patristic literature — the cadence of St. John Chrysostom, the precision of St. Basil the Great, the mystical theology of St. Gregory Palamas, the ascetic teaching of the Philokalia and the Optina Elders. It is intended as a foundation for downstream instruction-tuning on Orthodox theological Q&A.
Note: This is the CPT (pre-training) checkpoint, not a full instruction-tuned model. It excels at patristic text continuation and domain fluency. A supervised fine-tuned (SFT) version trained on Q&A pairs is in active development.
| Attribute | Value |
|---|---|
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Training | Full fine-tune (continued pre-training on raw text) |
| Parameters | 3.09 billion |
| Languages | Russian (primary), English, Greek/Latin (patristic excerpts) |
| Domain | Orthodox Christian patristic theology |
| Training tokens | ~116M |
| Training corpus | orthodox-patristic-corpus |
| License | Apache 2.0 |
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "jayfurzy/paterikon-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Instruction-style prompt (uses the base Qwen chat template).
# System: "You are an Orthodox theologian answering questions in the spirit of the Holy Fathers."
# User: "Explain St. Gregory Palamas's teaching on the light of Tabor."
messages = [
    {"role": "system", "content": "Ты — православный богослов, отвечающий на вопросы в духе святых отцов."},
    {"role": "user", "content": "Объясни учение святителя Григория Паламы о Фаворском свете."},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))

# English prompt
messages = [
    {"role": "system", "content": "You are an Orthodox Christian theologian, responding in the spirit of the Holy Fathers."},
    {"role": "user", "content": "What is the teaching of St. Gregory Palamas on the divine energies?"},
]
# Apply the chat template and generate exactly as above.
```
Model Details
Training Approach
This model was trained using full continued pre-training — all 3.09B parameters were updated, not just a low-rank adapter. This allows deeper domain absorption than QLoRA or LoRA-based approaches.
We deliberately chose a smaller, fully fine-tuned model (3B) over a larger LoRA-adapted model (7B) because empirical results showed that full-weight adaptation on the patristic domain gave lower perplexity and more authentic voice reproduction than partial adaptation at larger scale.
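To make the "partial adaptation" contrast concrete: QLoRA at rank 32 on a 7B model trains only about 1% of the weights. A back-of-envelope estimate of that fraction, using layer dimensions taken from the publicly documented Qwen2.5-7B config (these dimensions are assumptions for illustration, not figures from this model card):

```python
# Approximate Qwen2.5-7B dimensions (assumptions from the public config):
hidden = 3584   # hidden size
inter = 18944   # MLP intermediate size
kv = 512        # k/v projection output dim (GQA: 4 KV heads x 128 head dim)
layers = 28
rank = 32

# LoRA adds r * (d_in + d_out) parameters per adapted linear layer.
per_layer = (
    rank * (hidden + hidden)    # q_proj
    + rank * (hidden + kv)      # k_proj
    + rank * (hidden + kv)      # v_proj
    + rank * (hidden + hidden)  # o_proj
    + rank * (hidden + inter)   # gate_proj
    + rank * (hidden + inter)   # up_proj
    + rank * (inter + hidden)   # down_proj
)
lora_params = per_layer * layers
total_params = 7.62e9  # approximate Qwen2.5-7B parameter count

print(f"LoRA params: {lora_params / 1e6:.1f}M "
      f"({100 * lora_params / total_params:.2f}% of weights)")
```

The result lands at roughly 80M trainable parameters, i.e. on the order of 1% of the 7B model, versus 100% of the 3.09B parameters updated here.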
| Approach evaluated | CPT loss | Notes |
|---|---|---|
| Qwen2.5-7B QLoRA rank=32 | ~1.70 (projected) | Only 1% of weights updated |
| Qwen2.5-3B full fine-tune | 1.47 | All weights updated — selected |
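The CPT losses above translate directly into perplexities (perplexity = exp(cross-entropy loss)), which makes the gap between the two approaches easier to interpret:

```python
import math

# Cross-entropy loss -> perplexity: ppl = exp(loss)
full_ft_loss = 1.47  # Qwen2.5-3B full fine-tune (from the table above)
qlora_loss = 1.70    # Qwen2.5-7B QLoRA, projected (from the table above)

full_ft_ppl = math.exp(full_ft_loss)
qlora_ppl = math.exp(qlora_loss)

print(f"full fine-tune perplexity: {full_ft_ppl:.2f}")  # ~4.35
print(f"QLoRA perplexity:          {qlora_ppl:.2f}")    # ~5.47
```

In other words, the full fine-tune is, on average, choosing among ~4.3 effective next-token candidates versus ~5.5 for the projected QLoRA baseline.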
Training Configuration
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-3B-Instruct |
| Training type | Full fine-tune (continued pre-training) |
| Sequence length | 1,792 tokens |
| Batch size | 1 per device |
| Gradient accumulation | 16 steps (effective batch = 16) |
| Learning rate | 5e-5 |
| LR schedule | Cosine with 1% warmup |
| Optimizer | Adafactor |
| Precision | bfloat16 |
| Attention | SDPA (scaled dot-product attention) |
| Gradient checkpointing | Yes |
| Epochs | 1 |
| Training steps | 6,799 |
| Hardware | 1× NVIDIA RTX 3090 (24GB VRAM) |
| Training time | ~22 hours |
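The table above maps onto Hugging Face `TrainingArguments` keyword names roughly as follows. This is a reconstruction for illustration, not the authors' actual training script; the argument names follow the transformers API:

```python
# Hypothetical kwargs mirroring the configuration table; names follow the
# Hugging Face transformers TrainingArguments API.
training_args = dict(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,            # "cosine with 1% warmup"
    optim="adafactor",
    bf16=True,
    gradient_checkpointing=True,
    num_train_epochs=1,
)

# On a single GPU, effective batch size = per-device batch * accumulation steps.
effective_batch = (training_args["per_device_train_batch_size"]
                   * training_args["gradient_accumulation_steps"])
print(effective_batch)  # 16
```

Adafactor plus gradient checkpointing is what makes a full 3B-parameter fine-tune fit in 24GB of VRAM at this sequence length.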
Training Results
| Metric | Value |
|---|---|
| Final train loss | 0.459 |
| Final step loss | ~1.47 |
| Token accuracy (final epoch) | ~65.8% |
A token accuracy of 65.8% on this domain is meaningful: it indicates the model has substantially absorbed the distribution of patristic language. For comparison, the base Qwen2.5-3B-Instruct scored roughly 55–58% on the same text before CPT; a model with no exposure to the domain would score lower still.
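Token accuracy here means the fraction of positions at which the model's top-1 (greedy) next-token prediction matches the actual next token. A minimal sketch of the computation on toy data (the token ids are made up):

```python
def token_accuracy(predicted_ids, target_ids):
    """Fraction of positions where the greedy prediction equals the target token."""
    assert len(predicted_ids) == len(target_ids)
    hits = sum(p == t for p, t in zip(predicted_ids, target_ids))
    return hits / len(target_ids)

# Toy example: the model's argmax predictions vs. the true next tokens.
preds   = [101, 7, 42, 500, 13, 99, 8, 21, 303, 5]
targets = [101, 7, 42, 501, 13, 99, 9, 21, 303, 5]
print(token_accuracy(preds, targets))  # 0.8
```

In practice the predictions come from `logits.argmax(-1)` shifted against the labels, averaged over the whole evaluation set.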
Training Data
Paterikon-3B was trained on the Orthodox Patristic Corpus, a 116M-token collection assembled from:
- 786,000 patristic text passages organized by theological principle, drawn from 123 authors
- 7 full-length patristic works (~3M tokens)
- 55 curated topical corpora (~2M tokens)
Primary sources include writings of the Holy Fathers from the first through twentieth centuries, crawled and structured from the Azbyka.ru Orthodox library, the Christian Classics Ethereal Library (CCEL), and other public-domain Orthodox text collections.
Key authors represented:
St. John Chrysostom · St. Basil the Great · St. Gregory the Theologian · St. Gregory Palamas · St. Athanasius the Great · St. Cyril of Alexandria · St. John of Damascus · St. Maximus the Confessor · St. Symeon the New Theologian · St. Theophan the Recluse · St. Ignatius Brianchaninov · St. Paisios Velichkovsky · The Optina Elders · St. Paisios the Athonite · St. Nicholas of Serbia · St. Silouan the Athonite · and 100+ more
Language distribution:
- Russian: ~98% (Synodal-era and contemporary Orthodox Russian)
- English: ~2% (CCEL translations of patristic texts)
- Greek/Latin: minimal (brief patristic excerpts and citations)
Qualitative Comparison
The following illustrates the difference in register between the CPT model and the Qwen3.5-27B teacher model used in downstream training:
Question: Explain the theology of St. Gregory Palamas on the distinction between divine essence and energies.
| Model | Response style |
|---|---|
| Paterikon-3B | Speaks from within the tradition — uses patristic cadences, "my child"-style pastoral address, cites hesychast experience as primary locus |
| Qwen3.5-27B (base) | Academic encyclopedic register — accurate but external, cites sources analytically |
This voice quality is precisely the purpose of domain CPT before instruction tuning: the model acquires the manner of speaking of the tradition, not merely facts about it.
Intended Use
Appropriate use cases:
- Foundation model for Orthodox theological assistants and chatbots
- Theological text completion and generation research
- Multilingual (Russian/English) patristic NLP research
- Building instruction-tuned models for catechism, spiritual reading assistance, theological Q&A
- Orthodox AI research exploring the application of language models to Christian tradition
Out of scope / limitations:
- This is a CPT checkpoint, not an instruction-tuned model. It requires further SFT for robust Q&A behavior
- Not suitable for pastoral or spiritual direction in place of a human priest or elder
- The corpus is heavily Russian Orthodox — Coptic, Syriac, Ethiopian, and Serbian traditions are underrepresented
- The model inherits Qwen2.5's knowledge cutoff and general-world biases alongside patristic specialization
- Should not be used to generate authoritative theological statements presented as Church teaching
Limitations and Biases
- Corpus skew: The training data is ~98% Russian, drawn primarily from Azbyka.ru. Eastern Orthodox traditions with less digitized Russian-language presence are underrepresented.
- Era skew: Corpus emphasizes 19th–20th century Russian patristic reception and the Optina Elder tradition. Earlier Church Fathers (1st–7th century) are present but in smaller proportion relative to their theological centrality.
- CPT degradation: Continued pre-training on domain text can partially erode general instruction-following capability. The model may give shorter or less structured answers than the Qwen2.5-3B-Instruct base. This is being addressed through active loop SFT (see below).
- Not a spiritual director: This model should never be used as a substitute for a human priest, confessor, or elder in matters of pastoral care.
Development Roadmap
Paterikon-3B is Phase 1 of a three-phase training pipeline:
| Phase | Description | Status |
|---|---|---|
| Phase 1 — CPT | Domain pre-training on 116M patristic tokens | ✅ Complete (this model) |
| Phase 2 — Active Loop SFT | Uncertainty-guided synthetic Q&A generation via Qwen3.5-27B teacher; 3 iterations | 🔄 In Progress |
| Phase 2.5 — Liturgical CPT | Additional CPT on Holy Scripture (KJV + Russian Synodal), Menologion (~1200 lives of saints), Horologion, Octoechos, Typikon, Prayer Book | 🔄 Corpus Built |
| Phase 3 — Full SFT | Supervised fine-tuning on 98K curated Orthodox Q&A + active loop pairs | ⏳ Pending |
The fully instruction-tuned model will be released as Paterikon-3B-Instruct upon completion.
Model Card Author
Justin Fursov
- Email: justin0106@pm.me
- LinkedIn: linkedin.com/in/justinfursov
- HuggingFace: huggingface.co/jayfurzy
- GitHub: github.com/jayfurz
Citation
If you use this model in research or applications, please cite:
```bibtex
@misc{paterikon3b2026,
  title  = {Paterikon-3B: A Domain-Adapted Language Model for Orthodox Christian Patristics},
  author = {Justin Fursov},
  year   = {2026},
  url    = {https://huggingface.co/jayfurzy/paterikon-3b},
  note   = {Released on the Feast of the Triumph of Orthodoxy, 2026}
}
```
Please also cite the base model:
```bibtex
@misc{qwen2025qwen25,
  title  = {Qwen2.5 Technical Report},
  author = {Qwen Team},
  year   = {2025},
  url    = {https://arxiv.org/abs/2412.15115}
}
```
Acknowledgements
- The Azbyka.ru Orthodox library for preserving and digitizing the patristic corpus
- The Christian Classics Ethereal Library (CCEL) for English patristic translations
- The Qwen team at Alibaba for releasing Qwen2.5-3B-Instruct under Apache 2.0
- The active loop methodology draws on arxiv:2512.00884
Сей день, егоже сотвори Господь, возрадуемся и возвеселимся в онь. "This is the day which the Lord has made; let us rejoice and be glad in it." — Psalm 118:24