rkevan committed · Commit fdb31b3 · verified · 1 Parent(s): 1625f04

Upload README.md with huggingface_hub

  1. README.md +156 -0
README.md ADDED
@@ -0,0 +1,156 @@
---
base_model: meta-llama/Llama-3.2-3B-Instruct
license: llama3.2
language:
- en
library_name: transformers
tags:
- llama
- gguf
- summarization
- fine-tuned
- unsloth
- trl
- sft
model_name: leader-comment-summarizer
pipeline_tag: text-generation
quantized_by: llama.cpp
---

# leader-comment-summarizer: Ecclesiastical Comment Summarization (GGUF)

A fine-tuned Llama 3.2 3B Instruct model that condenses ecclesiastical leader comments into concise, assignment-relevant summaries for missionary placement meetings. It strips endorsement boilerplate and focuses on actionable details (languages, health, skills, concerns).

## Model Details

| Property | Value |
|----------|-------|
| **Base model** | `meta-llama/Llama-3.2-3B-Instruct` |
| **Fine-tuning method** | QLoRA via Unsloth (rank=16, alpha=32) |
| **Training framework** | TRL SFTTrainer, completion-only loss |
| **Training data** | 1,464 PII-obfuscated leader comments with gold-standard summaries |
| **Quantization** | Q4_K_M (1.9 GB) via llama.cpp |
| **VRAM requirement** | ~3 GB (Q4_K_M) |
| **Output format** | 30-40 word plain-text summary |

## Files

| File | Size | Description |
|------|------|-------------|
| `model-q4km.gguf` | 1.9 GB | Q4_K_M quantization (recommended) |
| `Modelfile` | n/a | Ollama Modelfile with system prompt embedded |
| `system_prompt.txt` | n/a | System prompt (for API usage without Modelfile) |

## Quick Start: Ollama

```bash
# Download the GGUF and Modelfile, then:
ollama create leader-summarizer -f Modelfile

# Call via API:
curl -s http://localhost:11434/api/chat -d '{
  "model": "leader-summarizer",
  "stream": false,
  "messages": [
    {"role": "user", "content": "[[Name]] is a wonderful young man with a strong testimony. He speaks fluent Spanish from living in [[City]] for three years. Has mild anxiety that is well-managed with medication. Very independent and hardworking. Parents served in the [[Mission]] mission."}
  ]
}'
```

Example response (output is sampled, so exact wording may vary):
```
Fluent Spanish from three years in a Spanish-speaking city. Mild anxiety, well-managed with medication. Independent and hardworking. Family mission service background.
```

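## Quick Start: Python

```python
from pathlib import Path

from llama_cpp import Llama

# Raw leader comment to summarize (PII placeholders such as [[Name]] are fine).
leader_comment_text = "[[Name]] speaks fluent Spanish and has well-managed mild anxiety."

# The raw GGUF has no embedded system prompt, so load it from the bundled file.
system_prompt = Path("system_prompt.txt").read_text(encoding="utf-8")

# n_gpu_layers=-1 offloads all layers to the GPU when one is available.
llm = Llama(model_path="model-q4km.gguf", n_ctx=2048, n_gpu_layers=-1)
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": leader_comment_text},
    ],
    temperature=0.3,
    top_p=0.9,
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```
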
## Input/Output Format

**Input:** Raw leader comment text (may contain PII placeholders like `[[Name]]`, `[[City]]`).

**Output:** A 30-40 word plain-text summary focusing on assignment-relevant details.

### What the Model Keeps
- Languages spoken and proficiency
- Health/medical conditions and management
- Specific skills (musical, technical, athletic)
- Concerns about independence or readiness
- Personality traits affecting placement
- Service preferences

### What the Model Strips
- General endorsement ("strong testimony", "wonderful young man")
- Worthiness/recommend statements
- Boilerplate language that applies to all candidates

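The 30-40 word output target is easy to check mechanically after generation. The following is a minimal sketch, assuming whitespace-delimited word counting (this card does not state how words were counted). `word_count` and `within_target` are hypothetical helper names, not part of this repository.

```python
def word_count(summary: str) -> int:
    """Count whitespace-delimited words (an assumed counting rule)."""
    return len(summary.split())


def within_target(summary: str, low: int = 30, high: int = 40) -> bool:
    """Return True when the summary length falls inside [low, high] words."""
    return low <= word_count(summary) <= high


if __name__ == "__main__":
    sample = "Fluent Spanish from three years abroad. Mild anxiety, well-managed with medication."
    print(word_count(sample), within_target(sample))
```

A caller could use `within_target` to flag summaries for manual review or to retry generation; whether a retry is worthwhile depends on how strictly the downstream meeting format enforces length.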
## Important Usage Notes

- **The Modelfile embeds the system prompt.** When using Ollama with the provided Modelfile, do not send a separate system message; send only the comment as the user message.
- **If using the raw GGUF** (without the Modelfile), include the contents of `system_prompt.txt` as the system message in every request.
- **Temperature 0.3** produces consistent, focused summaries. Higher values introduce variability.
- **max_tokens 128** is sufficient, since summaries are typically 30-40 words.

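The notes above can be combined into a small client for the Ollama `/api/chat` endpoint. This is a sketch using only the Python standard library, and it assumes a local Ollama server on the default port and a model created as `leader-summarizer`. `build_payload` and `summarize` are hypothetical helper names. The `options` keys (`temperature`, `top_p`, `num_predict`) are Ollama's names for the sampling settings used elsewhere in this card.

```python
import json
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed default local endpoint


def build_payload(comment: str, system_prompt: Optional[str] = None) -> dict:
    """Build a non-streaming /api/chat request body.

    Pass system_prompt only when the model was NOT created from the provided
    Modelfile, because the Modelfile already embeds the system prompt.
    """
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": comment})
    return {
        "model": "leader-summarizer",
        "stream": False,
        "messages": messages,
        # Sampling settings from this card, expressed as Ollama options.
        "options": {"temperature": 0.3, "top_p": 0.9, "num_predict": 128},
    }


def summarize(comment: str, system_prompt: Optional[str] = None) -> str:
    """POST the comment to a running Ollama server and return the summary text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(comment, system_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as reply:
        return json.loads(reply.read())["message"]["content"].strip()
```

Calling `summarize("[[Name]] speaks fluent Spanish ...")` requires the server to be running; `build_payload` can be inspected on its own to confirm what would be sent.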
## Training Details

- **Method:** QLoRA with Unsloth on WSL2 Ubuntu 24.04
- **GPU:** NVIDIA RTX 1000 Ada (6 GB VRAM)
- **Epochs:** 3
- **Learning rate:** 2e-4 with cosine scheduler
- **Effective batch size:** 8 (batch=2, grad_accum=4)
- **Final training loss:** 0.4296
- **Final eval loss:** 0.7495
- **Loss type:** Completion-only (only trains on assistant response tokens)
- **LoRA targets:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj

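The batch and LoRA figures above imply a few derived quantities. The arithmetic below is a rough sketch, not the training script: it assumes all 1,464 comments are in the training split (the card does not say whether the 258 held-out examples are counted separately), and the exact optimizer step count depends on how the trainer rounds the final partial batch.

```python
import math

TRAIN_EXAMPLES = 1464   # assumption: every listed comment is a training example
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 4
EPOCHS = 3
LORA_RANK = 16
LORA_ALPHA = 32

# One optimizer step consumes batch * grad_accum examples.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

# Standard LoRA scales the low-rank update by alpha / rank.
lora_scale = LORA_ALPHA / LORA_RANK

print(effective_batch, steps_per_epoch, total_steps, lora_scale)
# prints: 8 183 549 2.0
```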
## Evaluation Results (258 held-out examples)

| Metric | Fine-tuned | Baseline (untuned 3B) |
|--------|------------|----------------------|
| Average word count | 36.4 | 33.9 |
| In 25-45 word range | 69.0% | 91.9% |
| Endorsement boilerplate leak | **10.1%** | 18.6% |
| Format compliance | 100% | 100% |

Key win: the fine-tuned model leaks endorsement boilerplate in 10.1% of summaries versus 18.6% for the baseline, a relative reduction of about 46%.

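The card does not show how these metrics were computed. As an illustration only, the sketch below computes the same three kinds of numbers from a list of generated summaries. It assumes whitespace word counting and a simple phrase match for leak detection, and the two-phrase list is taken from the examples in "What the Model Strips" rather than from the actual evaluation script. `leaks_endorsement` and `evaluate` are hypothetical names.

```python
from typing import Dict, List

# Illustrative phrases only; the real evaluation may use a different detector.
ENDORSEMENT_PHRASES = ("strong testimony", "wonderful young man")


def leaks_endorsement(summary: str) -> bool:
    """Return True if any known endorsement phrase appears in the summary."""
    lowered = summary.lower()
    return any(phrase in lowered for phrase in ENDORSEMENT_PHRASES)


def evaluate(summaries: List[str], low: int = 25, high: int = 45) -> Dict[str, float]:
    """Compute average length, in-range percentage, and leak percentage."""
    if not summaries:
        raise ValueError("need at least one summary")
    n = len(summaries)
    counts = [len(s.split()) for s in summaries]
    return {
        "avg_words": sum(counts) / n,
        "in_range_pct": 100 * sum(low <= c <= high for c in counts) / n,
        "leak_pct": 100 * sum(leaks_endorsement(s) for s in summaries) / n,
    }
```

A phrase list like this will miss paraphrased endorsements, so a leak rate measured this way is a lower bound on what a human reviewer would flag.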
## Privacy Note

All training data was PII-obfuscated before use. Names, locations, schools, wards, and missions were replaced with placeholders such as `[[Name]]` and `[[City]]`. The model saw no real PII during fine-tuning.

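Because the model was tuned on placeholder text, it can be useful to confirm that an input comment carries placeholders before sending it. The sketch below lists the placeholder labels found in a comment. It assumes every placeholder has the form `[[Label]]` with a single-word label, which matches the examples in this card but is not a documented specification, and `placeholder_types` is a hypothetical helper name.

```python
import re
from typing import List

# Assumed placeholder shape: double square brackets around one word.
PLACEHOLDER = re.compile(r"\[\[(\w+)\]\]")


def placeholder_types(text: str) -> List[str]:
    """Return distinct placeholder labels in order of first appearance."""
    seen: List[str] = []
    for label in PLACEHOLDER.findall(text):
        if label not in seen:
            seen.append(label)
    return seen


if __name__ == "__main__":
    comment = "[[Name]] lived in [[City]]; parents served in the [[Mission]] mission."
    print(placeholder_types(comment))
    # prints: ['Name', 'City', 'Mission']
```

This check only reports placeholders that are present. It cannot detect real names or locations that were left unmasked, so it does not replace the obfuscation step itself.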
## Limitations

- Trained on a specific style of ecclesiastical leader comments. It may not generalize to other summarization tasks without additional training.
- The endorsement leak rate is 10.1%, so some boilerplate still passes through.
- Word count compliance (69.0% in the 25-45 word range) is lower than the untuned model's (91.9%), a tradeoff accepted for better filtering.

## Source Code

Training scripts and data pipeline:
[github.com/rkevan/AI-Experiments](https://github.com/rkevan/AI-Experiments)

## Citation

```bibtex
@misc{leader-comment-summarizer-2026,
  title={leader-comment-summarizer: Fine-tuned Llama 3.2 3B for Ecclesiastical Comment Summarization},
  author={Robert Kevan},
  year={2026},
  url={https://huggingface.co/rkevan/leader-comment-summarizer}
}
```