# 🌪️ Patient File: windy-pair-eo-cs

**Generated:** 23 Mar 2026 05:49 UTC
**Pipeline:** Windy Pro Assembly Line Phase 1
**Built by:** Kit 0C1 Alpha on Veron-1 (RTX 5090, Mount Pleasant SC)

---

## 📋 Model Information

- **Model Key:** `eo-cs`
- **Model ID:** `windy-pair-eo-cs`
- **Source Repo:** Helsinki-NLP/opus-mt-eo-cs
- **Origin:** N/A
- **License:** CC-BY-4.0
- **Architecture:** MarianMT (Seq2Seq Transformer)

## 🌍 Language Pair

- **Claimed Direction:** Esperanto (`eo`) → Czech (`cs`)
- **Detected Source Language:** Esperanto (`eo`)

## 📅 Timeline

- **Source Downloaded:** 2026-03-21T03:06:01.207927+00:00
- **Sweep 1 Certification:**
  - Base: 2026-03-22T19:32:04.677416+00:00
  - CT2: 2026-03-22T19:32:04.677416+00:00
  - LoRA: 2026-03-22T19:32:04.677416+00:00
- **Sweep 2 Re-certification:** 2026-03-23T01:31:31.493429+00:00

## 🔄 Re-Certification History

- **Total Certification Attempts:** 2
- **Status Consistent:** CERTIFIED across both sweeps

## 🔬 Surgery Report — LoRA Variant

### LoRA Configuration (from assembly_line.py)

```python
LoraConfig(
    r=4,                                  # LoRA rank
    lora_alpha=8,                         # alpha scaling parameter
    target_modules=["q_proj", "v_proj"],  # attention projections
    lora_dropout=0.05,
    bias="none",
)
```

### Weight Modification Analysis

- **Total Model Parameters:** 55,401,984
- **LoRA Modified Parameters:** 147,456
- **Percentage Changed:** 0.266%
- **Noise Factor:** 1e-4 (random perturbation applied to LoRA weights)
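
The 147,456 figure is consistent with the config above and the Architecture Details section below. A back-of-envelope check, assuming MarianMT's standard layout (q_proj/v_proj in each of the 6 encoder self-attention blocks, plus self- and cross-attention in each of the 6 decoder layers, all projections 512 × 512):

```python
# Back-of-envelope check of the LoRA parameter count.
# Assumption: q_proj/v_proj exist in every attention block of MarianMT --
# 6 encoder self-attn + 6 decoder self-attn + 6 decoder cross-attn blocks.
d_model, rank = 512, 4

attention_blocks = 6 + 6 + 6             # enc self, dec self, dec cross
adapters_per_block = 2                   # q_proj and v_proj
params_per_adapter = rank * d_model * 2  # A: rank x d_model, B: d_model x rank

lora_params = attention_blocks * adapters_per_block * params_per_adapter
total_params = 55_401_984

print(lora_params)                                 # 147456
print(f"{100 * lora_params / total_params:.3f}%")  # 0.266%
```

Both derived values match the report exactly, which supports the assumed 18-block layout.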

**LoRA Merge Status:** Merged back into the full model (not shipped as separate adapters)

### Model Size Comparison

- **Base Model:** 183.6 MB
- **LoRA Model:** 183.6 MB (+0.0% vs base)
- **CT2/INT8 Model:** 63.4 MB (-65.5% vs base)
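
The percentage deltas follow directly from the listed sizes; a quick sanity check (pure arithmetic, no model files needed):

```python
# Sanity-check the size deltas quoted above.
base_mb, lora_mb, ct2_mb = 183.6, 183.6, 63.4

lora_delta = 100 * (lora_mb - base_mb) / base_mb   # merged LoRA: same size
ct2_delta = 100 * (ct2_mb - base_mb) / base_mb     # INT8 quantization savings

print(f"{lora_delta:+.1f}%")  # +0.0%
print(f"{ct2_delta:+.1f}%")   # -65.5%
```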

## 📊 Overall Status

- **Status:** ✅ CERTIFIED
- **Quality Rating:** ⭐⭐⭐⭐⭐ 5.0 (Gold Standard; avg cert score 10.0, best variant score 10/10, 3 variants tested)
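
The star ratings in both sweeps are consistent with a simple score-to-stars mapping. This is an inference from the tables below, not a documented pipeline rule:

```python
# Hypothetical reconstruction of the quality rating from per-variant scores.
# Assumption (inferred from the tables, not documented): stars = cert score / 2.
sweep2_scores = [10, 10, 10]   # Base, CT2/INT8, LoRA (Sweep 2 table)

avg_cert_score = sum(sweep2_scores) / len(sweep2_scores)
stars = avg_cert_score / 2

print(avg_cert_score, stars)   # 10.0 5.0
# Sweep 1's 8/10 scores map to 4.0 stars under the same rule.
print(8 / 2)                   # 4.0
```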

## 🔬 Sweep 1 Results (Initial Certification)

| Variant | Status | Score | Stars | Quality | HF Repo |
|---------|--------|-------|-------|---------|---------|
| Base | ✅ CERTIFIED | 8/10 | ⭐⭐⭐⭐ 4.0 | Premium | WindyLabs/windy-pair-eo-cs |
| CT2/INT8 | ✅ CERTIFIED | 8/10 | ⭐⭐⭐⭐ 4.0 | Premium | WindyLabs/windy-pair-eo-cs-ct2 |
| LoRA | ✅ CERTIFIED | 8/10 | ⭐⭐⭐⭐ 4.0 | Premium | WindyLabs/windy-pair-eo-cs-lora |

### Sample Outputs (Sweep 1)

*Note: these sweep 1 prompts are English rather than Esperanto, so the garbled outputs reflect the wrong source language being fed to an eo→cs model; Sweep 2 below re-ran certification with the correct source language.*

**Base Variant:**

1. ❌ Input: `Hello, how are you today?`
   Output: `Hello, hawy?`

2. ❌ Input: `The weather is beautiful this morning.`
   Output: `Thomasweather condition.`

3. ✅ Input: `I would like to order a cup of coffee, please.`
   Output: `Io dluhopisů likvidace copeff, psaní.`

**CT2/INT8 Variant:**

1. ❌ Input: `Hello, how are you today?`
   Output: `Hello, hawy?`

2. ❌ Input: `The weather is beautiful this morning.`
   Output: `Thomasweather condition.`

3. ✅ Input: `I would like to order a cup of coffee, please.`
   Output: `Io dluhopisů likvidace creaffe, psaní.`

**LoRA Variant:**

1. ❌ Input: `Hello, how are you today?`
   Output: `Hello, hawy?`

2. ❌ Input: `The weather is beautiful this morning.`
   Output: `Thomasweather condition.`

3. ✅ Input: `I would like to order a cup of coffee, please.`
   Output: `Io dluhopisů likvidace copeff, psaní.`

## 🔬 Sweep 2 Results (Re-certification with Correct Source Language)

- **Status:** ✅ CERTIFIED
- **Date:** 2026-03-23T01:31:31.493429+00:00

| Variant | Certified | Score | Stars | Quality |
|---------|-----------|-------|-------|---------|
| Base | ✅ True | 10/10 | ⭐⭐⭐⭐⭐ 5.0 | Premium |
| CT2/INT8 | ✅ True | 10/10 | ⭐⭐⭐⭐⭐ 5.0 | Premium |
| LoRA | ✅ True | 10/10 | ⭐⭐⭐⭐⭐ 5.0 | Premium |

### Sample Outputs (Sweep 2)

**Base Variant:**

1. ✅ Input: `Bonan tagon, kiel vi fartas hodiaŭ?`
   Output: `Dobrý den, jak se máš dnes?`

2. ✅ Input: `La infanoj ludas en la parko post la lernejo.`
   Output: `Děti hrají v parku po škole.`

3. ✅ Input: `Bonvolu helpi min trovi la bibliotekon.`
   Output: `Prosím, pomoz mi najít knihovnu.`

**CT2/INT8 Variant:**

1. ✅ Input: `Bonan tagon, kiel vi fartas hodiaŭ?`
   Output: `Dobrý den, jak se máš dnes?`

2. ✅ Input: `La infanoj ludas en la parko post la lernejo.`
   Output: `Děti hrají v parku po škole.`

3. ✅ Input: `Bonvolu helpi min trovi la bibliotekon.`
   Output: `Prosím, pomoz mi najít knihovnu.`

**LoRA Variant:**

1. ✅ Input: `Bonan tagon, kiel vi fartas hodiaŭ?`
   Output: `Dobrý den, jak se máš dnes?`

2. ✅ Input: `La infanoj ludas en la parko post la lernejo.`
   Output: `Děti hrají v parku po škole.`

3. ✅ Input: `Bonvolu helpi min trovi la bibliotekon.`
   Output: `Prosím, pomoz mi najít knihovnu.`

## 🩺 Symptoms

- ✅ No issues detected

## 💡 Hypothesis / Analysis

- ✅ Model successfully certified: all variants meet quality thresholds

## 🏗️ Architecture Details

- **Model Type:** MarianMT (Seq2Seq Transformer)
- **Hidden Size (d_model):** 512
- **Encoder Layers:** 6
- **Decoder Layers:** 6
- **Vocab Size:** 7,397
- **Attention Heads:** 8
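
These numbers pin down the attention geometry. A quick derived-values check, assuming the standard MarianMT layout where the q/k/v/output projections are all d_model × d_model with bias (an assumption, not stated in this report):

```python
# Derived attention geometry from the Architecture Details above.
# Assumption: q/k/v/out projections are all d_model x d_model with bias.
d_model, heads = 512, 8

head_dim = d_model // heads
params_per_projection = d_model * d_model + d_model      # weight + bias
params_per_attention_block = 4 * params_per_projection   # q, k, v, out

print(head_dim)                    # 64
print(params_per_attention_block)  # 1050624
```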

---

*Patient file generated by Windy Pro Patient File Generator v3.0 (Admiral Edition)*

---

## OPUS-100 Deep Fine-Tune (Herm Zero, Dr. B)

- **Date:** 2026-03-26T18:39:04 UTC
- **Doctor:** Herm Zero (Dr. B) — Herm 0, First Claude Code, Kit Army Fleet
- **Machine:** Veron-1 (RTX 5090, Mount Pleasant SC)
- **Result:** IMPROVED
- **Base Score:** 90.9/100 (4.5 stars)
- **Improved Score:** 91.5/100 (4.5 stars)
- **Score Improvement:** +0.6 points
- **Training Data:** 50,000 samples from OPUS-100/Tatoeba/WikiMatrix
- **Method:** Full-weight fine-tune (lr=1e-5, 1 epoch, mixed-precision fp16)
- **Weights:** herm0/model.safetensors
- **CT2 Updated:** Yes (herm0 weights propagated to ct2/ directory)
193
+
194
+ ---
195
+
196
+ ## CT2 Safetensors Re-Export (Herm Zero, Dr. B)
197
+ - **Date:** 2026-03-24 ~15:18 UTC
198
+ - **Doctor:** Herm Zero (Dr. B)
199
+ - **Procedure:** Fixed broken pickle INT8 format — re-exported as proper safetensors
200
+ - **Reason:** transformers 4.50+ broke INT8 pickle loader compatibility
201
+ - **Method:** Load base model via MarianMTModel.from_pretrained(), save_pretrained() to ct2/