mehta7408 committed on
Commit
0d599f4
·
verified ·
1 Parent(s): 3081b10

Upload 9 files

part_d_hitl_finetune.err ADDED
The diff for this file is too large to render. See raw diff
 
part_d_hitl_finetune.out ADDED
@@ -0,0 +1,103 @@
+ ╔══════════════════════════════════════════════════════════╗
+ ║ PART D — Targeted Human Review & Final Integration ║
+ ╚══════════════════════════════════════════════════════════╝
+
+ ============================================================
+ PART D — STEP 1: Exception-Based Human-in-the-Loop
+ ============================================================
+ All 100 gold labels already present. ✓
+ Green (1) : 1
+ Non-Green (0) : 99
+
+ Auto-accepted (Judge decision) : 100
+ Human-reviewed (low/error conf) : 0
+
+ ============================================================
+ PART D — STEP 2: Disagreement Report
+ ============================================================
+
+ ──────────────────────────────────────────────────
+ DISAGREEMENT SUMMARY
+ ──────────────────────────────────────────────────
+ Total claims labeled : 100
+ Auto-accepted (Judge decision) : 100
+ Required human intervention : 0
+ Human overrode Judge : 0
+
+ Judge confidence distribution:
+ judge_confidence
+ high 100
+
+ Final gold label distribution:
+ Green (1) : 1
+ Non-Green (0) : 99
+
+ ════════════════════════════════════════════════════════════
+ FOR YOUR REPORT / README:
+ ════════════════════════════════════════════════════════════
+
+ ## Part D — Disagreement Report
+
+ The Multi-Agent System (Advocate, Skeptic, Judge) labeled all 100 high-risk
+ patent claims selected via uncertainty sampling.
+
+ **Agent Setup:**
+ - **Advocate** (Local QLoRA fine-tuned Mistral-7B): Argued FOR green classification
+ - **Skeptic** (Groq Llama-3.1-8B): Argued AGAINST green classification
+ - **Judge** (Groq Llama-3.1-8B): Weighed both arguments, produced final verdict
+
+ | Metric | Value |
+ |--------|-------|
+ | Total claims | 100 |
+ | Auto-accepted (high/medium confidence) | 100 |
+ | Required human intervention | 0 |
+ | Human overrode Judge | 0 |
+ | Final Green labels | 1 |
+ | Final Non-Green labels | 99 |
+
+ The agents disagreed (low confidence / deadlock) on **0 out of 100 claims**.
+ All claims reached consensus — no human intervention was required.
+ The remaining 100 claims were auto-accepted based on the Judge's
+ high/medium confidence decision.
+
+ Gold labels exported to: hitl_green_100_gold_partd.csv
+
+ ============================================================
+ PART D — STEP 3: Merging Gold Labels into Dataset
+ ============================================================
+ Main dataset rows : 50,000
+ Gold labels merged : 100
+ Splits:
+ split
+ train_silver 40000
+ eval_silver 5000
+ pool_unlabeled 5000
+
+ Saved: patents_50k_green_with_gold_partd.parquet
+
+ ============================================================
+ PART D — STEP 4: Fine-Tuning PatentSBERTa
+ ============================================================
+ Training set (silver + gold) : 40,100
+ - from train_silver : 40,000
+ - from gold_100 : 100
+ - after dedup : 40,100
+ Eval set (eval_silver) : 5,000
+ Gold test set (gold_100) : 100
+
+ Label distribution in training set:
+ Green (1) : 20,001
+ Non-Green (0) : 20,099
+
+ Loading AI-Growth-Lab/PatentSBERTa...
+ Model loaded.
+ Tokenizing datasets...
+
+ ──────────────────────────────────────────────────
+ Training PatentSBERTa
+ ──────────────────────────────────────────────────
+ Epochs : 3
+ Learning rate : 2e-05
+ Batch size : 16
+ Train examples : 40,100
+
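As a rough check on the run size logged above, the number of optimizer steps can be estimated from the train-example count, batch size, and epoch count. This is a sketch under two assumptions the log does not state: a single device and no gradient accumulation. `estimate_steps` is a hypothetical helper, not part of the committed script.

```python
import math


def estimate_steps(num_examples: int, batch_size: int, epochs: int) -> tuple[int, int]:
    """Return (steps_per_epoch, total_steps), keeping the final partial batch."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# Values taken from the log: 40,100 examples, batch size 16, 3 epochs.
# Assumption: one device, gradient_accumulation_steps = 1.
per_epoch, total = estimate_steps(40_100, 16, 3)
print(per_epoch, total)
```

Under those assumptions this works out to 2,507 steps per epoch and 7,521 steps overall; a multi-GPU run or gradient accumulation would divide these figures accordingly.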
part_d_hitl_finetune.py ADDED
@@ -0,0 +1,456 @@
+
+ # PART D — Targeted Human Review & Final Integration
+
+ # 1. IMPORTS
+ import os
+ import sys
+ import numpy as np
+ import pandas as pd
+ from datasets import Dataset
+ from transformers import (
+     AutoTokenizer,
+     AutoModelForSequenceClassification,
+     TrainingArguments,
+     Trainer,
+ )
+ from sklearn.metrics import classification_report, accuracy_score, precision_recall_fscore_support
+
+ # 2. PARAMETERS
+ MODEL_NAME = "AI-Growth-Lab/PatentSBERTa"
+ MAX_SEQ_LEN = 256
+ NUM_EPOCHS = 3
+ LEARNING_RATE = 2e-5
+ BATCH_SIZE = 16
+ WEIGHT_DECAY = 0.01
+ LOGGING_STEPS = 50
+ RANDOM_SEED = 42
+
+ MAS_CSV = "mas_labeled_100.csv"
+ HITL_GOLD_CSV = "hitl_green_100_gold_partd.csv"
+ PARQUET_FILE = "patents_50k_green.parquet"
+ PARQUET_GOLD_FILE = "patents_50k_green_with_gold_partd.parquet"
+ FINETUNED_DIR = "./patent_sberta_finetuned_partd"
+
+
+ # 3. STEP 1 — EXCEPTION-BASED HITL
+
+ def run_exception_hitl():
+     """
+     Exception-Based Human-in-the-Loop:
+     - Auto-accept Judge decisions where confidence is high/medium
+     - Flag low/error cases for human review
+     """
+     print("=" * 60)
+     print("PART D — STEP 1: Exception-Based Human-in-the-Loop")
+     print("=" * 60)
+
+     if not os.path.exists(MAS_CSV):
+         print(f"❌ '{MAS_CSV}' not found. Run multi_agent_labeling.py first.")
+         sys.exit(1)
+
+     df = pd.read_csv(MAS_CSV)
+
+     # Convert is_green_gold to numeric (may have empty strings)
+     df["is_green_gold"] = pd.to_numeric(df["is_green_gold"], errors="coerce")
+
+     # Check if all gold labels are already filled
+     if df["is_green_gold"].notna().sum() == len(df):
+         print(f"All {len(df)} gold labels already present. ✓")
+         print(f" Green (1) : {(df['is_green_gold'] == 1).sum()}")
+         print(f" Non-Green (0) : {(df['is_green_gold'] == 0).sum()}")
+
+         # Identify which were auto-accepted vs human-reviewed
+         auto_mask = df["needs_human_review"] == 0
+         human_mask = df["needs_human_review"] == 1
+
+         print(f"\n Auto-accepted (Judge decision) : {auto_mask.sum()}")
+         print(f" Human-reviewed (low/error conf) : {human_mask.sum()}")
+         return True
+
+     # Auto-accept high/medium confidence Judge decisions
+     auto_mask = (
+         (df["judge_confidence"].isin(["high", "medium"])) &
+         (df["judge_label"].isin([0, 1])) &
+         (df["is_green_gold"].isna())
+     )
+     df.loc[auto_mask, "is_green_gold"] = df.loc[auto_mask, "judge_label"]
+     df.loc[auto_mask, "human_notes"] = "Auto-accepted (Judge confidence: " + df.loc[auto_mask, "judge_confidence"] + ")"
+
+     auto_accepted = auto_mask.sum()
+     needs_review = df["is_green_gold"].isna().sum()
+
+     print(f"\n Auto-accepted (high/medium confidence) : {auto_accepted}")
+     print(f" Needs human review (low/error) : {needs_review}")
+     print(f" Total : {len(df)}")
+
+     df.to_csv(MAS_CSV, index=False)
+
+     if needs_review > 0:
+         review_rows = df[df["is_green_gold"].isna()]
+         print(f"\n{'─' * 60}")
+         print(f"HUMAN REVIEW NEEDED for {needs_review} claims:")
+         print(f"{'─' * 60}")
+
+         for idx, (_, row) in enumerate(review_rows.iterrows(), 1):
+             print(f"\n [{idx}/{needs_review}] doc_id: {row['doc_id']}")
+             print(f" Judge said : {row['judge_label']} ({row['judge_confidence']})")
+             print(f" Advocate : {str(row['advocate_argument'])[:120]}...")
+             print(f" Skeptic : {str(row['skeptic_argument'])[:120]}...")
+             print(f" Claim : {str(row['text'])[:150]}...")
+
+         print(f"""
+
+ INSTRUCTIONS:
+ 1. Open '{MAS_CSV}'
+ 2. Find rows where is_green_gold is EMPTY
+ 3. Read the claim + agent arguments
+ 4. Set is_green_gold = 0 or 1
+ 5. Save and re-run: python part_d_hitl_finetune.py
+
+ """)
+         return False
+
+     print("\nAll labels complete. ✓")
+     return True
+
+ # 4. STEP 2 — DISAGREEMENT REPORT
+
+ def generate_report():
+     """Generate disagreement report for README."""
+     print("\n" + "=" * 60)
+     print("PART D — STEP 2: Disagreement Report")
+     print("=" * 60)
+
+     df = pd.read_csv(MAS_CSV)
+     df["is_green_gold"] = pd.to_numeric(df["is_green_gold"], errors="coerce").astype(int)
+     df["judge_label"] = pd.to_numeric(df["judge_label"], errors="coerce")
+
+     total = len(df)
+     human_reviewed = (df["needs_human_review"] == 1).sum()
+     auto_accepted = total - human_reviewed
+
+     # Cases where human overrode the judge
+     valid = df[df["judge_label"].isin([0, 1])].copy()
+     overrides = 0
+     if len(valid) > 0:
+         overrides = (valid["judge_label"].astype(int) != valid["is_green_gold"].astype(int)).sum()
+
+     print(f"\n{'─' * 50}")
+     print(f"DISAGREEMENT SUMMARY")
+     print(f"{'─' * 50}")
+     print(f"Total claims labeled : {total}")
+     print(f"Auto-accepted (Judge decision) : {auto_accepted}")
+     print(f"Required human intervention : {human_reviewed}")
+     print(f"Human overrode Judge : {overrides}")
+
+     print(f"\nJudge confidence distribution:")
+     print(df["judge_confidence"].value_counts().to_string())
+
+     print(f"\nFinal gold label distribution:")
+     print(f" Green (1) : {(df['is_green_gold'] == 1).sum()}")
+     print(f" Non-Green (0) : {(df['is_green_gold'] == 0).sum()}")
+
+     # README block
+     print(f"\n{'═' * 60}")
+     print("FOR YOUR REPORT / README:")
+     print(f"{'═' * 60}")
+     print(f"""
+ ## Part D — Disagreement Report
+
+ The Multi-Agent System (Advocate, Skeptic, Judge) labeled all {total} high-risk
+ patent claims selected via uncertainty sampling.
+
+ **Agent Setup:**
+ - **Advocate** (Local QLoRA fine-tuned Mistral-7B): Argued FOR green classification
+ - **Skeptic** (Groq Llama-3.1-8B): Argued AGAINST green classification
+ - **Judge** (Groq Llama-3.1-8B): Weighed both arguments, produced final verdict
+
+ | Metric | Value |
+ |--------|-------|
+ | Total claims | {total} |
+ | Auto-accepted (high/medium confidence) | {auto_accepted} |
+ | Required human intervention | {human_reviewed} |
+ | Human overrode Judge | {overrides} |
+ | Final Green labels | {(df['is_green_gold'] == 1).sum()} |
+ | Final Non-Green labels | {(df['is_green_gold'] == 0).sum()} |
+
+ The agents disagreed (low confidence / deadlock) on **{human_reviewed} out of {total} claims**.
+ {"For these cases, the human reviewer read the AI rationale and provided final judgment." if human_reviewed > 0 else "All claims reached consensus — no human intervention was required."}
+ The remaining {auto_accepted} claims were auto-accepted based on the Judge's
+ high/medium confidence decision.
+ """)
+
+     # Export clean gold CSV
+     gold_export = df[["doc_id", "text", "p_green", "u",
+                       "advocate_argument", "skeptic_argument",
+                       "judge_label", "judge_confidence", "judge_rationale",
+                       "needs_human_review", "is_green_gold", "human_notes"]].copy()
+     gold_export.to_csv(HITL_GOLD_CSV, index=False)
+     print(f"Gold labels exported to: {HITL_GOLD_CSV}")
+
+
+ # 5. STEP 3 — MERGE GOLD LABELS INTO MAIN DATASET
+
+ def merge_gold_labels():
+     """Merge 100 gold labels into the main parquet dataset."""
+     print("\n" + "=" * 60)
+     print("PART D — STEP 3: Merging Gold Labels into Dataset")
+     print("=" * 60)
+
+     if not os.path.exists(PARQUET_FILE):
+         print(f"❌ '{PARQUET_FILE}' not found.")
+         sys.exit(1)
+
+     main_df = pd.read_parquet(PARQUET_FILE)
+     gold_df = pd.read_csv(HITL_GOLD_CSV)
+     gold_df["is_green_gold"] = pd.to_numeric(gold_df["is_green_gold"], errors="coerce").astype(int)
+
+     gold_labels = gold_df[["doc_id", "is_green_gold"]].copy()
+
+     # Ensure matching types
+     main_df["doc_id"] = main_df["doc_id"].astype(str)
+     gold_labels["doc_id"] = gold_labels["doc_id"].astype(str)
+
+     # Drop existing gold column if present
+     if "is_green_gold" in main_df.columns:
+         main_df = main_df.drop(columns=["is_green_gold"])
+
+     main_df = main_df.merge(gold_labels, on="doc_id", how="left")
+
+     # Create final label: gold overrides silver where available
+     main_df["is_green_final"] = main_df["is_green_silver"]
+     gold_mask = main_df["is_green_gold"].notna()
+     main_df.loc[gold_mask, "is_green_final"] = main_df.loc[gold_mask, "is_green_gold"].astype(int)
+     main_df["is_green_final"] = main_df["is_green_final"].astype(int)
+
+     main_df.to_parquet(PARQUET_GOLD_FILE, index=False)
+
+     print(f"Main dataset rows : {len(main_df):,}")
+     print(f"Gold labels merged : {gold_mask.sum()}")
+     print(f"Splits:")
+     print(main_df["split"].value_counts().to_string())
+     print(f"\nSaved: {PARQUET_GOLD_FILE}")
+
+
+ # 6. STEP 4 — FINE-TUNE PATENTSBERTA
+
+ def finetune_patentsberta():
+     """Fine-tune PatentSBERTa on train_silver + gold_100."""
+     print("\n" + "=" * 60)
+     print("PART D — STEP 4: Fine-Tuning PatentSBERTa")
+     print("=" * 60)
+
+     df = pd.read_parquet(PARQUET_GOLD_FILE)
+
+     # Build training set: train_silver + gold_100
+     train_silver = df[df["split"] == "train_silver"].copy()
+     gold_100 = df[df["is_green_gold"].notna()].copy()
+
+     # Combine and deduplicate
+     train_combined = pd.concat([train_silver, gold_100]).drop_duplicates(
+         subset="doc_id"
+     ).reset_index(drop=True)
+
+     # Use is_green_final as label (gold overrides silver)
+     train_data = train_combined[["text", "is_green_final"]].rename(
+         columns={"is_green_final": "label"}
+     )
+
+     # Eval set: eval_silver
+     eval_data = df[df["split"] == "eval_silver"][["text", "is_green_final"]].rename(
+         columns={"is_green_final": "label"}
+     )
+
+     # Gold test set: gold_100
+     # Note: these rows are also part of train_combined, so gold_100 metrics are in-sample.
+     gold_data = df[df["is_green_gold"].notna()][["text", "is_green_final"]].rename(
+         columns={"is_green_final": "label"}
+     )
+
+     print(f"Training set (silver + gold) : {len(train_data):,}")
+     print(f" - from train_silver : {len(train_silver):,}")
+     print(f" - from gold_100 : {len(gold_100)}")
+     print(f" - after dedup : {len(train_data):,}")
+     print(f"Eval set (eval_silver) : {len(eval_data):,}")
+     print(f"Gold test set (gold_100) : {len(gold_data)}")
+
+     print(f"\nLabel distribution in training set:")
+     print(f" Green (1) : {(train_data['label'] == 1).sum():,}")
+     print(f" Non-Green (0) : {(train_data['label'] == 0).sum():,}")
+
+     # Convert to HuggingFace datasets
+     train_dataset = Dataset.from_pandas(train_data.reset_index(drop=True))
+     eval_dataset = Dataset.from_pandas(eval_data.reset_index(drop=True))
+     gold_dataset = Dataset.from_pandas(gold_data.reset_index(drop=True))
+
+     # Load PatentSBERTa
+     print(f"\nLoading {MODEL_NAME}...")
+     tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+     model = AutoModelForSequenceClassification.from_pretrained(
+         MODEL_NAME, num_labels=2
+     )
+     print("Model loaded.")
+
+     # Tokenize
+     def tokenize_fn(batch):
+         return tokenizer(
+             batch["text"],
+             padding="max_length",
+             truncation=True,
+             max_length=MAX_SEQ_LEN,
+         )
+
+     print("Tokenizing datasets...")
+     train_dataset = train_dataset.map(tokenize_fn, batched=True)
+     eval_dataset = eval_dataset.map(tokenize_fn, batched=True)
+     gold_dataset = gold_dataset.map(tokenize_fn, batched=True)
+
+     for ds in [train_dataset, eval_dataset, gold_dataset]:
+         ds.set_format("torch", columns=["input_ids", "attention_mask", "label"])
+
+     # Metrics
+     def compute_metrics(eval_pred):
+         logits, labels = eval_pred
+         preds = np.argmax(logits, axis=-1)
+         precision, recall, f1, _ = precision_recall_fscore_support(
+             labels, preds, average="binary"
+         )
+         acc = accuracy_score(labels, preds)
+         return {
+             "accuracy": acc,
+             "precision": precision,
+             "recall": recall,
+             "f1": f1,
+         }
+
+     # Training
+     training_args = TrainingArguments(
+         output_dir="./patent_sberta_checkpoints",
+         num_train_epochs=NUM_EPOCHS,
+         learning_rate=LEARNING_RATE,
+         per_device_train_batch_size=BATCH_SIZE,
+         per_device_eval_batch_size=BATCH_SIZE,
+         weight_decay=WEIGHT_DECAY,
+         eval_strategy="epoch",
+         save_strategy="epoch",
+         load_best_model_at_end=True,
+         metric_for_best_model="f1",
+         logging_steps=LOGGING_STEPS,
+         report_to="none",
+         seed=RANDOM_SEED,
+     )
+
+     trainer = Trainer(
+         model=model,
+         args=training_args,
+         train_dataset=train_dataset,
+         eval_dataset=eval_dataset,
+         compute_metrics=compute_metrics,
+     )
+
+     print(f"\n{'─' * 50}")
+     print(f"Training PatentSBERTa")
+     print(f"{'─' * 50}")
+     print(f" Epochs : {NUM_EPOCHS}")
+     print(f" Learning rate : {LEARNING_RATE}")
+     print(f" Batch size : {BATCH_SIZE}")
+     print(f" Train examples : {len(train_dataset):,}")
+     print()
+
+     trainer.train()
+
+     # Evaluate on eval_silver
+     print(f"\n{'─' * 50}")
+     print(f"Evaluation on eval_silver ({len(eval_dataset):,} examples)")
+     print(f"{'─' * 50}")
+     eval_results = trainer.evaluate(eval_dataset)
+     for k, v in sorted(eval_results.items()):
+         if isinstance(v, float):
+             print(f" {k:<25} {v:.4f}")
+
+     # Evaluate on gold_100
+     print(f"\n{'─' * 50}")
+     print(f"Evaluation on gold_100 ({len(gold_dataset)} examples)")
+     print(f"{'─' * 50}")
+     gold_results = trainer.evaluate(gold_dataset)
+     for k, v in sorted(gold_results.items()):
+         if isinstance(v, float):
+             print(f" {k:<25} {v:.4f}")
+
+     # Classification report on gold_100
+     print(f"\nClassification Report (gold_100):")
+     gold_pred_output = trainer.predict(gold_dataset)
+     gold_preds = np.argmax(gold_pred_output.predictions, axis=-1)
+     gold_labels = gold_pred_output.label_ids
+     print(classification_report(
+         gold_labels, gold_preds,
+         target_names=["Non-Green (0)", "Green (1)"],
+         digits=4,
+     ))
+
+     # Save model
+     trainer.save_model(FINETUNED_DIR)
+     tokenizer.save_pretrained(FINETUNED_DIR)
+     print(f"Model saved to: {FINETUNED_DIR}/")
+
+     # Print results for README
+     print(f"\n{'═' * 60}")
+     print("FOR YOUR REPORT / README:")
+     print(f"{'═' * 60}")
+     print(f"""
+ ## Part D — PatentSBERTa Fine-Tuning Results
+
+ **Model:** {MODEL_NAME}
+ **Training data:** train_silver ({len(train_silver):,}) + gold_100 ({len(gold_100)}) = {len(train_data):,} examples
+ **Epochs:** {NUM_EPOCHS} | **LR:** {LEARNING_RATE} | **Batch:** {BATCH_SIZE}
+
+ ### Eval Silver Results
+ | Metric | Score |
+ |-----------|-------|
+ | Accuracy | {eval_results.get('eval_accuracy', 0):.4f} |
+ | Precision | {eval_results.get('eval_precision', 0):.4f} |
+ | Recall | {eval_results.get('eval_recall', 0):.4f} |
+ | F1 | {eval_results.get('eval_f1', 0):.4f} |
+
+ ### Gold 100 Results
+ | Metric | Score |
+ |-----------|-------|
+ | Accuracy | {gold_results.get('eval_accuracy', 0):.4f} |
+ | Precision | {gold_results.get('eval_precision', 0):.4f} |
+ | Recall | {gold_results.get('eval_recall', 0):.4f} |
+ | F1 | {gold_results.get('eval_f1', 0):.4f} |
+ """)
+
+
+ # 7. MAIN
+
+ def main():
+     print("╔══════════════════════════════════════════════════════════╗")
+     print("║ PART D — Targeted Human Review & Final Integration ║")
+     print("╚══════════════════════════════════════════════════════════╝\n")
+
+     # Step 1: Exception-based HITL
+     labels_done = run_exception_hitl()
+     if not labels_done:
+         print("Complete the human review and re-run this script.")
+         sys.exit(0)
+
+     # Step 2: Disagreement report
+     generate_report()
+
+     # Step 3: Merge gold labels
+     merge_gold_labels()
+
+     # Step 4: Fine-tune PatentSBERTa
+     finetune_patentsberta()
+
+     print("\n" + "=" * 60)
+     print("✅ PART D COMPLETE")
+     print("=" * 60)
+     print(f"\nFiles created:")
+     print(f" {HITL_GOLD_CSV} — gold labels for 100 claims")
+     print(f" {PARQUET_GOLD_FILE} — merged dataset")
+     print(f" {FINETUNED_DIR}/ — fine-tuned PatentSBERTa")
+
+
+ if __name__ == "__main__":
+     main()
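The exception-based routing rule in Step 1 (auto-accept Judge verdicts at high or medium confidence, send everything else to a human) can be illustrated without pandas. The following is a minimal sketch on plain dictionaries; `route_claims` and the toy rows are hypothetical and only mirror the column names used in the script.

```python
def route_claims(rows):
    """Split judged claims into auto-accepted and needs-review lists."""
    auto_accepted, needs_review = [], []
    for row in rows:
        confident = row.get("judge_confidence") in {"high", "medium"}
        valid_label = row.get("judge_label") in (0, 1)
        if confident and valid_label:
            # Accept the Judge's verdict as the gold label.
            auto_accepted.append({**row, "is_green_gold": row["judge_label"]})
        else:
            # Leave the gold label empty so a human reviewer fills it in.
            needs_review.append({**row, "is_green_gold": None})
    return auto_accepted, needs_review


# Toy rows; the doc_ids and verdicts are made up for illustration.
claims = [
    {"doc_id": "A", "judge_label": 1, "judge_confidence": "high"},
    {"doc_id": "B", "judge_label": 0, "judge_confidence": "low"},
    {"doc_id": "C", "judge_label": None, "judge_confidence": "error"},
]
auto, review = route_claims(claims)
print(len(auto), "auto-accepted;", len(review), "need review")
```

Only claim A is auto-accepted here; B fails the confidence check and C fails both checks, so both land in the review queue.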
prepare_gold.err ADDED
File without changes
prepare_gold.out ADDED
@@ -0,0 +1,7 @@
+ Gold labels set:
+ Green (1) : 1
+ Non-Green (0) : 99
+ Needs review : 0
+
+ The agents agreed on all 100 claims (0 deadlocks).
+ Saved mas_labeled_100.csv with gold labels filled.
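With 1 Green and 99 Non-Green gold labels, accuracy on this set says little on its own. The sketch below works through the arithmetic for a trivial classifier that always predicts Non-Green; `binary_scores` is a hypothetical helper written in plain Python, and only the 1/99 label split is taken from the output above.

```python
def binary_scores(labels, preds):
    """Return (accuracy, precision, recall) for the positive class 1."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    accuracy = correct / len(labels)
    # Define precision/recall as 0.0 when their denominator is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Gold distribution from the log: 1 Green, 99 Non-Green.
labels = [1] + [0] * 99
# Hypothetical baseline that never predicts Green.
always_non_green = [0] * 100

acc, prec, rec = binary_scores(labels, always_non_green)
print(acc, prec, rec)
```

The baseline reaches 0.99 accuracy with zero recall on the Green class, so precision, recall, and F1 for the Green class are the more informative numbers on this set.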
prepare_gold.py ADDED
@@ -0,0 +1,17 @@
+ """Prepare gold labels from MAS results for Part D."""
+ import pandas as pd
+
+ df = pd.read_csv("mas_labeled_100.csv")
+
+ # Auto-accept all Judge decisions as gold (100% high/medium confidence)
+ df["is_green_gold"] = df["judge_label"].astype(int)
+ df["human_notes"] = "Accepted Judge decision (high/medium confidence)"
+
+ print(f"Gold labels set:")
+ print(f" Green (1) : {(df['is_green_gold'] == 1).sum()}")
+ print(f" Non-Green (0) : {(df['is_green_gold'] == 0).sum()}")
+ print(f" Needs review : 0")
+ print(f"\nThe agents agreed on all 100 claims (0 deadlocks).")
+
+ df.to_csv("mas_labeled_100.csv", index=False)
+ print(f"Saved mas_labeled_100.csv with gold labels filled.")
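The script above prints a fixed review count and agreement message rather than computing them. One possible way to derive those figures from the rows is sketched below. It assumes `judge_confidence` takes the values `high`, `medium`, `low`, or `error`, as the Part D script suggests; `summarize_gold` and the sample rows are hypothetical.

```python
from collections import Counter


def summarize_gold(rows):
    """Count label totals and how many rows fall outside high/medium confidence."""
    confidence_counts = Counter(row["judge_confidence"] for row in rows)
    needs_review = sum(
        count for level, count in confidence_counts.items()
        if level not in {"high", "medium"}
    )
    return {
        "total": len(rows),
        "green": sum(1 for row in rows if row["judge_label"] == 1),
        "non_green": sum(1 for row in rows if row["judge_label"] == 0),
        "needs_review": needs_review,
    }


# Made-up rows for illustration only.
sample = [
    {"judge_label": 1, "judge_confidence": "high"},
    {"judge_label": 0, "judge_confidence": "medium"},
    {"judge_label": 0, "judge_confidence": "low"},
]
summary = summarize_gold(sample)
print(summary)
```

Printing `summary["needs_review"]` instead of a literal `0` would keep the message accurate if a later run produced low-confidence verdicts.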
qlora_finetune.err ADDED
@@ -0,0 +1,7 @@
0
  0%| | 0/313 [00:00<?, ?it/s]`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
1
  0%| | 1/313 [00:26<2:19:12, 26.77s/it]
2
  1%| | 2/313 [00:46<1:57:01, 22.58s/it]
3
  1%| | 3/313 [01:06<1:50:23, 21.37s/it]
4
  1%|▏ | 4/313 [01:26<1:47:12, 20.82s/it]
5
  2%|▏ | 5/313 [01:46<1:45:23, 20.53s/it]
6
  2%|▏ | 6/313 [02:06<1:44:13, 20.37s/it]
7
  2%|▏ | 7/313 [02:26<1:43:30, 20.29s/it]
8
  3%|▎ | 8/313 [02:46<1:42:47, 20.22s/it]
9
  3%|▎ | 9/313 [03:06<1:42:13, 20.17s/it]
10
  3%|▎ | 10/313 [03:26<1:41:42, 20.14s/it]
11
  4%|▎ | 11/313 [03:46<1:41:02, 20.07s/it]
12
  4%|▍ | 12/313 [04:06<1:40:35, 20.05s/it]
13
  4%|▍ | 13/313 [04:26<1:40:07, 20.02s/it]
14
  4%|▍ | 14/313 [04:46<1:39:38, 19.99s/it]
15
  5%|▍ | 15/313 [05:06<1:39:11, 19.97s/it]
16
  5%|▌ | 16/313 [05:26<1:38:43, 19.95s/it]
17
  5%|▌ | 17/313 [05:46<1:38:35, 19.98s/it]
18
  6%|▌ | 18/313 [06:06<1:38:10, 19.97s/it]
19
  6%|▌ | 19/313 [06:26<1:37:56, 19.99s/it]
20
  6%|▋ | 20/313 [06:46<1:37:48, 20.03s/it]
21
  7%|▋ | 21/313 [07:06<1:37:31, 20.04s/it]
22
  7%|▋ | 22/313 [07:26<1:37:08, 20.03s/it]
23
  7%|▋ | 23/313 [07:46<1:36:52, 20.04s/it]
24
  8%|▊ | 24/313 [08:06<1:36:39, 20.07s/it]
25
  8%|▊ | 25/313 [08:26<1:36:16, 20.06s/it]
26
  8%|▊ | 26/313 [08:46<1:35:59, 20.07s/it]
27
  9%|▊ | 27/313 [09:06<1:35:33, 20.05s/it]
28
  9%|▉ | 28/313 [09:26<1:35:08, 20.03s/it]
29
  9%|▉ | 29/313 [09:47<1:34:55, 20.06s/it]
30
  10%|▉ | 30/313 [10:07<1:34:36, 20.06s/it]
31
  10%|▉ | 31/313 [10:27<1:34:09, 20.03s/it]
32
  10%|█ | 32/313 [10:46<1:33:41, 20.01s/it]
33
  11%|█ | 33/313 [11:07<1:33:30, 20.04s/it]
34
  11%|█ | 34/313 [11:27<1:33:05, 20.02s/it]
35
  11%|█ | 35/313 [11:47<1:32:45, 20.02s/it]
36
  12%|█▏ | 36/313 [12:07<1:32:26, 20.02s/it]
37
  12%|█▏ | 37/313 [12:27<1:32:12, 20.04s/it]
38
  12%|█▏ | 38/313 [12:47<1:31:51, 20.04s/it]
39
  12%|█▏ | 39/313 [13:07<1:31:26, 20.02s/it]
40
  13%|█▎ | 40/313 [13:27<1:31:02, 20.01s/it]
41
  13%|█▎ | 41/313 [13:47<1:30:48, 20.03s/it]
42
  13%|█▎ | 42/313 [14:07<1:30:22, 20.01s/it]
43
  14%|█▎ | 43/313 [14:27<1:30:04, 20.02s/it]
44
  14%|█▍ | 44/313 [14:47<1:29:42, 20.01s/it]
45
  14%|█▍ | 45/313 [15:07<1:29:22, 20.01s/it]
46
  15%|█▍ | 46/313 [15:27<1:28:58, 19.99s/it]
47
  15%|█▌ | 47/313 [15:47<1:28:41, 20.00s/it]
48
  15%|█▌ | 48/313 [16:07<1:28:24, 20.02s/it]
49
  16%|█▌ | 49/313 [16:27<1:28:08, 20.03s/it]
50
  16%|█▌ | 50/313 [16:47<1:27:45, 20.02s/it]
51
 
52
  16%|█▌ | 50/313 [16:47<1:27:45, 20.02s/it]
53
  16%|█▋ | 51/313 [17:07<1:27:20, 20.00s/it]
54
  17%|█▋ | 52/313 [17:27<1:27:03, 20.01s/it]
55
  17%|█▋ | 53/313 [17:47<1:26:48, 20.03s/it]
56
  17%|█▋ | 54/313 [18:07<1:26:24, 20.02s/it]
57
  18%|█▊ | 55/313 [18:27<1:26:02, 20.01s/it]
58
  18%|█▊ | 56/313 [18:47<1:25:38, 19.99s/it]
59
  18%|█▊ | 57/313 [19:07<1:25:27, 20.03s/it]
60
  19%|█▊ | 58/313 [19:27<1:25:02, 20.01s/it]
61
  19%|█▉ | 59/313 [19:47<1:24:54, 20.06s/it]
62
  19%|█▉ | 60/313 [20:07<1:24:36, 20.07s/it]
63
  19%|█▉ | 61/313 [20:27<1:24:13, 20.05s/it]
64
  20%|█▉ | 62/313 [20:47<1:23:45, 20.02s/it]
65
  20%|██ | 63/313 [21:07<1:23:23, 20.01s/it]
66
  20%|██ | 64/313 [21:27<1:23:04, 20.02s/it]
67
  21%|██ | 65/313 [21:47<1:22:45, 20.02s/it]
68
  21%|██ | 66/313 [22:07<1:22:21, 20.01s/it]
69
  21%|██▏ | 67/313 [22:27<1:22:06, 20.03s/it]
70
  22%|██▏ | 68/313 [22:47<1:21:42, 20.01s/it]
71
  22%|██▏ | 69/313 [23:07<1:21:20, 20.00s/it]
72
  22%|██▏ | 70/313 [23:27<1:21:03, 20.01s/it]
73
  23%|██▎ | 71/313 [23:47<1:20:42, 20.01s/it]
74
  23%|██▎ | 72/313 [24:07<1:20:21, 20.01s/it]
75
  23%|██▎ | 73/313 [24:27<1:20:00, 20.00s/it]
76
  24%|██▎ | 74/313 [24:47<1:19:40, 20.00s/it]
77
  24%|██▍ | 75/313 [25:07<1:19:19, 20.00s/it]
78
  24%|██▍ | 76/313 [25:27<1:18:59, 20.00s/it]
79
  25%|██▍ | 77/313 [25:47<1:18:43, 20.01s/it]
80
  25%|██▍ | 78/313 [26:07<1:18:24, 20.02s/it]
81
  25%|██▌ | 79/313 [26:27<1:18:09, 20.04s/it]
82
  26%|██▌ | 80/313 [26:47<1:17:47, 20.03s/it]
83
  26%|██▌ | 81/313 [27:07<1:17:26, 20.03s/it]
84
  26%|██▌ | 82/313 [27:28<1:17:05, 20.03s/it]
85
  27%|██▋ | 83/313 [27:47<1:16:42, 20.01s/it]
86
  27%|██▋ | 84/313 [28:07<1:16:19, 20.00s/it]
87
  27%|██▋ | 85/313 [28:27<1:16:01, 20.01s/it]
88
  27%|██▋ | 86/313 [28:47<1:15:39, 20.00s/it]
89
  28%|██▊ | 87/313 [29:07<1:15:21, 20.01s/it]
90
  28%|██▊ | 88/313 [29:28<1:15:02, 20.01s/it]
91
  28%|██▊ | 89/313 [29:48<1:14:43, 20.02s/it]
92
  29%|██▉ | 90/313 [30:08<1:14:22, 20.01s/it]
93
  29%|██▉ | 91/313 [30:28<1:14:03, 20.01s/it]
94
  29%|██▉ | 92/313 [30:48<1:13:44, 20.02s/it]
95
  30%|██▉ | 93/313 [31:08<1:13:25, 20.02s/it]
96
  30%|███ | 94/313 [31:28<1:13:08, 20.04s/it]
97
  30%|███ | 95/313 [31:48<1:12:40, 20.00s/it]
98
  31%|███ | 96/313 [32:08<1:12:25, 20.03s/it]
99
  31%|███ | 97/313 [32:28<1:12:00, 20.00s/it]
100
  31%|███▏ | 98/313 [32:48<1:11:36, 19.98s/it]
101
  32%|███▏ | 99/313 [33:08<1:11:20, 20.00s/it]
102
  32%|███▏ | 100/313 [33:28<1:10:59, 20.00s/it]
103
 
104
  32%|███▏ | 100/313 [33:28<1:10:59, 20.00s/it]
105
  32%|███▏ | 101/313 [33:48<1:10:40, 20.00s/it]
106
  33%|███▎ | 102/313 [34:08<1:10:24, 20.02s/it]
107
  33%|███▎ | 103/313 [34:28<1:10:03, 20.02s/it]
108
  33%|███▎ | 104/313 [34:48<1:09:46, 20.03s/it]
109
  34%|███▎ | 105/313 [35:08<1:09:24, 20.02s/it]
110
  34%|███▍ | 106/313 [35:28<1:09:04, 20.02s/it]
111
  34%|███▍ | 107/313 [35:48<1:08:41, 20.01s/it]
112
  35%|███▍ | 108/313 [36:08<1:08:20, 20.00s/it]
113
  35%|███▍ | 109/313 [36:28<1:08:01, 20.01s/it]
114
  35%|███▌ | 110/313 [36:48<1:07:40, 20.00s/it]
115
  35%|███▌ | 111/313 [37:08<1:07:22, 20.01s/it]
116
  36%|███▌ | 112/313 [37:28<1:06:55, 19.98s/it]
117
  36%|███▌ | 113/313 [37:48<1:06:36, 19.98s/it]
118
  36%|███▋ | 114/313 [38:08<1:06:21, 20.01s/it]
119
  37%|███▋ | 115/313 [38:28<1:06:03, 20.02s/it]
120
  37%|███▋ | 116/313 [38:48<1:05:38, 19.99s/it]
121
  37%|███▋ | 117/313 [39:08<1:05:19, 20.00s/it]
122
  38%|███▊ | 118/313 [39:28<1:05:00, 20.00s/it]
123
  38%|███▊ | 119/313 [39:48<1:04:37, 19.99s/it]
124
  38%|███▊ | 120/313 [40:08<1:04:16, 19.98s/it]
125
  39%|███▊ | 121/313 [40:28<1:04:00, 20.00s/it]
126
  39%|███▉ | 122/313 [40:48<1:03:42, 20.01s/it]
127
  39%|███▉ | 123/313 [41:08<1:03:21, 20.01s/it]
128
  40%|███▉ | 124/313 [41:28<1:03:01, 20.01s/it]
129
  40%|███▉ | 125/313 [41:48<1:02:41, 20.01s/it]
130
  40%|████ | 126/313 [42:08<1:02:26, 20.03s/it]
131
  41%|████ | 127/313 [42:28<1:02:04, 20.02s/it]
132
  41%|████ | 128/313 [42:48<1:01:38, 19.99s/it]
133
  41%|████ | 129/313 [43:08<1:01:17, 19.98s/it]
134
  42%|████▏ | 130/313 [43:28<1:00:59, 19.99s/it]
135
  42%|████▏ | 131/313 [43:48<1:00:38, 19.99s/it]
136
  42%|████▏ | 132/313 [44:08<1:00:22, 20.01s/it]
137
  42%|████▏ | 133/313 [44:28<1:00:01, 20.01s/it]
138
  43%|████▎ | 134/313 [44:48<59:37, 19.99s/it]
139
  43%|████▎ | 135/313 [45:08<59:16, 19.98s/it]
140
  43%|████▎ | 136/313 [45:28<59:06, 20.04s/it]
141
  44%|████▍ | 137/313 [45:48<58:43, 20.02s/it]
142
  44%|████▍ | 138/313 [46:08<58:20, 20.00s/it]
143
  44%|████▍ | 139/313 [46:28<58:00, 20.00s/it]
144
  45%|████▍ | 140/313 [46:48<57:40, 20.00s/it]
145
  45%|████▌ | 141/313 [47:08<57:17, 19.99s/it]
146
  45%|████▌ | 142/313 [47:28<56:58, 19.99s/it]
147
  46%|████▌ | 143/313 [47:48<56:37, 19.99s/it]
148
  46%|████▌ | 144/313 [48:08<56:21, 20.01s/it]
  48%|████▊ | 150/313 [50:08<54:14, 19.97s/it]
  64%|██████▍ | 200/313 [1:06:48<37:37, 19.98s/it]
  80%|███████▉ | 250/313 [1:23:26<20:58, 19.97s/it]
  96%|█████████▌| 300/313 [1:40:04<04:19, 19.96s/it]
  99%|█████████▉| 311/313 [1:43:44<00:39, 19.94s/it]
 
+
+
+ No label_names provided for model class `PeftModelForCausalLM`. Since `PeftModel` hides base models input arguments, if label_names is not given, label_names can't be set automatically within `Trainer`. Note that empty label_names list will be used instead.
+
  0%| | 0/313 [00:00<?, ?it/s]
`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`.
+ /ceph/home/student.aau.dk/jx14ak/myenv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py:1044: UserWarning: torch.utils.checkpoint: the use_reentrant parameter should be passed explicitly. Starting in PyTorch 2.9, calling checkpoint without use_reentrant will raise an exception. use_reentrant=False is recommended, but if you need to preserve the current default behavior, you can pass use_reentrant=True. Refer to docs for more details on the differences between the two variants.
+   return fn(*args, **kwargs)
+
  0%| | 1/313 [00:26<2:19:12, 26.77s/it]
  16%|█▌ | 50/313 [16:47<1:27:45, 20.02s/it]
  32%|███▏ | 100/313 [33:28<1:10:59, 20.00s/it]
  48%|████▊ | 150/313 [50:08<54:14, 19.97s/it]
  64%|██████▍ | 200/313 [1:06:48<37:37, 19.98s/it]
  80%|███████▉ | 250/313 [1:23:26<20:58, 19.97s/it]
  96%|█████████▌| 300/313 [1:40:04<04:19, 19.96s/it]
  99%|█████████▉| 311/313 [1:43:44<00:39, 19.94s/it]
qlora_finetune.out ADDED
@@ -0,0 +1,44 @@
+ ============================================================
+ STEP 1 — Loading and formatting training data
+ ============================================================
+ Training examples: 5,000
+ Formatted 5000 training examples.
+
+ ============================================================
+ STEP 2 — Loading model in 4-bit quantization
+ ============================================================
+ Model loaded in 4-bit.
+
+ ============================================================
+ STEP 3 — Attaching LoRA adapters
+ ============================================================
+ Total parameters : 7,289,966,592
+ Trainable parameters : 41,943,040 (0.58%)
+
+ ============================================================
+ STEP 4 — Tokenizing dataset
+ ============================================================
+ Tokenized 5000 examples, max_length=512
+
+ ============================================================
+ STEP 5 — Fine-tuning with QLoRA
+ ============================================================
+ Training: 5,000 examples, 1 epoch
+ Effective batch size: 16
+
+ {'loss': 1.0607, 'grad_norm': 0.7086756229400635, 'learning_rate': 0.00019193530389822363, 'epoch': 0.16}
+ {'loss': 0.9639, 'grad_norm': 0.5901047587394714, 'learning_rate': 0.00016036076085226814, 'epoch': 0.32}
+ {'loss': 0.9558, 'grad_norm': 0.5561144351959229, 'learning_rate': 0.0001129241134155949, 'epoch': 0.48}
+ {'loss': 0.9428, 'grad_norm': 1.1043578386306763, 'learning_rate': 6.209115961596208e-05, 'epoch': 0.64}
+ {'loss': 0.9361, 'grad_norm': 0.5233080983161926, 'learning_rate': 2.1220207206178688e-05, 'epoch': 0.8}
+ {'loss': 0.9374, 'grad_norm': 0.48741865158081055, 'learning_rate': 1.0516660902673448e-06, 'epoch': 0.96}
+ {'train_runtime': 6254.5575, 'train_samples_per_second': 0.799, 'train_steps_per_second': 0.05, 'train_loss': 0.9650596307870298, 'epoch': 1.0}
+
+ Training complete.
+
+ ============================================================
+ STEP 6 — Saving fine-tuned adapter
+ ============================================================
+ Saved to: ./qlora_patent_model/
+
+ ✅ QLoRA fine-tuning complete!
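The figures in this log can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It recomputes the 313-step total shown in the progress bars, the 41,943,040 trainable LoRA parameters, and the first logged learning rate from the hyperparameters in `qlora_finetune.py`. Two things are assumptions: the Mistral-7B layer shapes (they are not printed anywhere in the log), and the guess that the learning rate logged at step 50 corresponds to scheduler step 49, which is inferred from the numbers alone. All helper names are hypothetical.

```python
import math

# Values taken from the script and the log.
NUM_EXAMPLES = 5000
BATCH_SIZE = 4
GRAD_ACCUM = 4
LORA_R = 16
BASE_LR = 2e-4
WARMUP_RATIO = 0.03

# Assumed Mistral-7B shapes (not printed in the log): 32 decoder layers,
# hidden size 4096, MLP size 14336, and 1024-wide key/value projections.
NUM_LAYERS = 32
PROJ_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}


def optimizer_steps(num_examples, batch_size, grad_accum):
    """Optimizer updates in one epoch on a single device (partial batches count)."""
    batches = math.ceil(num_examples / batch_size)
    return math.ceil(batches / grad_accum)


def lora_param_count(shapes, rank, num_layers):
    """Each adapted (d_in x d_out) matrix adds rank * (d_in + d_out) weights."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


def cosine_lr(step, total_steps, warmup_steps, base_lr):
    """Linear warmup followed by a half-cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = optimizer_steps(NUM_EXAMPLES, BATCH_SIZE, GRAD_ACCUM)
warmup_steps = math.ceil(WARMUP_RATIO * total_steps)

print(total_steps)
print(lora_param_count(PROJ_SHAPES, LORA_R, NUM_LAYERS))
print(cosine_lr(49, total_steps, warmup_steps, BASE_LR))
```

Under these assumptions the step count and the parameter count agree exactly with the log, and the computed learning rate agrees with the first logged value (about 1.92e-4) to within rounding.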
qlora_finetune.py ADDED
@@ -0,0 +1,193 @@
+ # PART C — STEP 1: QLoRA Fine-Tuning on Patent Claims
+ # Uses standard Trainer — no trl dependency
+
+ # 1. IMPORTS
+ import os
+ import torch
+ import pandas as pd
+ from datasets import Dataset
+ from transformers import (
+     AutoTokenizer,
+     AutoModelForCausalLM,
+     BitsAndBytesConfig,
+     TrainingArguments,
+     Trainer,
+     DataCollatorForLanguageModeling,
+ )
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+
+ # 2. PARAMETERS
+ BASE_MODEL = "mistralai/Mistral-7B-Instruct-v0.3"
+ PARQUET_FILE = "patents_50k_green.parquet"
+ OUTPUT_DIR = "./qlora_patent_model"
+ MAX_SEQ_LEN = 512
+ NUM_EPOCHS = 1
+ LEARNING_RATE = 2e-4
+ BATCH_SIZE = 4
+ GRAD_ACCUM = 4
+ LOGGING_STEPS = 50
+ RANDOM_SEED = 42
+ MAX_TRAIN_SAMPLES = 5000
+
+ LORA_R = 16
+ LORA_ALPHA = 32
+ LORA_DROPOUT = 0.05
+
+ # 3. FORMAT TRAINING DATA
+ print("=" * 60)
+ print("STEP 1 — Loading and formatting training data")
+ print("=" * 60)
+
+ df = pd.read_parquet(PARQUET_FILE)
+ train_df = df[df["split"] == "train_silver"].copy()
+
+ if len(train_df) > MAX_TRAIN_SAMPLES:
+     train_df = train_df.sample(n=MAX_TRAIN_SAMPLES, random_state=RANDOM_SEED)
+
+ print(f"Training examples: {len(train_df):,}")
+
+ def format_training_example(row):
+     label = int(row["is_green_silver"])
+     label_word = "green technology" if label == 1 else "not green technology"
+
+     text = f"""### Instruction:
+ You are a patent examiner. Classify this patent claim as green technology (1) or not green technology (0). Green technology includes inventions for reducing emissions, renewable energy, energy efficiency, pollution reduction, or environmental protection. Respond with JSON only.
+
+ ### Claim:
+ {row['text'][:1500]}
+
+ ### Response:
+ {{"label": {label}, "rationale": "This patent claim describes {label_word} based on the technical content of the claim."}}"""
+
+     return {"text": text}
+
+ formatted_data = train_df.apply(format_training_example, axis=1).tolist()
+ train_dataset = Dataset.from_list(formatted_data)
+ print(f"Formatted {len(train_dataset)} training examples.")
+
+ # 4. LOAD MODEL IN 4-BIT
+ print("\n" + "=" * 60)
+ print("STEP 2 — Loading model in 4-bit quantization")
+ print("=" * 60)
+
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.float16,
+     bnb_4bit_use_double_quant=True,
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
+ tokenizer.pad_token = tokenizer.eos_token
+ tokenizer.padding_side = "right"
+
+ model = AutoModelForCausalLM.from_pretrained(
+     BASE_MODEL,
+     quantization_config=bnb_config,
+     device_map="auto",
+     trust_remote_code=True,
+ )
+
+ model = prepare_model_for_kbit_training(model)
+ print("Model loaded in 4-bit.")
+
+ # 5. ATTACH LoRA ADAPTERS
+ print("\n" + "=" * 60)
+ print("STEP 3 — Attaching LoRA adapters")
+ print("=" * 60)
+
+ lora_config = LoraConfig(
+     r=LORA_R,
+     lora_alpha=LORA_ALPHA,
+     lora_dropout=LORA_DROPOUT,
+     bias="none",
+     task_type="CAUSAL_LM",
+     target_modules=[
+         "q_proj", "k_proj", "v_proj", "o_proj",
+         "gate_proj", "up_proj", "down_proj",
+     ],
+ )
+
+ model = get_peft_model(model, lora_config)
+
+ trainable, total = model.get_nb_trainable_parameters()
+ print(f"Total parameters : {total:,}")
+ print(f"Trainable parameters : {trainable:,} ({100 * trainable / total:.2f}%)")
+
+ # 6. TOKENIZE DATASET
+ print("\n" + "=" * 60)
+ print("STEP 4 — Tokenizing dataset")
+ print("=" * 60)
+
+ def tokenize_function(examples):
+     tokens = tokenizer(
+         examples["text"],
+         truncation=True,
+         max_length=MAX_SEQ_LEN,
+         padding="max_length",
+     )
+     # For causal language modeling, labels = input_ids
+     # The model learns to predict the next token at each position
+     tokens["labels"] = tokens["input_ids"].copy()
+     return tokens
+
+ tokenized_dataset = train_dataset.map(
+     tokenize_function,
+     batched=True,
+     remove_columns=["text"],
+ )
+ tokenized_dataset.set_format("torch")
+ print(f"Tokenized {len(tokenized_dataset)} examples, max_length={MAX_SEQ_LEN}")
+
+ # 7. TRAIN
+ print("\n" + "=" * 60)
+ print("STEP 5 — Fine-tuning with QLoRA")
+ print("=" * 60)
+
+ training_args = TrainingArguments(
+     output_dir="./qlora_checkpoints",
+     num_train_epochs=NUM_EPOCHS,
+     per_device_train_batch_size=BATCH_SIZE,
+     gradient_accumulation_steps=GRAD_ACCUM,
+     learning_rate=LEARNING_RATE,
+     weight_decay=0.01,
+     logging_steps=LOGGING_STEPS,
+     save_strategy="no",
+     bf16=torch.cuda.is_bf16_supported(),
+     fp16=not torch.cuda.is_bf16_supported(),
+     optim="paged_adamw_8bit",
+     warmup_ratio=0.03,
+     lr_scheduler_type="cosine",
+     report_to="none",
+     seed=RANDOM_SEED,
+ )
+
+ data_collator = DataCollatorForLanguageModeling(
+     tokenizer=tokenizer,
+     mlm=False,  # False = causal LM (predict next token, not masked)
+ )
+
+ trainer = Trainer(
+     model=model,
+     args=training_args,
+     train_dataset=tokenized_dataset,
+     data_collator=data_collator,
+ )
+
+ print(f"Training: {len(tokenized_dataset):,} examples, {NUM_EPOCHS} epoch")
+ print(f"Effective batch size: {BATCH_SIZE * GRAD_ACCUM}")
+ print()
+
+ trainer.train()
+ print("\nTraining complete.")
+
+ # 8. SAVE ADAPTER
+ print("\n" + "=" * 60)
+ print("STEP 6 — Saving fine-tuned adapter")
+ print("=" * 60)
+
+ model.save_pretrained(OUTPUT_DIR)
+ tokenizer.save_pretrained(OUTPUT_DIR)
+ print(f"Saved to: {OUTPUT_DIR}/")
+
+ print("\n✅ QLoRA fine-tuning complete!")
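The script trains on a fixed `### Instruction:` / `### Claim:` / `### Response:` layout, so prompts at inference time presumably need the same layout, cut off right after `### Response:`. The following is a minimal sketch of that idea, assuming the fine-tuned model answers with a flat JSON object as in the training examples. `build_prompt` and `parse_label` are hypothetical helpers that are not part of the commit, and the model generation call itself is left out.

```python
import json
import re

# Instruction text copied from format_training_example in the script.
INSTRUCTION = (
    "You are a patent examiner. Classify this patent claim as green technology (1) "
    "or not green technology (0). Green technology includes inventions for reducing "
    "emissions, renewable energy, energy efficiency, pollution reduction, or "
    "environmental protection. Respond with JSON only."
)


def build_prompt(claim_text, max_chars=1500):
    """Recreate the training layout, stopping right after '### Response:'."""
    return (
        f"### Instruction:\n{INSTRUCTION}\n\n"
        f"### Claim:\n{claim_text[:max_chars]}\n\n"
        "### Response:\n"
    )


def parse_label(generated_text):
    """Return 0 or 1 from the first flat JSON object in the text, else None."""
    match = re.search(r"\{.*?\}", generated_text, flags=re.DOTALL)
    if match is None:
        return None
    try:
        label = json.loads(match.group(0)).get("label")
    except json.JSONDecodeError:
        return None
    # Reject booleans and anything other than the two class ids.
    if isinstance(label, bool) or label not in (0, 1):
        return None
    return int(label)


prompt = build_prompt("A wind turbine blade with a serrated trailing edge.")
print(prompt.endswith("### Response:\n"))
print(parse_label('{"label": 1, "rationale": "Describes renewable energy."}'))
```

The non-greedy pattern only handles a flat object, so a rationale that itself contains `}` would make `parse_label` return `None`; treating that case as "unparsed" rather than guessing a label seems the safer default.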