Ory999 committed
Commit fe18ce2 · verified · 1 Parent(s): ce28ff5

Update README.md

Files changed (1)
  1. README.md +14 -24
README.md CHANGED
@@ -9,30 +9,26 @@ This is the final assignment it synthesizes Assignments 2 and 3 into a data labe
 ## Pipeline Architecture

 patents_50k_green.parquet
- │
- ▼
 [Part A & B] Baseline PatentSBERTa + Uncertainty Sampling
- │ → Top 100 high-risk claims (u ≈ 1.0)
- ▼
 [Part C – Step 1] QLoRA Fine-tuning on Colab (Qwen3-8B, 4-bit, 3 epochs)
- │ → qlora_green_patent_adapter (LoRA weights)
- │ → Qwen3-8B.Q4_K_M.gguf (served via LM Studio)
- ▼
 [Part C – Step 2] Multi-Agent System (CrewAI)
- │ Agent 1 – Advocate (Qwen3-4B, argues for green: Advocator)
- │ Agent 2 – Skeptic (Qwen3-4B, argues against green: Skeptic)
- │ Agent 3 – Judge (QLoRA Qwen3-8B, final verdict: Judge)
- ▼
 [Part D] Exception-Based HITL (only deadlocks / low-confidence)
- │ → 26 claims reviewed with deadlock, 3 human overrides
- │ → hitl_green_100_final.csv (gold labels)
- ▼
- [Part D] Final PatentSBERTa Fine-tuning on gold dataset
- → patentsberta_finetuned_final/
-

- ---

 ## Part C – Step 1: QLoRA Domain Adaptation
 
@@ -55,7 +51,6 @@ The generative LLM fine-tuning was performed on Google Colab (T4, 15 GB VRAM) us

 The fine-tuned adapter was exported to GGUF Q4_K_M format (4.682 GB) and served locally via LM Studio for use in the MAS.

- ---

 ## Part C – Step 2: Multi-Agent System
 
@@ -69,7 +64,6 @@ Three agents collaborate to label each of the 100 high-risk patent claims using

 Each claim produces a structured JSON output: `classification` (0/1), `confidence` (Low/Medium/High), and `rationale`.

- ---

 ## Part D: Targeted HITL & Final Fine-tuning
 
@@ -85,7 +79,6 @@ Each claim produces a structured JSON output: `classification` (0/1), `confidenc

 The gold-labelled dataset (`hitl_green_100_final.csv`) was used to fine-tune PatentSBERTa for 3 epochs using CosineSimilarityLoss on an AMD Radeon RX 9070 XT via DirectML (fell back to CPU, completed in ~31 minutes).

- ---

 ## Performance Results
 
@@ -98,7 +91,6 @@ The gold-labelled dataset (`hitl_green_100_final.csv`) was used to fine-tune Pat

 The Final Model achieves the highest F1 score across all iterations, demonstrating that QLoRA domain adaptation combined with structured agent debate and targeted human review produces measurable improvements.

- ---

 ## Key Findings
 
@@ -117,7 +109,6 @@ The Final Model achieves the highest F1 score across all iterations, demonstrati
 - torchao 0.16.0 conflicts with transformers lazy loader in certain environments
 - LM Studio local inference adds latency (~3–5s per agent call)

- ---

 ## Repository Contents
 
@@ -130,7 +121,6 @@ The Final Model achieves the highest F1 score across all iterations, demonstrati
 | `qlora_outputs.zip` | QLoRA adapter weights (`qlora_green_patent_adapter/`) |
 | `Part C Step 1.ipynb` | Colab notebook for QLoRA fine-tuning |

- ---

 ## Related Repositories
 
 
## Pipeline Architecture

patents_50k_green.parquet

[Part A & B] Baseline PatentSBERTa + Uncertainty Sampling
- Top 100 high-risk claims (u ≈ 1.0)

[Part C – Step 1] QLoRA Fine-tuning on Colab (Qwen3-8B, 4-bit, 3 epochs)
- qlora_green_patent_adapter (LoRA weights)
- Qwen3-8B.Q4_K_M.gguf (served via LM Studio)

[Part C – Step 2] Multi-Agent System (CrewAI)
- Agent 1 – Advocate (Qwen3-4B, argues for green: Advocator)
- Agent 2 – Skeptic (Qwen3-4B, argues against green: Skeptic)
- Agent 3 – Judge (QLoRA Qwen3-8B, final verdict: Judge)

[Part D] Exception-Based HITL (only deadlocks / low-confidence)
- 26 claims reviewed with deadlock, 3 human overrides
- hitl_green_100_final.csv (gold labels)

[Part D] Final PatentSBERTa Fine-tuning on gold dataset
- patentsberta_finetuned_final/
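A note on the uncertainty-sampling step in the diagram: the README does not define u, so the sketch below assumes the common least-confidence form for a binary classifier, where u reaches 1.0 exactly at the decision boundary. All names here are illustrative, not the repository's actual code.

```python
# Sketch of uncertainty sampling, assuming u = 1 - 2*|P(green) - 0.5|,
# so u = 1.0 exactly when the model is maximally unsure (P = 0.5).
def uncertainty(p_green: float) -> float:
    """Uncertainty in [0, 1]; highest when the model sits at P = 0.5."""
    return 1.0 - 2.0 * abs(p_green - 0.5)

def top_k_uncertain(probs: dict, k: int) -> list:
    """Return the ids of the k claims the model is least sure about."""
    return sorted(probs, key=lambda cid: uncertainty(probs[cid]), reverse=True)[:k]

probs = {"claim_a": 0.51, "claim_b": 0.97, "claim_c": 0.03, "claim_d": 0.48}
picked = top_k_uncertain(probs, k=2)  # the two claims nearest P = 0.5
```

Under this definition, the "Top 100 high-risk claims" are simply the 100 claims whose predicted probability lies closest to 0.5.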
 
 
## Part C – Step 1: QLoRA Domain Adaptation
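As a rough orientation, the Step 1 run can be summarised in a plain config dict. Only the base model, 4-bit quantisation, and 3 epochs are stated in this README; the remaining field names follow common peft/bitsandbytes conventions, and their values are illustrative assumptions, not the repository's actual settings.

```python
# Illustrative QLoRA run configuration (a sketch, not the actual settings).
qlora_config = {
    "base_model": "Qwen3-8B",      # stated in the README
    "load_in_4bit": True,          # stated in the README
    "bnb_4bit_quant_type": "nf4",  # assumed; the typical QLoRA choice
    "lora_r": 16,                  # assumed rank
    "lora_alpha": 32,              # assumed scaling factor
    "lora_dropout": 0.05,          # assumed
    "num_train_epochs": 3,         # stated in the README
}
```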
 
 
The fine-tuned adapter was exported to GGUF Q4_K_M format (4.682 GB) and served locally via LM Studio for use in the MAS.
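Serving the GGUF file through LM Studio means the agents talk to an OpenAI-compatible chat endpoint (by default at http://localhost:1234/v1/chat/completions). A minimal sketch of building one such request body; the prompts and temperature are illustrative assumptions, and no network call is made here.

```python
import json

def build_chat_request(model: str, system: str, claim: str) -> str:
    """Build the JSON body for one chat-completion call to a local server."""
    body = {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": claim},
        ],
        "temperature": 0.2,  # assumed; kept low for labelling consistency
    }
    return json.dumps(body)

payload = build_chat_request(
    "Qwen3-8B.Q4_K_M.gguf",
    "You are the Judge agent. Answer with a JSON verdict.",  # illustrative prompt
    "A method for recovering lithium from spent battery cathodes.",  # illustrative claim
)
```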
 
 
## Part C – Step 2: Multi-Agent System
 
 
Each claim produces a structured JSON output: `classification` (0/1), `confidence` (Low/Medium/High), and `rationale`.
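This structured output is easy to gate with a small validator. The sketch below checks only the three fields named above; it is not the pipeline's actual schema handling, and the example verdict is illustrative.

```python
import json

def parse_verdict(raw: str) -> dict:
    """Parse and validate one agent verdict against the documented schema."""
    v = json.loads(raw)
    if v.get("classification") not in (0, 1):
        raise ValueError("classification must be 0 or 1")
    if v.get("confidence") not in ("Low", "Medium", "High"):
        raise ValueError("confidence must be Low, Medium, or High")
    if not isinstance(v.get("rationale"), str):
        raise ValueError("rationale must be a string")
    return v

# Illustrative example output, not a real model response.
verdict = parse_verdict(
    '{"classification": 1, "confidence": "High", "rationale": "Claim concerns renewable energy storage."}'
)
```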
 
 
## Part D: Targeted HITL & Final Fine-tuning
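The exception-based routing can be sketched as a simple predicate: a claim reaches a human only on deadlock or a low-confidence verdict. The README names the triggers but not their implementation, so the deadlock signal below is an assumed boolean flag.

```python
# Sketch of exception-based HITL routing (triggers per the README;
# the deadlock flag and claim records are assumptions for illustration).
def needs_human_review(judge_confidence: str, deadlock: bool) -> bool:
    """Escalate exceptions only; everything else keeps the Judge's label."""
    return deadlock or judge_confidence == "Low"

claims = [
    {"id": "c1", "confidence": "High", "deadlock": False},
    {"id": "c2", "confidence": "Low", "deadlock": False},
    {"id": "c3", "confidence": "Medium", "deadlock": True},
]
review_queue = [c["id"] for c in claims
                if needs_human_review(c["confidence"], c["deadlock"])]
```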
 
 
The gold-labelled dataset (`hitl_green_100_final.csv`) was used to fine-tune PatentSBERTa for 3 epochs using CosineSimilarityLoss on an AMD Radeon RX 9070 XT via DirectML (fell back to CPU, completed in ~31 minutes).
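For reference, CosineSimilarityLoss optimises the squared error between the cosine similarity of two embeddings and a gold similarity score. A dependency-free sketch of that objective follows; how training pairs were actually built from `hitl_green_100_final.csv` is not stated in this section, so the gold scores here are assumptions.

```python
import math

def cosine_similarity(u: list, v: list) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_similarity_loss(u: list, v: list, gold: float) -> float:
    """Squared error between cosine similarity and the gold score."""
    return (cosine_similarity(u, v) - gold) ** 2
```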
 
 
## Performance Results
 
 
The Final Model achieves the highest F1 score across all iterations, demonstrating that QLoRA domain adaptation combined with structured agent debate and targeted human review produces measurable improvements.
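For reference, the F1 score behind this comparison is the standard harmonic mean of precision and recall; the table's actual scores are not reproduced here.

```python
# Standard F1 from confusion-matrix counts (true/false positives, false negatives).
def f1_score(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```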
 
 
## Key Findings
 
 
- torchao 0.16.0 conflicts with transformers lazy loader in certain environments
- LM Studio local inference adds latency (~3–5s per agent call)
 
 
## Repository Contents
 
 
| `qlora_outputs.zip` | QLoRA adapter weights (`qlora_green_patent_adapter/`) |
| `Part C Step 1.ipynb` | Colab notebook for QLoRA fine-tuning |
 
 
## Related Repositories