# Green Patent Detection: Advanced Agentic Workflow with QLoRA
## Project Summary
This final assignment synthesizes Assignments 2 and 3 into a complete data-labelling pipeline. A generative LLM is fine-tuned via QLoRA to understand patent language, then integrated as the judge of a Multi-Agent System (MAS) that debates and labels complex patent claims. Finally, a targeted Human-in-the-Loop (HITL) review step produces a gold dataset for a final round of PatentSBERTa fine-tuning.
---
## Pipeline Architecture
```
patents_50k_green.parquet
  ↓
[Part A & B] Baseline PatentSBERTa + Uncertainty Sampling
  → Top 100 high-risk claims (u ≈ 1.0)
  ↓
[Part C – Step 1] QLoRA Fine-tuning on Colab (Qwen3-8B, 4-bit, 3 epochs)
  → qlora_green_patent_adapter (LoRA weights)
  → Qwen3-8B.Q4_K_M.gguf (served via LM Studio)
  ↓
[Part C – Step 2] Multi-Agent System (CrewAI)
  → Agent 1 – Advocate (Qwen3-4B, argues for green)
  → Agent 2 – Skeptic (Qwen3-4B, argues against green)
  → Agent 3 – Judge (QLoRA Qwen3-8B, final verdict)
  ↓
[Part D] Exception-Based HITL (only deadlocks / low-confidence)
  → 26 claims escalated for deadlock or low confidence, 3 human overrides
  → hitl_green_100_final.csv (gold labels)
  ↓
[Part D] Final PatentSBERTa Fine-tuning on gold dataset
  → patentsberta_finetuned_final/
```
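The exact uncertainty measure behind the sampling step isn't spelled out above; a common choice for a binary classifier, which peaks at u = 1.0 when the model is maximally unsure, is sketched below in plain Python (function names are illustrative, not from the project code):

```python
def uncertainty(p_green: float) -> float:
    """Binary-classifier uncertainty: 1.0 at p = 0.5 (maximally unsure),
    0.0 at p = 0.0 or 1.0 (fully certain)."""
    return 1.0 - 2.0 * abs(p_green - 0.5)

def top_k_high_risk(probs: list[float], k: int = 100) -> list[int]:
    """Indices of the k claims the baseline model is least sure about."""
    return sorted(range(len(probs)), key=lambda i: -uncertainty(probs[i]))[:k]

# Toy probabilities from a baseline classifier
probs = [0.05, 0.49, 0.93, 0.51, 0.75]
print(top_k_high_risk(probs, k=2))  # [1, 3] -- the claims closest to p = 0.5
```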
## Part C – Step 1: QLoRA Domain Adaptation
The generative LLM fine-tuning was performed on Google Colab (T4, 15 GB VRAM) using Unsloth's QLoRA implementation.
| Parameter | Value |
|---|---|
| Base model | `unsloth/Qwen3-8B-bnb-4bit` |
| LoRA rank (r) | 16 |
| LoRA alpha | 16 |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Training examples | 2,000 (train_silver, Alpaca format) |
| Epochs | 3 (375 total steps) |
| Batch size | 4 Γ 4 gradient accumulation = effective 16 |
| Learning rate | 2e-4 (AdamW 8-bit, linear schedule) |
| Max sequence length | 2,048 tokens |
| Training loss | 0.8899 |
| Training time | ~105 minutes on T4 |
| VRAM usage | ~5 GB (4-bit quantization) |
The fine-tuned adapter was exported to GGUF Q4_K_M format (4.682 GB) and served locally via LM Studio for use in the MAS.
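The schedule in the table is internally consistent, which a few lines of arithmetic confirm:

```python
# Sanity-check the QLoRA training schedule from the table above.
train_examples = 2_000
per_device_batch = 4
grad_accum = 4
epochs = 3

effective_batch = per_device_batch * grad_accum        # 16
steps_per_epoch = train_examples // effective_batch    # 125
total_steps = steps_per_epoch * epochs                 # 375, matching the table

print(effective_batch, steps_per_epoch, total_steps)   # 16 125 375
```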
## Part C – Step 2: Multi-Agent System
Three agents collaborate to label each of the 100 high-risk patent claims using CrewAI as the orchestration framework. The QLoRA fine-tuned model serves as the Judge's brain.
| Agent | Model | Temperature | Role |
|---|---|---|---|
| Advocate | Qwen3-4B (LM Studio) | 0.1 | Argues FOR Y02 green classification |
| Skeptic | Qwen3-4B (LM Studio) | 0.1 | Argues AGAINST (identifies greenwashing) |
| Judge | QLoRA Qwen3-8B (LM Studio) | 0.1 | Weighs debate and produces final JSON label |
Each claim produces a structured JSON output: `classification` (0/1), `confidence` (Low/Medium/High), and `rationale`.
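The field names above come from the project's output schema; a minimal validator for such a verdict might look like this (the parsing code itself is a hypothetical sketch, not the project's implementation):

```python
import json

ALLOWED_CONF = {"Low", "Medium", "High"}

def parse_verdict(raw: str) -> dict:
    """Validate the Judge's structured JSON verdict: classification (0/1),
    confidence (Low/Medium/High), and a free-text rationale."""
    verdict = json.loads(raw)
    if verdict.get("classification") not in (0, 1):
        raise ValueError("classification must be 0 or 1")
    if verdict.get("confidence") not in ALLOWED_CONF:
        raise ValueError("confidence must be Low, Medium, or High")
    if not isinstance(verdict.get("rationale"), str):
        raise ValueError("rationale must be a string")
    return verdict

sample = '{"classification": 1, "confidence": "High", "rationale": "Claim covers CO2 capture."}'
print(parse_verdict(sample)["confidence"])  # High
```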
## Part D: Targeted HITL & Final Fine-tuning
**Exception-Based HITL** was applied: humans intervened only when agents reached a deadlock or produced low-confidence outputs.
| Metric | Value |
|---|---|
| Total claims reviewed by MAS | 100 |
| Auto-accepted (high confidence) | 74 |
| Escalated to human review | 26 |
| Human overrides | 3 |
| Human agreement rate with Judge | 88.5% |
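The routing rule and the headline numbers in the table can be reproduced in a few lines (the `escalate` rule is a hypothetical rendering of the exception-based policy described above):

```python
def escalate(judge_confidence: str, deadlock: bool) -> bool:
    """Exception-based routing: only deadlocks or low-confidence verdicts
    reach a human reviewer (rule as described; thresholds are an assumption)."""
    return deadlock or judge_confidence == "Low"

# Reproduce the table's headline numbers.
total, escalated, overrides = 100, 26, 3
agreement = round(100 * (escalated - overrides) / escalated, 1)  # 88.5
auto_accepted = total - escalated                                # 74

print(agreement, auto_accepted)  # 88.5 74
```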
The gold-labelled dataset (`hitl_green_100_final.csv`) was used to fine-tune PatentSBERTa for 3 epochs with CosineSimilarityLoss. Training was attempted on an AMD Radeon RX 9070 XT via DirectML but fell back to CPU, completing in ~31 minutes.
## Performance Results
| Model Version | Training Data Source | F1 Score (Test Set) |
|---|---|---|
| 1. Baseline | Frozen Embeddings (No Fine-tuning) | 0.7494 |
| 2. Assignment 2 Model | Silver + Gold (Simple Generic LLM) | 0.7465 |
| 3. Assignment 3 Model | Silver + Gold (Advanced Techniques / MAS) | 0.7467 |
| 4. Final Model | Silver + Gold (QLoRA-Powered MAS + Targeted HITL) | 0.7530 |
The Final Model achieves the highest F1 score across all iterations, demonstrating that QLoRA domain adaptation combined with structured agent debate and targeted human review produces measurable improvements.
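For reference, F1 is the harmonic mean of precision and recall; a minimal pure-Python computation (the confusion counts below are toy values, not the project's test set):

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 score from confusion-matrix counts:
    harmonic mean of precision (tp/(tp+fp)) and recall (tp/(tp+fn))."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy counts for illustration only
print(round(f1(tp=8, fp=2, fn=4), 4))  # 0.7273
```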
## Key Findings
**QLoRA advantages:**
- Adapts a generative LLM to patent language with only 0.53% of parameters trained
- Enables a domain-aware Judge that understands Y02 classification logic
- 4-bit quantization fits 8B model on a free 15 GB T4 GPU
**MAS + HITL advantages:**
- Debate structure surfaces disagreements that single-model approaches miss
- Exception-based HITL reduces human effort by 74% (26 vs 100 reviews)
- Gold labels are higher-quality than silver LLM labels alone
**Limitations:**
- DirectML (AMD GPU) is not fully supported by sentence-transformers training, so training fell back to CPU
- torchao 0.16.0 conflicts with transformers lazy loader in certain environments
## Repository Contents
| File | Description |
|---|---|
| `Final_Assignment.ipynb` | Main notebook (Parts AβD) |
| `patentsberta_finetuned_final/` | Final fine-tuned PatentSBERTa model |
| `hitl_green_100_final.csv` | Gold dataset: 100 claims with HITL labels and debate rationales |
| `final_classifier.joblib` | Serialised final Logistic Regression classifier |
| `qlora_outputs.zip` | QLoRA adapter weights (`qlora_green_patent_adapter/`) |
| `Part C Step 1.ipynb` | Colab notebook for QLoRA fine-tuning |
| `Debate transcripts` | All debate transcripts for the MAS |
## Related Repositories
- [Assignment 1](https://github.com/Ory999/Portfolio-Assignment-1-M4-SGD-Mechanics-Attention-Context)
- [Assignment 2](https://huggingface.co/Ory999/Assignment_2)
- [Assignment 3](https://huggingface.co/Ory999/Assignment_3)
- Video link: https://aaudk-my.sharepoint.com/:v:/g/personal/dl02af_student_aau_dk/IQCPwPoohhSGQr_1T0-tmw7-AZHTGk2JIY01ZNHmYpGHiNQ?email=hamidb%40business.aau.dk&e=GYbwCw&nav=eyJyZWZlcnJhbEluZm8iOnsicmVmZXJyYWxBcHAiOiJTdHJlYW1XZWJBcHAiLCJyZWZlcnJhbFZpZXciOiJTaGFyZURpYWxvZy1MaW5rIiwicmVmZXJyYWxBcHBQbGF0Zm9ybSI6IldlYiIsInJlZmVycmFsTW9kZSI6InZpZXcifX0%3D