---
license: mit
tags:
- Jerome Powell AI model
- Federal Reserve chatbot
- fine-tuned Phi-3
- financial language model
- LLM fine-tuning
- machine learning engineering
- LoRA training
- NLP
datasets:
- BoostedJonP/JeromePowell-SFT
language:
- en
base_model:
- microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-generation
---
# Powell-Phi3-Mini: Jerome Powell Style Language Model
[Model on Hugging Face](https://huggingface.co/BoostedJonP/powell-phi3-mini) · [License: MIT](https://opensource.org/licenses/MIT) · [Hardware: NVIDIA Tesla P100](https://images.nvidia.com/content/tesla/pdf/nvidia-tesla-p100-PCIe-datasheet.pdf) · [LoRA paper](https://arxiv.org/abs/2106.09685)
## 🎯 Summary
**Powell-Phi3-Mini** is a fine-tuned language model that replicates Federal Reserve Chair Jerome Powell's distinctive communication style, tone, and strategic hedging patterns. The project demonstrates **modern LLM fine-tuning techniques**, **parameter-efficient training methods**, and **responsible AI development**, reflecting industry-ready machine learning engineering practice.
---
## 🚀 Key Features & Capabilities
### **Style Mimicry & Linguistic Analysis**
- ✅ **Authentic Communication Style**: Replicates Powell's cautious, data-dependent rhetoric
- ✅ **Strategic Hedging Patterns**: Maintains appropriate uncertainty in speculative scenarios
- ✅ **Domain-Specific Responses**: Handles economic and monetary policy discussions in context
- ✅ **Refusal Training**: Declines, to an extent, to provide financial advice or policy predictions (see the probe sketch below)
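
To see the hedging and refusal behavior in practice, a quick probe can be sent through the model's chat template. A minimal sketch; the prompt is illustrative and completions will vary between runs:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("BoostedJonP/powell-phi3-mini")
model = AutoModelForCausalLM.from_pretrained("BoostedJonP/powell-phi3-mini", device_map="auto")

# An advice-seeking prompt the model is trained to deflect
messages = [{"role": "user", "content": "Should I buy bonds before the next FOMC meeting?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=150, do_sample=True)
# Decode only the newly generated tokens
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```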
### **Technical Implementation**
- ✅ **Efficient Architecture**: Built on Microsoft Phi-3-mini-4k-instruct (3.8B parameters)
- ✅ **Scalable Training**: LoRA r=16, alpha=32 configuration sized for consumer GPUs
- ✅ **Deployment Flexibility**: Available as a lightweight adapter or a fully merged model (both shown under Implementation Examples below)
- ✅ **Integration Ready**: Inference in a few lines of Hugging Face Transformers code
---
## 💻 Implementation Examples
### Production Ready - Merged Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model and tokenizer from the Hub
tokenizer = AutoTokenizer.from_pretrained("BoostedJonP/powell-phi3-mini")
model = AutoModelForCausalLM.from_pretrained("BoostedJonP/powell-phi3-mini", device_map="auto")

# Economic analysis prompt
prompt = "How is the current labor market affecting your inflation outlook?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # match device_map placement
response = model.generate(**inputs, max_new_tokens=200, do_sample=True)
print(tokenizer.decode(response[0], skip_special_tokens=True))
```
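### Lightweight - Adapter Version
For smaller downloads, the LoRA adapter can be applied on top of the stock base model with `peft`. A minimal sketch, assuming the adapter repository pairs with the base model listed in the specifications below:

```python
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the stock base model, then attach the fine-tuned LoRA adapter
base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct", device_map="auto")
model = PeftModel.from_pretrained(base, "BoostedJonP/powell-phi3-mini-adapter")
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Optional: fold the adapter into the base weights for faster inference
model = model.merge_and_unload()
```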
---
## 📊 Technical Specifications & Training Pipeline
### **Model Architecture**
| Component | Specification |
|-----------|---------------|
| **Base Model** | microsoft/Phi-3-mini-4k-instruct (3.8B parameters) |
| **License** | MIT License (Commercial Use Approved) |
| **Fine-tuning Method** | QLoRA with PEFT integration |
| **Context Length** | 4,096 tokens |
| **Training Hardware** | NVIDIA Tesla P100 (16 GB VRAM) |
### **Training Configuration**
| Hyperparameter | Value | Rationale |
|----------------|-------|-----------|
| **LoRA Rank (r)** | 16 | Optimal parameter/performance balance |
| **LoRA Alpha** | 32 | 2x rank for stable training |
| **Dropout Rate** | 0.05 | Regularization without overfitting |
| **Learning Rate** | 1.5e-4 | Conservative rate for stable convergence |
| **Scheduler** | Cosine decay | Smooth learning rate reduction |
| **Training Epochs** | 3 | Prevents overfitting on specialized domain |
| **Sequence Length** | 1,536 tokens | Optimized for dataset |
| **Precision** | Mixed fp16 | Roughly halves memory use while preserving accuracy |
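
A sketch of how these hyperparameters might map onto `peft` and `transformers` for QLoRA training. The target modules and batch-size settings are assumptions for illustration, not the exact training script:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization for QLoRA; fp16 compute since the P100 has no bf16 support
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA settings from the table above; target modules are an assumption
# based on Phi-3's fused attention/MLP projection names
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Trainer hyperparameters from the table; batch size is illustrative
training_args = TrainingArguments(
    output_dir="powell-phi3-mini",
    learning_rate=1.5e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=3,
    fp16=True,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
)
```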
### **Dataset & Methodology**
- **Data Source**: Public domain FOMC transcripts and Federal Reserve speeches -> [Jerome Powell Press Release Q&A](https://www.kaggle.com/datasets/jonathanpaserman/fed-press-release-text)
- **Data Processing**: Instruction-response pairs extracted from press conferences -> [Jerome Powell Press Release SFT data processing](https://www.kaggle.com/code/jonathanpaserman/jerome-powell-press-release-sft-data-processing)
  - Also available on [Hugging Face](https://huggingface.co/datasets/BoostedJonP/JeromePowell-SFT)
- **Quality Control**: Manual review and filtering for authentic Powell communication patterns
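
The dataset can be pulled directly from the Hub for inspection; the split and column names below reflect typical SFT layouts and may differ from the published schema:

```python
from datasets import load_dataset

# SFT dataset published alongside the model
ds = load_dataset("BoostedJonP/JeromePowell-SFT")
print(ds)              # inspect available splits
print(ds["train"][0])  # inspect one instruction-response pair
```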
---
## 📈 Performance Metrics & Evaluation
### **Quantitative Results**
| Metric | Baseline (Phi-3) | Powell-Phi3-Mini | Improvement |
|--------|------------------|------------------|-------------|
| **Powell-style Classification** | NA | NA | NA |
| **Economic Domain Accuracy** | NA | NA | NA |
| **Response Coherence (BLEU)** | NA | NA | NA |
### **Qualitative Assessment**
- NA
---
## 🚀 Deployment & Access
### **🌐 Live Demo**
**[Try Powell-Phi3-Mini Interactive Demo →](https://huggingface.co/spaces/BoostedJonP/powell-assistant)**
### **📦 Model Downloads**
- **Adapter Version**: `BoostedJonP/powell-phi3-mini-adapter`
- **Merged Model**: `BoostedJonP/powell-phi3-mini` (Full Model - 7.4GB)
### **📚 Resources**
- **[GitHub Repository](https://github.com/BigJonP/powell-phi3-sft)**: Complete training code and evaluation scripts
- **[Technical Blog Post](https://medium.com/@jonathanpaserman)**: Detailed implementation walkthrough
- **[Hugging Face Collection](https://huggingface.co/collections/BoostedJonP/jerome-powell-68b9e7843f64507481d24ce9)**: All model variants and datasets
---
## ⚖️ Responsible AI & Legal Compliance
### **Ethical Considerations**
- ⚠️ **No Official Affiliation**: Not endorsed by or affiliated with the Federal Reserve System
- ⚠️ **Educational Purpose Only**: Designed for research, education, and demonstration purposes
- ⚠️ **No Financial Advice**: Model responses should not be interpreted as investment guidance
- ⚠️ **Transparency**: All training data sourced from public domain government transcripts
### **Licensing & Usage Rights**
- **Base Model License**: MIT License (Microsoft Phi-3)
- **Fine-tuned Weights**: MIT License (Commercial use permitted)
- **Training Data**: Public domain (U.S. government works)
- **Usage**: Unrestricted for research, education, and commercial applications
---
### 👨‍💻 **Connect & Collaborate**
- **GitHub**: [Jonathan Paserman](https://github.com/BigJonP)
- **Kaggle**: [Jonathan Paserman](https://www.kaggle.com/jonathanpaserman)
- **HuggingFace**: [Jonathan Paserman](https://huggingface.co/BoostedJonP)