---
license: mit
tags:
- Jerome Powell AI model
- Federal Reserve chatbot
- fine-tuned Phi-3
- financial language model
- LLM fine-tuning
- machine learning engineering
- LoRA training
- NLP
datasets:
- BoostedJonP/JeromePowell-SFT
language:
- en
base_model:
- microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-generation
---

# Powell-Phi3-Mini: Jerome Powell Style Language Model

[Model on Hugging Face](https://huggingface.co/BoostedJonP/powell-phi3-mini)
[License: MIT](https://opensource.org/licenses/MIT)
[Hardware: NVIDIA Tesla P100](https://images.nvidia.com/content/tesla/pdf/nvidia-tesla-p100-PCIe-datasheet.pdf)
[Method: LoRA (arXiv:2106.09685)](https://arxiv.org/abs/2106.09685)

## 🎯 Summary

**Powell-Phi3-Mini** is a fine-tuned language model that replicates Federal Reserve Chair Jerome Powell's distinctive communication style, tone, and strategic hedging patterns. The project showcases **modern LLM fine-tuning techniques**, **parameter-efficient training methods**, and **responsible AI development**, demonstrating industry-ready machine learning engineering skills.

---

## 🚀 Key Features & Capabilities

### **Style Mimicry & Linguistic Analysis**
- ✅ **Authentic Communication Style**: Replicates Powell's cautious, data-dependent rhetoric
- ✅ **Strategic Hedging Patterns**: Maintains appropriate uncertainty in speculative scenarios
- ✅ **Domain-Specific Responses**: Handles economic and monetary policy discussions contextually
- ✅ **Refusal Training**: Declines, within limits, to provide financial advice or policy predictions

### **Technical Implementation**
- ✅ **Efficient Architecture**: Built on Microsoft Phi-3-mini-4k-instruct (3.8B parameters)
- ✅ **Scalable Training**: LoRA r=16, alpha=32 configuration optimized for consumer GPUs
- ✅ **Deployment Flexibility**: Available as a lightweight adapter or a full merged model
- ✅ **Integration Ready**: One-line inference with Hugging Face Transformers

---

## 💻 Implementation Examples

### Production Ready - Merged Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# One-line model loading
tokenizer = AutoTokenizer.from_pretrained("BoostedJonP/powell-phi3-mini")
model = AutoModelForCausalLM.from_pretrained("BoostedJonP/powell-phi3-mini", device_map="auto")

# Economic analysis prompt
prompt = "How is the current labor market affecting your inflation outlook?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
response = model.generate(**inputs, max_new_tokens=200, do_sample=True)
print(tokenizer.decode(response[0], skip_special_tokens=True))
```
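
### Lightweight - Adapter Version
As an alternative to the full 7.4GB download, the LoRA adapter can be attached to the base model at load time. This is a minimal sketch, assuming the `BoostedJonP/powell-phi3-mini-adapter` repo (listed under Model Downloads below) is stored in standard PEFT format.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the fine-tuned LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
model = PeftModel.from_pretrained(base, "BoostedJonP/powell-phi3-mini-adapter")

# Optional: fold the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```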

---

## 📊 Technical Specifications & Training Pipeline

### **Model Architecture**
| Component | Specification |
|-----------|---------------|
| **Base Model** | microsoft/Phi-3-mini-4k-instruct (3.8B parameters) |
| **License** | MIT License (Commercial Use Approved) |
| **Fine-tuning Method** | QLoRA with PEFT integration |
| **Context Length** | 4,096 tokens |
| **Training Hardware** | NVIDIA Tesla P100 (16GB VRAM) |

### **Training Configuration**
| Hyperparameter | Value | Rationale |
|----------------|-------|-----------|
| **LoRA Rank (r)** | 16 | Balances added parameters against performance |
| **LoRA Alpha** | 32 | 2x rank for stable training |
| **Dropout Rate** | 0.05 | Light regularization against overfitting |
| **Learning Rate** | 1.5e-4 | Conservative rate for stable convergence |
| **Scheduler** | Cosine decay | Smooth learning-rate reduction |
| **Training Epochs** | 3 | Prevents overfitting on a specialized domain |
| **Sequence Length** | 1,536 tokens | Matched to the dataset's length distribution |
| **Precision** | Mixed fp16 | Roughly halves memory use while maintaining accuracy |
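
For reference, below is a minimal sketch of how this QLoRA setup could be expressed with `peft` and `transformers`. The `target_modules` list and batch-size values are illustrative assumptions; only the hyperparameters in the table above are documented.
```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization for QLoRA, keeping Phi-3-mini within 16GB VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA settings from the table: r=16, alpha=32, dropout=0.05.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["qkv_proj", "o_proj"],  # assumption: attention projections
)

# Optimizer and schedule settings from the table.
training_args = TrainingArguments(
    output_dir="powell-phi3-mini",
    learning_rate=1.5e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=3,
    fp16=True,                        # the P100 has no bf16 support
    per_device_train_batch_size=1,    # assumption for a 16GB GPU
    gradient_accumulation_steps=8,    # assumption
)
```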

### **Dataset & Methodology**
- **Data Source**: Public domain FOMC transcripts and Federal Reserve speeches -> [Jerome Powell Press Release Q&A](https://www.kaggle.com/datasets/jonathanpaserman/fed-press-release-text)
- **Data Processing**: Instruction-response pairs extracted from press conferences -> [Jerome Powell Press Release SFT data processing](https://www.kaggle.com/code/jonathanpaserman/jerome-powell-press-release-sft-data-processing)
  - Also published on the [Hugging Face Hub](https://huggingface.co/datasets/BoostedJonP/JeromePowell-SFT); see the loading sketch after this list
- **Quality Control**: Manual review and filtering for authentic Powell communication patterns
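
The dataset can be pulled straight from the Hub. In this sketch the column names `instruction` and `response` are assumptions about the schema; check the dataset card for the actual fields.
```python
from datasets import load_dataset

# Load the published instruction-response pairs.
ds = load_dataset("BoostedJonP/JeromePowell-SFT", split="train")

# Reshape one pair into chat messages (column names are assumed).
def to_chat(example):
    return {
        "messages": [
            {"role": "user", "content": example["instruction"]},
            {"role": "assistant", "content": example["response"]},
        ]
    }

ds = ds.map(to_chat)
print(ds[0]["messages"])
```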

---

## 📈 Performance Metrics & Evaluation

### **Quantitative Results**
| Metric | Baseline (Phi-3) | Powell-Phi3-Mini | Improvement |
|--------|------------------|------------------|-------------|
| **Powell-style Classification** | NA | NA | **NA** |
| **Economic Domain Accuracy** | NA | NA | **NA** |
| **Response Coherence (BLEU)** | NA | NA | **NA** |
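
Once reference answers are collected, the BLEU row could be filled in with the `evaluate` library; a minimal sketch with placeholder strings is below.
```python
import evaluate

# Compare generated answers against held-out reference answers (placeholders).
bleu = evaluate.load("bleu")
predictions = ["Inflation remains somewhat elevated, and we will proceed carefully."]
references = [["Inflation remains elevated, and the Committee will proceed carefully."]]

result = bleu.compute(predictions=predictions, references=references)
print(f"BLEU: {result['bleu']:.3f}")
```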

### **Qualitative Assessment**
- NA

---


## 🚀 Deployment & Access

### **🔗 Live Demo**
**[Try Powell-Phi3-Mini Interactive Demo →](https://huggingface.co/spaces/BoostedJonP/powell-assistant)**

### **📦 Model Downloads**
- **Adapter Version**: `BoostedJonP/powell-phi3-mini-adapter`
- **Merged Model**: `BoostedJonP/powell-phi3-mini` (Full Model - 7.4GB)

### **📚 Resources**
- **[GitHub Repository](https://github.com/BigJonP/powell-phi3-sft)**: Complete training code and evaluation scripts
- **[Technical Blog Post](https://medium.com/@jonathanpaserman)**: Detailed implementation walkthrough
- **[Hugging Face Collection](https://huggingface.co/collections/BoostedJonP/jerome-powell-68b9e7843f64507481d24ce9)**: All model variants and datasets

---


## ⚖️ Responsible AI & Legal Compliance

### **Ethical Considerations**
- ⚠️ **No Official Affiliation**: Not endorsed by or affiliated with the Federal Reserve System
- ⚠️ **Educational Purpose Only**: Designed for research, education, and demonstration purposes
- ⚠️ **No Financial Advice**: Model responses should not be interpreted as investment guidance
- ⚠️ **Transparency**: All training data sourced from public domain government transcripts

### **Licensing & Usage Rights**
- **Base Model License**: MIT License (Microsoft Phi-3)
- **Fine-tuned Weights**: MIT License (Commercial use permitted)
- **Training Data**: Public domain (U.S. government works)
- **Usage**: Unrestricted for research, education, and commercial applications

---


### 👨‍💻 **Connect & Collaborate**
- **GitHub**: [Jonathan Paserman](https://github.com/BigJonP)
- **Kaggle**: [Jonathan Paserman](https://www.kaggle.com/jonathanpaserman)
- **HuggingFace**: [Jonathan Paserman](https://huggingface.co/BoostedJonP)