BoostedJonP committed · Commit 5174f31 · verified · 1 Parent(s): c92d403

Update README.md

Files changed (1): README.md (+149, -142)

---
license: mit
tags:
- Jerome Powell AI model
- Federal Reserve chatbot
- fine-tuned Phi-3
- financial language model
- LLM fine-tuning
- machine learning engineering
- LoRA training
- NLP
datasets:
- BoostedJonP/JeromePowell-SFT
language:
- en
base_model:
- microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-generation
---

# Powell-Phi3-Mini — Jerome Powell Style Language Model

[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-yellow)](https://huggingface.co/BoostedJonP/powell-phi3-mini)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
[![GPU Training](https://img.shields.io/badge/Trained%20on-TESLA%20P100-green)](https://images.nvidia.com/content/tesla/pdf/nvidia-tesla-p100-PCIe-datasheet.pdf)
[![Fine-tuning](https://img.shields.io/badge/Method-LoRA%2FQLoRA-orange)](https://arxiv.org/abs/2106.09685)

## 🎯 Summary

**Powell-Phi3-Mini** is a fine-tuned language model that replicates Federal Reserve Chair Jerome Powell's distinctive communication style, tone, and strategic hedging patterns. The project demonstrates **modern LLM fine-tuning techniques**, **parameter-efficient training methods**, and **responsible AI development** in a practical, end-to-end machine learning engineering workflow.

---

## 🚀 Key Features & Capabilities

### **Style Mimicry & Linguistic Analysis**
- ✅ **Authentic Communication Style**: Replicates Powell's cautious, data-dependent rhetoric
- ✅ **Strategic Hedging Patterns**: Maintains appropriate uncertainty in speculative scenarios
- ✅ **Domain-Specific Responses**: Handles economic and monetary policy discussions contextually
- ✅ **Refusal Training**: Appropriately declines to provide financial advice or policy predictions (to an extent)

### **Technical Implementation**
- ✅ **Efficient Architecture**: Built on Microsoft Phi-3-mini-4k-instruct (3.8B parameters)
- ✅ **Scalable Training**: LoRA r=16, alpha=32 configuration optimized for consumer GPUs
- ✅ **Deployment Flexibility**: Available as a lightweight adapter or a full merged model
- ✅ **Integration Ready**: One-line inference with Hugging Face Transformers

---

## 💻 Implementation Examples

### Production Ready: Merged Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the merged model
tokenizer = AutoTokenizer.from_pretrained("BoostedJonP/powell-phi3-mini")
model = AutoModelForCausalLM.from_pretrained("BoostedJonP/powell-phi3-mini", device_map="auto")

# Economic analysis prompt
prompt = "How is the current labor market affecting your inflation outlook?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # follow device_map placement
response = model.generate(**inputs, max_new_tokens=200, do_sample=True)
print(tokenizer.decode(response[0], skip_special_tokens=True))
```
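
Since the base model is instruction-tuned, prompts generally work better when routed through the chat template. A minimal sketch, assuming the fine-tuned tokenizer keeps Phi-3's chat template (not confirmed by this card):

```python
# Wrap the question in the instruct chat format before generating.
messages = [{"role": "user", "content": "How do you weigh inflation risks against employment?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, dropping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```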

---

## 📊 Technical Specifications & Training Pipeline

### **Model Architecture**
| Component | Specification |
|-----------|---------------|
| **Base Model** | microsoft/Phi-3-mini-4k-instruct (3.8B parameters) |
| **License** | MIT License (commercial use approved) |
| **Fine-tuning Method** | QLoRA with PEFT integration |
| **Context Length** | 4,096 tokens |
| **Training Hardware** | NVIDIA Tesla P100 (16 GB VRAM) |

### **Training Configuration**
| Hyperparameter | Value | Rationale |
|----------------|-------|-----------|
| **LoRA Rank (r)** | 16 | Balances added parameters against performance |
| **LoRA Alpha** | 32 | 2x rank for stable training |
| **Dropout Rate** | 0.05 | Light regularization against overfitting |
| **Learning Rate** | 1.5e-4 | Conservative rate for stable convergence |
| **Scheduler** | Cosine decay | Smooth learning-rate reduction |
| **Training Epochs** | 3 | Prevents overfitting on a specialized domain |
| **Sequence Length** | 1,536 tokens | Sized to the dataset examples |
| **Precision** | Mixed fp16 | ~2x memory savings with maintained accuracy |
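
The configuration above maps naturally onto `peft` and `transformers`. The sketch below is illustrative, not the repo's confirmed training script (see the GitHub repository for that): the `target_modules` list, the 4-bit quantization settings, and the batch size are assumptions.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, TrainingArguments

# QLoRA-style 4-bit quantization (illustrative; exact settings not stated in this card).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # fp16 compute, matching the table
)

# LoRA settings from the table; target_modules is an assumption
# (typical Phi-3 projection layers), not confirmed by this card.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
)

# Schedule and precision from the table; batch size is a placeholder for a 16 GB GPU.
args = TrainingArguments(
    output_dir="powell-phi3-mini",
    learning_rate=1.5e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    fp16=True,
)
```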

### **Dataset & Methodology**
- **Data Source**: Public-domain FOMC transcripts and Federal Reserve speeches → [Jerome Powell Press Release Q&A](https://www.kaggle.com/datasets/jonathanpaserman/fed-press-release-text)
- **Data Processing**: Instruction-response pairs extracted from press conferences → [Jerome Powell Press Release SFT data processing](https://www.kaggle.com/code/jonathanpaserman/jerome-powell-press-release-sft-data-processing)
  - Also available on [Hugging Face](https://huggingface.co/datasets/BoostedJonP/JeromePowell-SFT); see the loading sketch below
- **Quality Control**: Manual review and filtering for authentic Powell communication patterns
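
The SFT dataset can be pulled straight from the Hub. A minimal sketch; the split name and column layout are assumptions, so check the dataset card:

```python
from datasets import load_dataset

# Load the instruction-response pairs; the "train" split is an assumption.
ds = load_dataset("BoostedJonP/JeromePowell-SFT", split="train")
print(ds.column_names)  # inspect the schema before wiring up training
print(ds[0])
```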

---

## 📈 Performance Metrics & Evaluation

### **Quantitative Results**
| Metric | Baseline (Phi-3) | Powell-Phi3-Mini | Improvement |
|--------|------------------|------------------|-------------|
| **Powell-style Classification** | NA | NA | **NA** |
| **Economic Domain Accuracy** | NA | NA | **NA** |
| **Response Coherence (BLEU)** | NA | NA | **NA** |
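
The numbers above are not filled in yet. For the BLEU row, a minimal scoring sketch with the `evaluate` library; the strings below are illustrative placeholders, not samples from the dataset:

```python
import evaluate

# Compare generated answers against held-out reference answers.
bleu = evaluate.load("bleu")
predictions = ["Inflation remains elevated, and we will proceed carefully."]
references = [["Inflation remains too high, and we are proceeding carefully."]]
print(bleu.compute(predictions=predictions, references=references))
```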

### **Qualitative Assessment**
- NA

---

## 🌐 Deployment & Access

### **🚀 Live Demo**
**[Try Powell-Phi3-Mini Interactive Demo →](https://huggingface.co/spaces/BoostedJonP/powell-phi3-demo)**

### **📦 Model Downloads**
- **Adapter Version**: `BoostedJonP/powell-phi3-mini-adapter` (lightweight LoRA adapter; see the loading sketch below)
- **Merged Model**: `BoostedJonP/powell-phi3-mini` (full model, 7.4 GB)
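
To use the adapter variant, attach it to the base model with `peft`. A minimal sketch, assuming the adapter pairs with the stock Phi-3 tokenizer:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then layer the LoRA adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct", device_map="auto"
)
model = PeftModel.from_pretrained(base, "BoostedJonP/powell-phi3-mini-adapter")
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Optional: merge the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```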

### **🔗 Resources**
- **[GitHub Repository](https://github.com/BigJonP/powell-phi3-sft)**: Complete training code and evaluation scripts
- **[Technical Blog Post](https://medium.com/@jonathanpaserman)**: Detailed implementation walkthrough
- **[Hugging Face Collection](https://huggingface.co/collections/BoostedJonP/jerome-powell-68b9e7843f64507481d24ce9)**: All model variants and datasets

---

## ⚖️ Responsible AI & Legal Compliance

### **Ethical Considerations**
- ⚠️ **No Official Affiliation**: Not endorsed by or affiliated with the Federal Reserve System
- ⚠️ **Educational Purpose Only**: Designed for research, education, and demonstration purposes
- ⚠️ **No Financial Advice**: Model responses should not be interpreted as investment guidance
- ⚠️ **Transparency**: All training data sourced from public-domain government transcripts

### **Licensing & Usage Rights**
- **Base Model License**: MIT License (Microsoft Phi-3)
- **Fine-tuned Weights**: MIT License (commercial use permitted)
- **Training Data**: Public domain (U.S. government works)
- **Usage**: Unrestricted for research, education, and commercial applications

---

### 👨‍💻 **Connect & Collaborate**
- **GitHub**: [Jonathan Paserman](https://github.com/BigJonP)
- **Kaggle**: [Jonathan Paserman](https://www.kaggle.com/jonathanpaserman)
- **HuggingFace**: [Jonathan Paserman](https://huggingface.co/BoostedJonP)