# FinGPT Migration Guide
## Overview
This document describes the migration of the FinRobot Forecaster application from the Google Gemini 2.5 Pro API to the FinGPT model.
## Changes Made
### 1. Dependencies Updated
- **Removed**: `google-generativeai>=0.3.0`
- **Added**:
- `transformers>=4.30.0`
- `torch>=2.0.0`
- `accelerate>=0.20.0`
- `peft>=0.4.0`
- `bitsandbytes>=0.39.0`
### 2. Model Configuration
- **Old**: `GEMINI_MODEL = "gemini-2.5-pro"`
- **New**: `FINGPT_MODEL_NAME = "Starfish55/fingpt-complete"`
### 3. API Key Changes
- **Old**: `GOOGLE_API_KEYS` environment variable
- **New**: `HF_TOKEN` environment variable for Hugging Face access
### 4. Model Loading
- **Old**: Google Generative AI client initialization
- **New**: Hugging Face Transformers with BitsAndBytesConfig for memory efficiency
### 5. Inference Function
- **Old**: `genai.GenerativeModel.generate_content()`
- **New**: `model.generate()` with custom tokenization and decoding
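A minimal sketch of the new inference path, with custom tokenization and decoding. The function name `generate_forecast` and the sampling parameters (`max_new_tokens`, `temperature`) are illustrative assumptions, not the app's exact settings.

```python
import torch

def generate_forecast(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Tokenize the prompt, run model.generate(), and decode only the new tokens."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
        )
    # generate() returns the prompt followed by the completion;
    # slice off the prompt tokens so only newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```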
## Environment Variables Required
### For Hugging Face Spaces:
```bash
HF_TOKEN=your_hugging_face_token_here
FINNHUB_KEYS=your_finnhub_api_keys_here
RAPIDAPI_KEYS=your_rapidapi_keys_here
```
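The plural variable names (`FINNHUB_KEYS`, `RAPIDAPI_KEYS`) suggest each variable can hold several keys. A sketch of reading them in Python, assuming a comma-separated convention (the actual separator the app uses may differ):

```python
import os

def load_api_keys(var_name: str) -> list[str]:
    """Read a comma-separated list of API keys from an environment variable.

    The comma-separated convention is an assumption; adjust to match
    how the Space actually stores multiple keys.
    """
    raw = os.environ.get(var_name, "")
    return [key.strip() for key in raw.split(",") if key.strip()]

hf_token = os.environ.get("HF_TOKEN")  # single token for Hugging Face access
finnhub_keys = load_api_keys("FINNHUB_KEYS")
rapidapi_keys = load_api_keys("RAPIDAPI_KEYS")
```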
## Model Features
### FinGPT Advantages:
1. **Specialized for Finance**: Trained specifically on financial data
2. **Open Source**: No API rate limits or costs
3. **Memory Efficient**: Uses 4-bit quantization
4. **Real-time Updates**: Can be fine-tuned on the latest financial data
### Performance Considerations:
- **Memory Usage**: ~4-8GB GPU memory (with quantization)
- **Inference Speed**: Slower than hosted API calls, but unaffected by rate limits or API downtime
- **Model Size**: ~7B parameters (much smaller than Gemini)
## Testing
Run the test script to verify the integration:
```bash
python test_fingpt_integration.py
```
## Deployment Notes
1. **Hugging Face Spaces**: Ensure `HF_TOKEN` is set in secrets
2. **GPU Requirements**: Requires a GPU with at least 8GB of VRAM
3. **Model Loading**: First run may take longer due to model download
4. **Fallback**: App will use mock responses if model fails to load
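The fallback behavior in point 4 can be sketched as a thin wrapper around the loading step. The helper name `safe_load` and the mock response text are illustrative assumptions, not the app's actual implementation:

```python
def safe_load(load_fn):
    """Try to load the real model; fall back to a mock responder on failure.

    Returns (result, is_mock): the loader's result on success, or a
    stand-in generate function when loading raises.
    """
    try:
        return load_fn(), False
    except Exception as exc:
        print(f"Model load failed ({exc}); using mock responses.")
        def mock_generate(prompt: str) -> str:
            # Placeholder output so the UI keeps working without the model.
            return "[mock forecast - model unavailable]"
        return mock_generate, True
```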
## Troubleshooting
### Common Issues:
1. **Out of Memory**: Reduce `MAX_LENGTH` or use CPU inference
2. **Model Loading Fails**: Check `HF_TOKEN` and internet connection
3. **Slow Inference**: Consider using smaller model or CPU inference
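The CPU-fallback advice in points 1 and 3 can be expressed as a small device check before loading. This is a sketch under the assumption that CPU inference should disable 4-bit quantization and use float32; the helper name is hypothetical:

```python
import torch

def pick_device_config() -> dict:
    """Choose loading settings based on available hardware.

    With a CUDA GPU: 4-bit quantization and float16.
    Without one: plain CPU loading (bitsandbytes 4-bit needs a GPU).
    """
    if torch.cuda.is_available():
        return {"device_map": "auto", "load_in_4bit": True, "torch_dtype": torch.float16}
    return {"device_map": "cpu", "load_in_4bit": False, "torch_dtype": torch.float32}
```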
### Debug Mode:
Set `debug=True` in the app's launch call to see detailed error messages.
## Migration Benefits
1. **Cost Effective**: No API costs for inference
2. **Privacy**: Data stays local, no external API calls
3. **Reliability**: No rate limits or API downtime
4. **Customization**: Can fine-tune for specific financial tasks
5. **Transparency**: Full control over model behavior
## Next Steps
1. Test the application with real financial data
2. Fine-tune the model if needed for specific use cases
3. Monitor performance and adjust parameters
4. Consider implementing model caching for faster startup