# FinGPT Migration Guide
## Overview
This document describes the migration of the FinRobot Forecaster application from Google Gemini 2.5 Pro to the FinGPT model.
## Changes Made
### 1. Dependencies Updated
- **Removed**: `google-generativeai>=0.3.0`
- **Added**:
  - `transformers>=4.30.0`
  - `torch>=2.0.0`
  - `accelerate>=0.20.0`
  - `peft>=0.4.0`
  - `bitsandbytes>=0.39.0`
### 2. Model Configuration
- **Old**: `GEMINI_MODEL = "gemini-2.5-pro"`
- **New**: `FINGPT_MODEL_NAME = "Starfish55/fingpt-complete"`
### 3. API Key Changes
- **Old**: `GOOGLE_API_KEYS` environment variable
- **New**: `HF_TOKEN` environment variable for Hugging Face access
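The new lookup can be sketched as below; `get_hf_token` is a hypothetical helper (only the `HF_TOKEN` variable name comes from this guide):

```python
import os

def get_hf_token(env=os.environ):
    """Look up the Hugging Face token (replaces the old GOOGLE_API_KEYS lookup)."""
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it to your Space secrets.")
    return token
```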
### 4. Model Loading
- **Old**: Google Generative AI client initialization
- **New**: Hugging Face Transformers with BitsAndBytesConfig for memory efficiency
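A minimal sketch of the new loading path, assuming the dependencies listed above; the model name comes from this guide, while the exact 4-bit settings are illustrative assumptions:

```python
def load_fingpt(model_name="Starfish55/fingpt-complete", hf_token=None):
    """Load FinGPT with 4-bit quantization via BitsAndBytesConfig (sketch)."""
    # Imports are kept local so the app can fall back to mock responses
    # when torch/transformers are unavailable.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # 4-bit weights, ~4-8 GB of VRAM
        bnb_4bit_compute_dtype=torch.float16,  # run compute in fp16
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place layers on the available GPU
        token=hf_token,
    )
    return tokenizer, model
```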
### 5. Inference Function
- **Old**: `genai.GenerativeModel.generate_content()`
- **New**: `model.generate()` with custom tokenization and decoding
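The tokenize/generate/decode round trip might look like the sketch below; the function name and generation settings are illustrative, not the app's actual code:

```python
def generate_response(tokenizer, model, prompt, max_new_tokens=256):
    """Stand-in for generate_content(): tokenize, generate, decode (sketch)."""
    import torch

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```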
## Environment Variables Required
### For Hugging Face Spaces:
```bash
HF_TOKEN=your_hugging_face_token_here
FINNHUB_KEYS=your_finnhub_api_keys_here
RAPIDAPI_KEYS=your_rapidapi_keys_here
```
## Model Features
### FinGPT Advantages:
1. **Specialized for Finance**: Trained specifically on financial data
2. **Open Source**: No API rate limits or costs
3. **Memory Efficient**: Uses 4-bit quantization
4. **Updatable**: Can be fine-tuned with the latest financial data
### Performance Considerations:
- **Memory Usage**: ~4-8 GB of GPU memory (with 4-bit quantization)
- **Inference Speed**: Slower than API calls, but not subject to rate limits or outages
- **Model Size**: ~7B parameters (much smaller than Gemini)
## Testing
Run the test script to verify the integration:
```bash
python test_fingpt_integration.py
```
## Deployment Notes
1. **Hugging Face Spaces**: Ensure `HF_TOKEN` is set in the Space secrets
2. **GPU Requirements**: Requires a GPU with at least 8 GB of VRAM
3. **Model Loading**: The first run may take longer while the model weights download
4. **Fallback**: The app will use mock responses if the model fails to load
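The fallback behavior in note 4 can be sketched as a small wrapper; `build_responder` and the mock text are hypothetical names, not the app's actual implementation:

```python
def build_responder(load_model):
    """Wrap model loading so the app degrades to mock responses on failure."""
    try:
        # load_model is expected to return a callable: prompt -> text.
        return load_model()
    except Exception as exc:
        print(f"Model load failed: {exc}; falling back to mock responses.")
        return lambda prompt: f"(mock response for: {prompt})"
```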
## Troubleshooting
### Common Issues:
1. **Out of Memory**: Reduce `MAX_LENGTH` or switch to CPU inference
2. **Model Loading Fails**: Check `HF_TOKEN` and the internet connection
3. **Slow Inference**: Consider a smaller model or CPU inference
### Debug Mode:
Set `debug=True` in the app launch to see detailed error messages.
## Migration Benefits
1. **Cost Effective**: No API costs for inference
2. **Privacy**: Data stays local; no external API calls
3. **Reliability**: No rate limits or API downtime
4. **Customization**: Can be fine-tuned for specific financial tasks
5. **Transparency**: Full control over model behavior
## Next Steps
1. Test the application with real financial data
2. Fine-tune the model if needed for specific use cases
3. Monitor performance and adjust generation parameters
4. Consider caching the loaded model for faster startup
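One simple way to realize step 4 is process-level caching with `functools.lru_cache`; the loader body here is a placeholder, not the real loading code:

```python
from functools import lru_cache

@lru_cache(maxsize=1)
def cached_model(model_name="Starfish55/fingpt-complete"):
    """Load once per process; repeated calls reuse the same object."""
    print(f"Loading {model_name} ...")  # printed only on the first call
    return object()                     # placeholder for the real loader
```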