# FinGPT Migration Guide
## Overview
This document describes the migration of the FinRobot Forecaster application from the Google Gemini 2.5 Pro API to the FinGPT model.
## Changes Made
### 1. Dependencies Updated
- **Removed**: `google-generativeai>=0.3.0`
- **Added**:
- `transformers>=4.30.0`
- `torch>=2.0.0`
- `accelerate>=0.20.0`
- `peft>=0.4.0`
- `bitsandbytes>=0.39.0`
### 2. Model Configuration
- **Old**: `GEMINI_MODEL = "gemini-2.5-pro"`
- **New**: `FINGPT_MODEL_NAME = "Starfish55/fingpt-complete"`
### 3. API Key Changes
- **Old**: `GOOGLE_API_KEYS` environment variable
- **New**: `HF_TOKEN` environment variable for Hugging Face access
### 4. Model Loading
- **Old**: Google Generative AI client initialization
- **New**: Hugging Face Transformers with BitsAndBytesConfig for memory efficiency
### 5. Inference Function
- **Old**: `genai.GenerativeModel.generate_content()`
- **New**: `model.generate()` with custom tokenization and decoding
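A minimal sketch of the new inference path, with custom tokenization and decoding. The function name `generate_forecast` and the sampling parameters (`max_new_tokens`, `temperature`) are illustrative assumptions, not the app's exact settings.

```python
import torch

def generate_forecast(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Tokenize the prompt, run model.generate(), and decode only the new tokens."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
        )
    # generate() returns the prompt followed by the completion;
    # slice off the prompt tokens so only newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```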
## Environment Variables Required
### For Hugging Face Spaces:
```bash
HF_TOKEN=your_hugging_face_token_here
FINNHUB_KEYS=your_finnhub_api_keys_here
RAPIDAPI_KEYS=your_rapidapi_keys_here
```
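The plural variable names (`FINNHUB_KEYS`, `RAPIDAPI_KEYS`) suggest each variable can hold several keys. A sketch of reading them in Python, assuming a comma-separated convention (the actual separator the app uses may differ):

```python
import os

def load_api_keys(var_name: str) -> list[str]:
    """Read a comma-separated list of API keys from an environment variable.

    The comma-separated convention is an assumption; adjust to match
    how the Space actually stores multiple keys.
    """
    raw = os.environ.get(var_name, "")
    return [key.strip() for key in raw.split(",") if key.strip()]

hf_token = os.environ.get("HF_TOKEN")  # single token for Hugging Face access
finnhub_keys = load_api_keys("FINNHUB_KEYS")
rapidapi_keys = load_api_keys("RAPIDAPI_KEYS")
```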
## Model Features
### FinGPT Advantages:
1. **Specialized for Finance**: Trained specifically on financial data
2. **Open Source**: No API rate limits or costs
3. **Memory Efficient**: Uses 4-bit quantization
4. **Real-time Updates**: Can be fine-tuned on the latest financial data
### Performance Considerations:
- **Memory Usage**: ~4-8GB GPU memory (with quantization)
- **Inference Speed**: Slower than hosted API calls, but unaffected by rate limits or API downtime
- **Model Size**: ~7B parameters (much smaller than Gemini)
## Testing
Run the test script to verify the integration:
```bash
python test_fingpt_integration.py
```
## Deployment Notes
1. **Hugging Face Spaces**: Ensure `HF_TOKEN` is set in secrets
2. **GPU Requirements**: Requires a GPU with at least 8GB of VRAM
3. **Model Loading**: First run may take longer due to model download
4. **Fallback**: App will use mock responses if model fails to load
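The fallback behavior in point 4 can be sketched as a thin wrapper around the loading step. The helper name `safe_load` and the mock response text are illustrative assumptions, not the app's actual implementation:

```python
def safe_load(load_fn):
    """Try to load the real model; fall back to a mock responder on failure.

    Returns (result, is_mock): the loader's result on success, or a
    stand-in generate function when loading raises.
    """
    try:
        return load_fn(), False
    except Exception as exc:
        print(f"Model load failed ({exc}); using mock responses.")
        def mock_generate(prompt: str) -> str:
            # Placeholder output so the UI keeps working without the model.
            return "[mock forecast - model unavailable]"
        return mock_generate, True
```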
## Troubleshooting
### Common Issues:
1. **Out of Memory**: Reduce `MAX_LENGTH` or use CPU inference
2. **Model Loading Fails**: Check `HF_TOKEN` and internet connection
3. **Slow Inference**: Consider using smaller model or CPU inference
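The CPU-fallback advice in points 1 and 3 can be expressed as a small device check before loading. This is a sketch under the assumption that CPU inference should disable 4-bit quantization and use float32; the helper name is hypothetical:

```python
import torch

def pick_device_config() -> dict:
    """Choose loading settings based on available hardware.

    With a CUDA GPU: 4-bit quantization and float16.
    Without one: plain CPU loading (bitsandbytes 4-bit needs a GPU).
    """
    if torch.cuda.is_available():
        return {"device_map": "auto", "load_in_4bit": True, "torch_dtype": torch.float16}
    return {"device_map": "cpu", "load_in_4bit": False, "torch_dtype": torch.float32}
```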
### Debug Mode:
Set `debug=True` in the app's launch call to see detailed error messages.
## Migration Benefits
1. **Cost Effective**: No API costs for inference
2. **Privacy**: Data stays local, no external API calls
3. **Reliability**: No rate limits or API downtime
4. **Customization**: Can fine-tune for specific financial tasks
5. **Transparency**: Full control over model behavior
## Next Steps
1. Test the application with real financial data
2. Fine-tune the model if needed for specific use cases
3. Monitor performance and adjust parameters
4. Consider implementing model caching for faster startup