	# 🚀 MLOps Guide - Investment Assistant

	## 📋 Table of Contents

	1. [Introduction](#introduction)
	2. [MLOps Architecture](#mlops-architecture)
	3. [Configuration](#configuration)
	4. [Data Pipeline (DVC)](#data-pipeline-dvc)
	5. [Model Tracking (MLflow)](#model-tracking-mlflow)
	6. [Monitoring and Drift Detection](#monitoring-and-drift-detection)
	7. [Agents with RAG](#agents-with-rag)
	8. [Kubernetes Deployment](#kubernetes-deployment)
	9. [Model Evaluation](#model-evaluation)

	## 📖 Introduction

	This project implements an end-to-end MLOps setup for the Investment Assistant. It includes:

	- ✅ DVC: data and model versioning
	- ✅ MLflow: experiment and model tracking
	- ✅ RAG System: agents with Retrieval Augmented Generation
	- ✅ Drift Detection: detection of shifts in the data distribution
	- ✅ Kubernetes: scalable, resilient deployment
	- ✅ Monitoring: real-time metrics and alerts

	## 🏗️ MLOps Architecture

	```
	┌─────────────────────────────────────────────────────────────┐
	│                    Investment Assistant                     │
	├─────────────────────────────────────────────────────────────┤
	│                                                             │
	│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
	│  │   FastAPI    │    │  Streamlit   │    │    Agents    │   │
	│  │   Backend    │◄───┤   Frontend   │    │   with RAG   │   │
	│  └──────┬───────┘    └──────────────┘    └──────┬───────┘   │
	│         │                                       │           │
	│  ┌──────▼───────────────────────────────────────▼───────┐   │
	│  │             Monitoring & Drift Detection             │   │
	│  └──────┬───────────────────────────────────────────────┘   │
	│         │                                                   │
	│  ┌──────▼───────┐    ┌──────────────┐    ┌──────────────┐   │
	│  │    MLflow    │    │     DVC      │    │  Kubernetes  │   │
	│  │   Tracking   │    │  Versioning  │    │   Cluster    │   │
	│  └──────────────┘    └──────────────┘    └──────────────┘   │
	│                                                             │
	└─────────────────────────────────────────────────────────────┘
	```

	## ⚙️ Configuration

	### Prerequisites

	```bash
	# Install the project dependencies
	pip install -r requirements.txt

	# Install DVC with S3 remote support
	pip install dvc dvc-s3

	# Install MLflow
	pip install mlflow
	```

	### Environment Variables

	Create a `.env` file with:

	```env
	# OpenAI
	OPENAI_API_KEY=your-openai-key
	OPENAI_MODEL=gpt-4

	# Azure
	AZURE_TEXT_ANALYTICS_ENDPOINT=https://...
	AZURE_TEXT_ANALYTICS_KEY=your-key

	# APIs
	ALPHA_VANTAGE_API_KEY=your-key

	# MLOps
	MLFLOW_TRACKING_URI=http://localhost:5000
	```
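One way to fail fast on a missing variable is to read these settings once at startup. The sketch below is an assumption about how the app might consume the file, not code from the repository: `Settings` and `load_settings` are hypothetical names, the defaults simply mirror the example values above, and the variables are expected to already be in the process environment (exported by the shell or loaded by a tool such as python-dotenv).

```python
import os
from dataclasses import dataclass

# Variables from the .env example that have no sensible default.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "AZURE_TEXT_ANALYTICS_ENDPOINT",
    "AZURE_TEXT_ANALYTICS_KEY",
    "ALPHA_VANTAGE_API_KEY",
)


@dataclass(frozen=True)
class Settings:  # hypothetical container, not part of the project
    openai_api_key: str
    openai_model: str
    azure_endpoint: str
    azure_key: str
    alpha_vantage_key: str
    mlflow_tracking_uri: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],
        openai_model=env.get("OPENAI_MODEL", "gpt-4"),  # assumed default
        azure_endpoint=env["AZURE_TEXT_ANALYTICS_ENDPOINT"],
        azure_key=env["AZURE_TEXT_ANALYTICS_KEY"],
        alpha_vantage_key=env["ALPHA_VANTAGE_API_KEY"],
        mlflow_tracking_uri=env.get("MLFLOW_TRACKING_URI", "http://localhost:5000"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests without touching the real environment.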

	## 📊 Data Pipeline (DVC)

	### Initialize DVC

	```bash
	# Initialize a DVC repository
	dvc init

	# Configure a default remote (optional): a local directory
	dvc remote add -d local ./dvc-cache

	# Or use an S3 bucket as the default remote instead
	dvc remote add -d s3 s3://your-bucket/dvc-cache
	```

	### Run the Pipeline

	```bash
	# Run the full pipeline
	dvc repro

	# Run a single stage (and any stages it depends on)
	dvc repro prepare_data

	# Show the pipeline graph
	dvc dag

	# Show metrics
	dvc metrics show
	```

	### Data Layout

	```
	data/
	├── raw/                          # Raw data
	└── processed/                    # Processed data (DVC tracked)
	    ├── market_data.parquet
	    └── indicators.parquet
	models/                           # Models (DVC tracked)
	├── top_strategy_model.pkl
	└── bottom_strategy_model.pkl
	metrics/                          # Metrics (DVC tracked)
	├── model_metrics.json
	└── evaluation_metrics.json
	```

	## 🎯 Model Tracking (MLflow)

	### Start the MLflow Server

	```bash
	# Run the MLflow server locally
	mlflow server --host 0.0.0.0 --port 5000

	# Or run it in Docker (the server command is passed to the image explicitly)
	docker run -p 5000:5000 ghcr.io/mlflow/mlflow:v2.11.1 mlflow server --host 0.0.0.0 --port 5000
	```

	### Train Models

	```bash
	# Train the models (runs are logged to MLflow automatically)
	python scripts/train_model.py
	```

	### View Experiments

	Open `http://localhost:5000` in a browser.

	### Log a Model

	```python
	import mlflow
	import mlflow.sklearn

	# train_model.py already does this; `model` is the fitted scikit-learn estimator
	mlflow.sklearn.log_model(model, "strategy_model")
	```

	### Load a Model for Production

	```python
	import mlflow.pyfunc

	# Load the model logged by a specific run;
	# `run_id` is the ID of that training run (shown in the MLflow UI)
	model = mlflow.pyfunc.load_model(
	    model_uri=f"runs:/{run_id}/strategy_model"
	)
	```
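The `runs:/` URI above addresses a model by the run that logged it. MLflow also accepts `models:/<name>/<version>` URIs for models stored in the Model Registry. The helpers below are a small illustrative sketch of the two forms; the function names are hypothetical, and the registered name `strategy_model` is an assumption, since this guide does not show a registration step.

```python
def run_model_uri(run_id: str, artifact_path: str = "strategy_model") -> str:
    """URI of a model logged under `artifact_path` by a given run."""
    return f"runs:/{run_id}/{artifact_path}"


def registry_model_uri(name: str, version: int) -> str:
    """URI of a specific version of a registered model."""
    return f"models:/{name}/{version}"


# "abc123" is a made-up run ID used only for illustration.
print(run_model_uri("abc123"))
print(registry_model_uri("strategy_model", 2))
```

Either string can be passed as `model_uri` to `mlflow.pyfunc.load_model`, provided the run or registered version actually exists on the tracking server.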

	## 🔍 Monitoring and Drift Detection

	### Initialize Monitoring

	```python
	import pandas as pd

	from src.monitoring.monitoring_service import MonitoringService

	# Create the monitoring service
	monitoring = MonitoringService(drift_threshold=0.15)

	# Set the reference baseline
	reference_data = pd.read_parquet("data/processed/indicators.parquet")
	monitoring.initialize_reference_baseline(reference_data)
	```

	### Detect Drift

	```python
	# Get the current data; fetch_current_market_data() is a placeholder
	# for the project's own data-loading function
	current_data = fetch_current_market_data()

	# Check the current data against the baseline
	drift_result = monitoring.drift_detector.detect_drift(current_data)

	if drift_result["drift_detected"]:
	    print("⚠️ Drift detected! Retraining the model is recommended.")
	```
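This guide does not state which statistic `detect_drift` compares against `drift_threshold`. One common choice is the Population Stability Index (PSI), so the sketch below computes PSI for a single numeric feature in pure Python. The `psi` helper is hypothetical, and treating 0.15 as a PSI cutoff is an assumption made for illustration, not a description of the project's detector.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so out-of-range current values fall into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        total = len(values)
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / total, eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [i / 100 for i in range(1000)]  # evenly spread over [0, 10)
shifted = [v + 3.0 for v in reference]      # same shape, shifted right

DRIFT_THRESHOLD = 0.15  # assumed to play the same role as drift_threshold above
print(psi(reference, reference) > DRIFT_THRESHOLD)  # False
print(psi(reference, shifted) > DRIFT_THRESHOLD)    # True
```

The bins are derived from the reference sample only, so the baseline defines what "normal" looks like and the current data is measured against it.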

	### Get the Health Report

	```python
	# System health report
	health = monitoring.get_health_report()

	print(f"Status: {health['status']}")
	print(f"Predictions 24h: {health['metrics']['predictions_last_24h']}")
	print(f"Drift alerts: {health['metrics']['drift_alerts_last_7d']}")
	```

	## 🤖 Agents with RAG

	### Initialize the Agent

	```python
	import os

	from openai import OpenAI

	from src.agents.investment_agent import InvestmentAgent
	from src.agents.rag_system import RAGSystem

	# Initialize the OpenAI client
	openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

	# Create the RAG system
	rag = RAGSystem(openai_client)

	# Create the agent
	agent = InvestmentAgent(openai_client, rag)
	```

	### Use the Agent

	```python
	# Chat with the agent (RAG is used automatically)
	response = agent.chat(
	    "Should I invest in Bitcoin now?",
	    context={"market_data": current_market_data}
	)

	print(response)

	# Strategy analysis with the agent
	analysis = agent.analyze_strategy(
	    symbol="SPY",
	    strategy_type="TOP",
	    market_data=market_data,
	    news_sentiment=sentiment
	)
	```

	### Add Knowledge to the RAG System

	```python
	# Add documents to the knowledge base
	documents = [
	    {
	        "content": "New investment strategy...",
	        "metadata": {"source": "research", "type": "strategy"}
	    }
	]

	rag.add_documents(documents)
	```

	## ☸️ Kubernetes Deployment

	### Prerequisites

	- A Kubernetes cluster (local with Minikube, or in the cloud)
	- `kubectl` configured for that cluster
	- Docker images built

	### Build Images

	```bash
	# Build the Docker image
	docker build -t investment-assistant:latest .

	# Or build with docker-compose
	docker-compose build
	```

	### Apply the Manifests

	```bash
	# Create secrets (edit the values first)
	kubectl apply -f kubernetes/secrets.yaml

	# Create PVCs
	kubectl apply -f kubernetes/pvc.yaml

	# Create services
	kubectl apply -f kubernetes/service.yaml

	# Create deployments
	kubectl apply -f kubernetes/deployment.yaml

	# Create the HPA
	kubectl apply -f kubernetes/hpa.yaml
	```

	### Verify the Deployment

	```bash
	# List pods
	kubectl get pods

	# List services
	kubectl get services

	# Follow the API logs
	kubectl logs -f deployment/investment-assistant-api

	# Show HPA status
	kubectl get hpa
	```
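When reading the `kubectl get hpa` output, it can help to know how the autoscaler arrives at a replica count. The Kubernetes documentation gives the core rule as `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, with no change when the ratio is within a tolerance of 1 (0.1 by default). The sketch below applies that rule in simplified form; the function name and the `min_replicas`/`max_replicas` defaults are assumptions, since the contents of `kubernetes/hpa.yaml` are not shown here, and the real controller also applies stabilization windows and other safeguards.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Replica count suggested by the documented HPA scaling formula."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to the target: no change
    wanted = math.ceil(current_replicas * ratio)
    # Keep the result inside the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


# 3 pods averaging 90% CPU against a 60% target: scale out
print(desired_replicas(3, current_metric=90, target_metric=60))  # 5
# 4 pods averaging 20% CPU against a 60% target: scale in
print(desired_replicas(4, current_metric=20, target_metric=60))  # 2
```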

	### Scale Manually

	```bash
	# Scale the deployment
	kubectl scale deployment investment-assistant-api --replicas=5
	```

	## 📈 Model Evaluation

	### Basic Evaluation

	```bash
	# Evaluate the trained models
	python scripts/evaluate_model.py
	```

	### Comprehensive Evaluation

	```bash
	# Evaluation with drift detection and the full set of metrics
	python scripts/evaluate_model_enhanced.py
	```

	### Run Tests

	```bash
	# All tests
	pytest

	# Unit tests
	pytest tests/test_agents.py tests/test_monitoring.py

	# Integration tests
	pytest tests/test_integration.py

	# With coverage
	pytest --cov=src --cov-report=html
	```

	## 📊 Metrics and Reports

	### View DVC Metrics

	```bash
	# Show metrics
	dvc metrics show

	# Compare metrics with the previous commit
	dvc metrics diff HEAD~1
	```
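Conceptually, `dvc metrics diff` reports how each metric changed between two revisions. The sketch below reproduces that comparison for two metrics dictionaries that have already been loaded. It illustrates the idea only and is not how DVC implements it; `metrics_diff` is a hypothetical helper and the metric names and values are made up.

```python
def metrics_diff(old, new):
    """Map each numeric metric present in both dicts to (old, new, change)."""
    diff = {}
    for name in sorted(old.keys() & new.keys()):
        a, b = old[name], new[name]
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            diff[name] = (a, b, b - a)
    return diff


# Illustrative values only; real ones would come from metrics/model_metrics.json
previous = {"accuracy": 0.80, "f1": 0.75}
current = {"accuracy": 0.85, "f1": 0.70}

for name, (a, b, change) in metrics_diff(previous, current).items():
    print(f"{name}: {a:.2f} -> {b:.2f} ({change:+.2f})")
```

A check like this could gate a deployment, for example by refusing to promote a model whose key metric dropped compared with the previous commit.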

	### View MLflow Metrics

	Open the MLflow UI at `http://localhost:5000`.

	### Monitoring Metrics

	```python
	# Get recent metrics
	recent_metrics = monitoring.metrics_collector.get_recent_metrics(hours=24)

	# Performance statistics
	stats = monitoring.metrics_collector.calculate_performance_stats(strategy_type="TOP")
	```

	## 🔧 Best Practices

	### Data Versioning

	1. Always version data with DVC before training
	2. Tag important releases with Git, since DVC versions follow Git commits: `git tag v1.0`
	3. Compare versions: `dvc diff v1.0 v1.1`

	### Experiment Tracking

	1. Log every parameter to MLflow
	2. Log metrics during training
	3. Tag important runs (production, baseline, etc.)

	### Continuous Monitoring

	1. Establish a baseline from historical data
	2. Check for drift daily
	3. Alert when drift exceeds the threshold

	### Deployment

	1. Pin specific image versions
	2. Add health checks to every service
	3. Set appropriate resource limits
	4. Use an HPA for automatic scaling

	## 🆘 Troubleshooting

	### DVC Issues

	```bash
	# Show the cache directory
	dvc cache dir

	# Remove cached files not referenced by the current workspace
	dvc gc --workspace

	# Force the pipeline to rerun
	dvc repro --force
	```

	### MLflow Issues

	```bash
	# Check that the server is up
	curl http://localhost:5000/health

	# Upgrade the tracking database schema (careful: back up the database first)
	mlflow db upgrade <database-uri>
	```

	### Kubernetes Issues

	```bash
	# List events, oldest first
	kubectl get events --sort-by=.metadata.creationTimestamp

	# Describe a pod
	kubectl describe pod <pod-name>

	# Show logs from the previous container instance
	kubectl logs <pod-name> --previous
	```

	## 📚 Additional Resources

	- [DVC Documentation](https://dvc.org/doc)
	- [MLflow Documentation](https://mlflow.org/docs/latest/index.html)
	- [Kubernetes Documentation](https://kubernetes.io/docs/)
	- [LangChain RAG](https://python.langchain.com/docs/use_cases/question_answering/)

	---

	Version: 1.0
	Last updated: {{ fecha }}