| # 🚀 MLOps Guide - Investment Assistant | |
| ## 📋 Table of Contents | |
| 1. [Introduction](#introduction) | |
| 2. [MLOps Architecture](#mlops-architecture) | |
| 3. [Configuration](#configuration) | |
| 4. [Data Pipeline (DVC)](#data-pipeline-dvc) | |
| 5. [Model Tracking (MLflow)](#model-tracking-mlflow) | |
| 6. [Monitoring and Drift Detection](#monitoring-and-drift-detection) | |
| 7. [Agents with RAG](#agents-with-rag) | |
| 8. [Kubernetes Deployment](#kubernetes-deployment) | |
| 9. [Model Evaluation](#model-evaluation) | |
| ## 📖 Introduction | |
| This project implements an end-to-end MLOps setup for the Investment Assistant, including: | |
| - ✅ **DVC**: Data and model versioning | |
| - ✅ **MLflow**: Experiment and model tracking | |
| - ✅ **RAG System**: Agents with Retrieval-Augmented Generation | |
| - ✅ **Drift Detection**: Detection of shifts in the data distribution | |
| - ✅ **Kubernetes**: Scalable, resilient deployment | |
| - ✅ **Monitoring**: Real-time metrics and alerts | |
| ## 🏗️ MLOps Architecture | |
| ``` | |
| ┌─────────────────────────────────────────────────────────────┐ | |
| │                    Investment Assistant                     │ | |
| ├─────────────────────────────────────────────────────────────┤ | |
| │                                                             │ | |
| │  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │ | |
| │  │   FastAPI    │    │  Streamlit   │    │    Agents    │   │ | |
| │  │   Backend    │◄───┤   Frontend   │    │   with RAG   │   │ | |
| │  └──────┬───────┘    └──────────────┘    └──────┬───────┘   │ | |
| │         │                                       │           │ | |
| │  ┌──────▼───────────────────────────────────────▼───────┐   │ | |
| │  │             Monitoring & Drift Detection             │   │ | |
| │  └──────┬───────────────────────────────────────────────┘   │ | |
| │         │                                                   │ | |
| │  ┌──────▼───────┐    ┌──────────────┐    ┌──────────────┐   │ | |
| │  │    MLflow    │    │     DVC      │    │  Kubernetes  │   │ | |
| │  │   Tracking   │    │  Versioning  │    │   Cluster    │   │ | |
| │  └──────────────┘    └──────────────┘    └──────────────┘   │ | |
| │                                                             │ | |
| └─────────────────────────────────────────────────────────────┘ | |
| ``` | |
| ## ⚙️ Configuration | |
| ### Prerequisites | |
| ```bash | |
| # Install dependencies | |
| pip install -r requirements.txt | |
| # Install DVC | |
| pip install dvc dvc-s3 | |
| # Install MLflow | |
| pip install mlflow | |
| ``` | |
| ### Environment Variables | |
| Create a `.env` file with: | |
| ```env | |
| # OpenAI | |
| OPENAI_API_KEY=your-openai-key | |
| OPENAI_MODEL=gpt-4 | |
| # Azure | |
| AZURE_TEXT_ANALYTICS_ENDPOINT=https://... | |
| AZURE_TEXT_ANALYTICS_KEY=your-key | |
| # APIs | |
| ALPHA_VANTAGE_API_KEY=your-key | |
| # MLOps | |
| MLFLOW_TRACKING_URI=http://localhost:5000 | |
| ``` | |
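One way the application might read these variables at startup is sketched below. The `Settings` dataclass and the `load_settings` function are hypothetical names, not part of the project; only the variable names come from the `.env` example above, and the defaults simply mirror it. The sketch assumes the variables are already present in the process environment (for example, exported by the shell or by a dotenv loader).

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for some of the variables in the .env example."""
    openai_api_key: str
    openai_model: str
    mlflow_tracking_uri: str


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ).

    Fails fast if the API key is missing; the other defaults mirror the
    values shown in the .env example.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return Settings(
        openai_api_key=api_key,
        openai_model=env.get("OPENAI_MODEL", "gpt-4"),
        mlflow_tracking_uri=env.get("MLFLOW_TRACKING_URI", "http://localhost:5000"),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.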
| ## 📊 Data Pipeline (DVC) | |
| ### Initialize DVC | |
| ```bash | |
| # Initialize the DVC repository | |
| dvc init | |
| # Configure a remote (optional) | |
| dvc remote add -d local ./dvc-cache | |
| # Or use S3 | |
| dvc remote add -d s3 s3://your-bucket/dvc-cache | |
| ``` | |
| ### Run the Pipeline | |
| ```bash | |
| # Run the full pipeline | |
| dvc repro | |
| # Run a specific stage | |
| dvc repro prepare_data | |
| # Show the pipeline graph | |
| dvc dag | |
| # Show metrics | |
| dvc metrics show | |
| ``` | |
| ### Data Layout | |
| ``` | |
| data/ | |
| ├── raw/                  # Raw data | |
| └── processed/            # Processed data (DVC tracked) | |
|     ├── market_data.parquet | |
|     └── indicators.parquet | |
| models/                   # Models (DVC tracked) | |
| ├── top_strategy_model.pkl | |
| └── bottom_strategy_model.pkl | |
| metrics/                  # Metrics (DVC tracked) | |
| ├── model_metrics.json | |
| └── evaluation_metrics.json | |
| ``` | |
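`dvc metrics show` reads plain metrics files (JSON, YAML, or TOML) that are declared as metrics in the pipeline. The following is a minimal sketch of how a training or evaluation script might write `metrics/model_metrics.json`; the `write_metrics` helper and the metric names in the usage line are illustrative assumptions, not the project's actual code.

```python
import json
from pathlib import Path


def write_metrics(metrics, path="metrics/model_metrics.json"):
    """Write a dict of numeric metrics as JSON so DVC can show and diff it."""
    out = Path(path)
    # Create the metrics/ directory on first use
    out.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys keeps the file stable across runs, which keeps diffs readable
    out.write_text(json.dumps(metrics, indent=2, sort_keys=True))
    return out
```

A call such as `write_metrics({"accuracy": acc, "f1": f1})` at the end of a stage would then make those values visible to `dvc metrics show` and `dvc metrics diff`.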
| ## 🎯 Model Tracking (MLflow) | |
| ### Start the MLflow Server | |
| ```bash | |
| # MLflow server | |
| mlflow server --host 0.0.0.0 --port 5000 | |
| # Or use Docker (pass the server command explicitly) | |
| docker run -p 5000:5000 ghcr.io/mlflow/mlflow:v2.11.1 mlflow server --host 0.0.0.0 --port 5000 | |
| ``` | |
| ### Train Models | |
| ```bash | |
| # Train models (logged to MLflow automatically) | |
| python scripts/train_model.py | |
| ``` | |
| ### View Experiments | |
| Open in a browser: `http://localhost:5000` | |
| ### Log the Model | |
| ```python | |
| import mlflow | |
| import mlflow.sklearn | |
| # Already included in train_model.py | |
| mlflow.sklearn.log_model(model, "strategy_model") | |
| ``` | |
| ### Load a Model for Production | |
| ```python | |
| import mlflow.pyfunc | |
| # Load the model logged by a specific run (run_id is the MLflow run ID) | |
| model = mlflow.pyfunc.load_model( | |
|     model_uri=f"runs:/{run_id}/strategy_model" | |
| ) | |
| ``` | |
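MLflow accepts more than one URI scheme in `load_model`: `runs:/<run_id>/<artifact_path>` points at the artifact logged by one run, while `models:/<name>/<version>` points at a version in the Model Registry. The helpers below are a hypothetical convenience for building both forms. Using the second form assumes the model has been registered, which this guide does not show.

```python
def run_model_uri(run_id, artifact_path="strategy_model"):
    """URI for a model artifact logged under a specific run."""
    return f"runs:/{run_id}/{artifact_path}"


def registry_model_uri(name, version):
    """URI for a numbered version in the MLflow Model Registry."""
    return f"models:/{name}/{version}"
```

Either string can be passed as `model_uri` to `mlflow.pyfunc.load_model`.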
| ## 🔍 Monitoring and Drift Detection | |
| ### Initialize Monitoring | |
| ```python | |
| import pandas as pd | |
| from src.monitoring.monitoring_service import MonitoringService | |
| # Create the monitoring service | |
| monitoring = MonitoringService(drift_threshold=0.15) | |
| # Set the reference baseline | |
| reference_data = pd.read_parquet("data/processed/indicators.parquet") | |
| monitoring.initialize_reference_baseline(reference_data) | |
| ``` | |
| ### Detect Drift | |
| ```python | |
| # Fetch current data | |
| current_data = fetch_current_market_data() | |
| # Detect drift | |
| drift_result = monitoring.drift_detector.detect_drift(current_data) | |
| if drift_result["drift_detected"]: | |
|     print("⚠️ Drift detected! Retraining the model is recommended.") | |
| ``` | |
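The internals of `drift_detector` are not shown in this guide. To illustrate what a threshold such as `drift_threshold=0.15` can be compared against, here is a minimal sketch of the Population Stability Index (PSI), one common drift score for a single numeric feature. Treating PSI as the detector's metric is an assumption, and `psi` is a hypothetical helper.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges are equal-width over the reference range; current values
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps the logarithm finite when a bin is empty
        return [max(c / len(values), eps) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

Under this assumption, a feature would be flagged when `psi(reference_values, current_values)` exceeds the configured threshold.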
| ### Get the Health Report | |
| ```python | |
| # System health report | |
| health = monitoring.get_health_report() | |
| print(f"Status: {health['status']}") | |
| print(f"Predictions 24h: {health['metrics']['predictions_last_24h']}") | |
| print(f"Drift alerts: {health['metrics']['drift_alerts_last_7d']}") | |
| ``` | |
| ## 🤖 Agents with RAG | |
| ### Initialize the Agent | |
| ```python | |
| import os | |
| from openai import OpenAI | |
| from src.agents.investment_agent import InvestmentAgent | |
| from src.agents.rag_system import RAGSystem | |
| # Initialize the OpenAI client | |
| openai_client = OpenAI(api_key=os.getenv("OPENAI_API_KEY")) | |
| # Create the RAG system | |
| rag = RAGSystem(openai_client) | |
| # Create the agent | |
| agent = InvestmentAgent(openai_client, rag) | |
| ``` | |
| ### Use the Agent | |
| ```python | |
| # Chat with the agent (uses RAG automatically) | |
| response = agent.chat( | |
|     "Should I invest in Bitcoin now?", | |
|     context={"market_data": current_market_data} | |
| ) | |
| print(response) | |
| # Strategy analysis with the agent | |
| analysis = agent.analyze_strategy( | |
|     symbol="SPY", | |
|     strategy_type="TOP", | |
|     market_data=market_data, | |
|     news_sentiment=sentiment | |
| ) | |
| ``` | |
| ### Add Knowledge to the RAG | |
| ```python | |
| # Add documents to the knowledge base | |
| documents = [ | |
|     { | |
|         "content": "New investment strategy...", | |
|         "metadata": {"source": "research", "type": "strategy"} | |
|     } | |
| ] | |
| rag.add_documents(documents) | |
| ``` | |
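How `RAGSystem` stores and retrieves documents is not shown here. The toy class below sketches the retrieval step in general terms, using bag-of-words cosine similarity in place of real embeddings. `ToyRetriever` and its methods are hypothetical; a production system would typically use an embedding model and a vector store instead.

```python
import math
import re
from collections import Counter


def _vectorize(text):
    """Lowercased bag-of-words term counts (a stand-in for an embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyRetriever:
    """In-memory store that accepts the document shape used above."""

    def __init__(self):
        self._docs = []

    def add_documents(self, documents):
        for doc in documents:
            self._docs.append((doc, _vectorize(doc["content"])))

    def search(self, query, k=3):
        """Return up to k documents, most similar to the query first."""
        q = _vectorize(query)
        ranked = sorted(self._docs, key=lambda d: _cosine(q, d[1]), reverse=True)
        return [doc for doc, _ in ranked[:k]]
```

In a RAG flow, the `content` of the returned documents would be added to the prompt before the model is called.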
| ## ☸️ Kubernetes Deployment | |
| ### Prerequisites | |
| - Kubernetes cluster (local with Minikube, or cloud) | |
| - `kubectl` configured | |
| - Docker images built | |
| ### Build Images | |
| ```bash | |
| # Build the Docker image | |
| docker build -t investment-assistant:latest . | |
| # Or use docker-compose | |
| docker-compose build | |
| ``` | |
| ### Apply Manifests | |
| ```bash | |
| # Create secrets (set the values first) | |
| kubectl apply -f kubernetes/secrets.yaml | |
| # Create PVCs | |
| kubectl apply -f kubernetes/pvc.yaml | |
| # Create services | |
| kubectl apply -f kubernetes/service.yaml | |
| # Create deployments | |
| kubectl apply -f kubernetes/deployment.yaml | |
| # Create the HPA | |
| kubectl apply -f kubernetes/hpa.yaml | |
| ``` | |
| ### Verify the Deployment | |
| ```bash | |
| # List pods | |
| kubectl get pods | |
| # List services | |
| kubectl get services | |
| # Follow logs | |
| kubectl logs -f deployment/investment-assistant-api | |
| # Show the HPA | |
| kubectl get hpa | |
| ``` | |
| ### Scale Manually | |
| ```bash | |
| # Scale the deployment | |
| kubectl scale deployment investment-assistant-api --replicas=5 | |
| ``` | |
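For reference, the Horizontal Pod Autoscaler picks a replica count with the rule `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`, as described in the Kubernetes documentation. The sketch below reproduces that arithmetic. The `min_replicas` and `max_replicas` defaults are illustrative assumptions (the values in `kubernetes/hpa.yaml` are not shown here), and the real controller also applies a tolerance band and stabilization windows that this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count from the basic HPA scaling rule, clamped to the bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(raw, max_replicas))
```

For example, 3 replicas averaging 90% CPU against a 60% target would be scaled to `ceil(3 * 90 / 60) = 5` replicas.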
| ## 📈 Model Evaluation | |
| ### Basic Evaluation | |
| ```bash | |
| # Evaluate trained models | |
| python scripts/evaluate_model.py | |
| ``` | |
| ### Comprehensive Evaluation | |
| ```bash | |
| # Evaluation with drift detection and full metrics | |
| python scripts/evaluate_model_enhanced.py | |
| ``` | |
| ### Run Tests | |
| ```bash | |
| # All tests | |
| pytest | |
| # Unit tests | |
| pytest tests/test_agents.py tests/test_monitoring.py | |
| # Integration tests | |
| pytest tests/test_integration.py | |
| # With coverage | |
| pytest --cov=src --cov-report=html | |
| ``` | |
| ## 📊 Metrics and Reports | |
| ### View DVC Metrics | |
| ```bash | |
| # Show metrics | |
| dvc metrics show | |
| # Compare metrics against the previous commit | |
| dvc metrics diff HEAD~1 | |
| ``` | |
| ### View MLflow Metrics | |
| Open the MLflow UI: `http://localhost:5000` | |
| ### Monitoring Metrics | |
| ```python | |
| # Get recent metrics | |
| recent_metrics = monitoring.metrics_collector.get_recent_metrics(hours=24) | |
| # Performance statistics | |
| stats = monitoring.metrics_collector.calculate_performance_stats(strategy_type="TOP") | |
| ``` | |
| ## 🔧 Best Practices | |
| ### Data Versioning | |
| 1. **Always version data with DVC** before training | |
| 2. **Tag important releases** with Git, since DVC versions follow Git revisions: `git tag v1.0` | |
| 3. **Compare versions**: `dvc diff v1.0 v1.1` | |
| ### Experiment Tracking | |
| 1. **Log all parameters** to MLflow | |
| 2. **Log metrics** during training | |
| 3. **Tag important runs** (production, baseline, etc.) | |
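The three practices above might look like this in a training script. This is a sketch, not the contents of `scripts/train_model.py`: `flatten_params` and `log_training_run` are hypothetical helpers, and the parameter, metric, and tag names passed to them are up to the caller. `mlflow` is imported lazily so the flattening helper can be used without it installed.

```python
def flatten_params(params, prefix=""):
    """Flatten nested config dicts into dotted keys.

    For example, {"model": {"depth": 3}} becomes {"model.depth": 3}.
    """
    flat = {}
    for key, value in params.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_params(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_training_run(params, metrics, tags):
    """Log parameters, metrics, and tags for one run to the configured tracking server."""
    import mlflow  # requires `pip install mlflow`

    with mlflow.start_run():
        mlflow.log_params(flatten_params(params))
        mlflow.log_metrics(metrics)
        mlflow.set_tags(tags)
```

A tag such as `{"stage": "baseline"}` makes it easier to filter for reference runs later in the MLflow UI.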
| ### Continuous Monitoring | |
| 1. **Establish a baseline** from historical data | |
| 2. **Monitor drift** daily | |
| 3. **Alert** when drift exceeds the threshold | |
| ### Deployment | |
| 1. **Pin specific image versions** | |
| 2. **Health checks** on every service | |
| 3. **Appropriate resource limits** | |
| 4. **HPA** for automatic scaling | |
| ## 🆘 Troubleshooting | |
| ### DVC Issues | |
| ```bash | |
| # Show the cache location | |
| dvc cache dir | |
| # Clean the cache (removes cached data not used by the current workspace) | |
| dvc gc --workspace | |
| # Force reproduction | |
| dvc repro --force | |
| ``` | |
| ### MLflow Issues | |
| ```bash | |
| # Check the server | |
| curl http://localhost:5000/health | |
| # Migrate the backend database schema (careful: back up the database first) | |
| mlflow db upgrade <database-uri> | |
| ``` | |
| ### Kubernetes Issues | |
| ```bash | |
| # List events | |
| kubectl get events --sort-by=.metadata.creationTimestamp | |
| # Describe a pod | |
| kubectl describe pod <pod-name> | |
| # Show logs from the previous container instance | |
| kubectl logs <pod-name> --previous | |
| ``` | |
| ## 📚 Additional Resources | |
| - [DVC Documentation](https://dvc.org/doc) | |
| - [MLflow Documentation](https://mlflow.org/docs/latest/index.html) | |
| - [Kubernetes Documentation](https://kubernetes.io/docs/) | |
| - [LangChain RAG](https://python.langchain.com/docs/use_cases/question_answering/) | |
| --- | |
| **Version**: 1.0 | |
| **Last updated**: {{ fecha }} | |