# Service Integration Guide: Cognitive Theory & SHAP ## Quick Start ### 1. Install Dependencies ```bash cd ML pip install -r requirements.txt # For real SHAP support (optional but recommended): pip install shap ``` ### 2. Start ML Service ```bash cd ML python main.py ``` Service will be available at `http://localhost:8000` ### 3. Test Endpoints ```bash # Health check curl http://localhost:8000/health # Personality analysis curl -X POST http://localhost:8000/ml/personality/analyze \ -H "Content-Type: application/json" \ -d '{ "openness": 75, "conscientiousness": 80, "extraversion": 60, "agreeableness": 70, "neuroticism": 40 }' # Task prediction with SHAP explanation curl -X POST http://localhost:8000/ml/predict/explain \ -H "Content-Type: application/json" \ -d '{ "task": { "title": "Complete project report", "category": "WORK", "priority": "HIGH", "estimated_duration": 120, "complexity": 4, "due_date": "2026-03-10T17:00:00Z", "personality": { "openness": 75, "conscientiousness": 80, "extraversion": 60, "agreeableness": 70, "neuroticism": 40 } } }' ``` ## API Endpoints ### Personality Analysis **POST /ml/personality/analyze** Request: ```json { "openness": 75, "conscientiousness": 80, "extraversion": 60, "agreeableness": 70, "neuroticism": 40 } ``` Response: ```json { "personality_type": "ENTJ - The Commander", "type_code": "ENTJ", "strengths": ["Strong organization", "Reliable delivery", "Excellent communication"], "weaknesses": ["May face team collaboration challenges"], "work_style": "Thrives in collaborative environments | Benefits from detailed planning", "recommendations": "Use detailed task breakdowns to leverage your planning strength", "cognitive_style": { "primary_style": "Systematic", "scores": { "analytical": 67.5, "creative": 75.0, "systematic": 80.0, "social": 65.0 } }, "traits_analysis": { "conscientiousness": { "value": 80, "level": "high", "description": "Highly organized and disciplined...", "percentile": 84 } } } ``` ### Task Prediction with SHAP Explanation **POST /ml/predict/explain** Request: ```json { "task": { "title": "Complete project report", "description": "Write comprehensive analysis report", "category": "WORK", "priority": "HIGH", "estimated_duration": 120, "complexity": 4, "due_date": "2026-03-10T17:00:00Z", "personality": { "openness": 75, "conscientiousness": 80, "extraversion": 60, "agreeableness": 70, "neuroticism": 40 }, "historical_completion_rate": 0.78 } } ``` Response: ```json { "prediction_summary": { "completion_probability": 0.75, "stress_level": 6.5, "difficulty": "MODERATE", "outcome_assessment": "Likely to succeed with some attention" }, "feature_attribution": { "base_value": 0.5, "prediction": 0.75, "shap_values": { "completion_rate": 0.08, "trait_conscientiousness": 0.12, "time_pressure": -0.05, "complexity_normalized": -0.03, "pri_attention_demand": 0.06 }, "method": "tree_shap", "feature_ranking": [ { "feature": "trait_conscientiousness", "impact": 0.12, "direction": "positive", "plain_english": "Your high conscientiousness means you tend to be disciplined and organized, which helps task completion." }, { "feature": "completion_rate", "impact": 0.08, "direction": "positive", "plain_english": "You've completed 78% of past tasks on time - this strong track record boosts your predicted success." } ], "top_positive_features": [ { "feature": "Conscientiousness", "impact": 0.12, "plain_english": "Your high conscientiousness..." } ], "top_negative_features": [ { "feature": "Time Pressure", "impact": 0.05, "plain_english": "The deadline is approaching fast..." } ], "waterfall_data": [ {"name": "Base Probability", "value": 0.5, "cumulative": 0.5, "type": "base"}, {"name": "Conscientiousness", "value": 0.12, "cumulative": 0.62, "type": "positive"}, {"name": "Completion Rate", "value": 0.08, "cumulative": 0.70, "type": "positive"}, {"name": "Time Pressure", "value": -0.05, "cumulative": 0.65, "type": "negative"}, {"name": "Final Prediction", "value": 0.75, "cumulative": 0.75, "type": "total"} ] }, "counterfactual_scenarios": [ { "feature": "complexity_normalized", "current_value": 0.8, "suggested_value": 0.5, "action": "Break task into smaller subtasks", "expected_probability": 0.85, "feasibility": "high" } ], "recommendations": [ { "title": "Break Down Task", "description": "Split into smaller, manageable subtasks", "priority": "high", "risk_addressed": "completion_risk" }, { "title": "Time Block", "description": "Reserve dedicated time slots for this task", "priority": "medium", "risk_addressed": "focus_risk" } ], "confidence_assessment": { "data_quality": "high", "prediction_confidence": 0.85, "explanation_confidence": 0.82 }, "natural_language_summary": "This task has a moderate 75% completion probability. Your Conscientiousness is working in your favor. However, Time Pressure is a concern. Top recommendation: Reserve dedicated time slots for this task." } ``` ### Cognitive Theory Analysis **POST /ml/cognitive/analyze** Request: ```json { "task": { "title": "Complete project report", "category": "WORK", "priority": "HIGH", "estimated_duration": 120, "complexity": 4 }, "personality": { "openness": 75, "conscientiousness": 80, "extraversion": 60, "agreeableness": 70, "neuroticism": 40 }, "context": { "active_tasks_count": 5, "time_pressure": 0.3, "high_interruption_risk": false }, "historical_performance": { "completion_rate": 0.78, "on_time_rate": 0.82 } } ``` Response: ```json { "success_probability": 0.72, "cognitive_load_analysis": { "intrinsic_load": 0.68, "extraneous_load": 0.25, "germane_load": 0.35, "total_load": 0.52, "overload_risk": false, "working_memory_utilization": 52.0, "recommendations": [ "Minimize distractions and interruptions", "Schedule during low-interruption periods" ] }, "personality_task_fit": { "overall_fit": 0.15, "component_fits": { "conscientiousness_fit": 0.216, "stress_vulnerability": -0.09 }, "fit_level": "excellent", "recommendations": [ "Use external structure: timers, checklists" ] }, "motivation_analysis": { "intrinsic_motivation": 0.65, "motivation_type": "intrinsic", "needs_satisfaction": { "autonomy": 0.7, "competence": 0.75, "relatedness": 0.5 }, "recommendations": [ "Enhance relatedness: Find accountability partner" ] }, "flow_state_analysis": { "flow_potential": 0.85, "challenge_level": 0.72, "skill_level": 0.83, "challenge_skill_ratio": 0.87, "zone": "flow", "recommendations": [ "Optimal conditions for flow state!", "Minimize interruptions to maintain flow" ] }, "integrated_recommendations": [ "Minimize distractions and interruptions", "Schedule during low-interruption periods", "Optimal conditions for flow state!", "Minimize interruptions to maintain flow" ], "risk_factors": [] } ``` ## Backend Integration ### Java Service Layer **MLClientService.java** - Add cognitive theory endpoint: ```java public CognitiveAnalysisResponse analyzeCognitiveFactors( Task task, PersonalityProfile personality, Map context, Map historicalPerformance) { Map request = new HashMap<>(); request.put("task", buildTaskRequest(task)); request.put("personality", buildPersonalityMap(personality)); request.put("context", context); request.put("historical_performance", historicalPerformance); return webClient .post() .uri("/ml/cognitive/analyze") .bodyValue(request) .retrieve() .bodyToMono(CognitiveAnalysisResponse.class) .block(); } ``` **PredictionService.java** - Enhanced prediction with cognitive theory: ```java public PredictionDetailDTO getPredictionDetail(UUID taskId) { Task task = taskRepository.findById(taskId).orElseThrow(); User user = getCurrentUser(); PersonalityProfile personality = personalityRepository.findByUserId(user.getId()).orElse(null); // Get ML prediction with SHAP MLExplainabilityResponse mlExplanation = mlClientService.explainPrediction(task, personality); // Get cognitive theory analysis Map context = buildContext(user); Map historicalPerf = buildHistoricalPerformance(user); CognitiveAnalysisResponse cognitiveAnalysis = mlClientService.analyzeCognitiveFactors( task, personality, context, historicalPerf ); // Combine into comprehensive response return buildEnhancedPrediction(task, mlExplanation, cognitiveAnalysis); } ``` ### Frontend Integration **api.ts** - Add cognitive analysis endpoint: ```typescript async getCognitiveAnalysis(taskId: string): Promise { return this.backendRequest( `/predictions/task/${taskId}/cognitive` ); } ``` **PredictionExplainer.tsx** - Display SHAP and cognitive insights: ```typescript export function PredictionExplainer({ taskId }: { taskId: string }) { const { data: explanation } = useQuery(['explanation', taskId], () => api.getTaskExplanation(taskId) ); const { data: cognitive } = useQuery(['cognitive', taskId], () => api.getCognitiveAnalysis(taskId) ); return (
{/* SHAP Waterfall Chart */} {/* Feature Contributions */} {/* Cognitive Load Analysis */} {/* Flow State Indicator */} {/* Counterfactual Suggestions */} {/* Integrated Recommendations */}
); } ``` ## Model Training ### Train Ensemble Models ```bash cd ML python train_all_models.py ``` This will: 1. Load synthetic or real training data 2. Train ensemble models (GradientBoosting, RandomForest, XGBoost) 3. Evaluate with cross-validation 4. Save best models to `trained_models/` 5. Generate SHAP feature importance plots ### Collect Feedback for Continuous Learning ```python # In your application, collect ground truth from feedback import feedback_collector feedback_collector.collect_feedback( task_id="task-123", predicted_probability=0.75, actual_completed=True, actual_on_time=True, user_rating=4, features=feature_vector ) # Periodically retrain if feedback_collector.get_feedback_count() >= 1000: from training import train_ensemble_models train_ensemble_models(feedback_data) ``` ## Testing ### Unit Tests ```bash cd ML pytest tests/test_cognitive_theory.py -v pytest tests/test_explainability.py -v ``` ### Integration Tests ```bash cd server mvn test -Dtest=PredictionServiceTest mvn test -Dtest=MLClientServiceTest ``` ### End-to-End Tests ```bash cd client pnpm test:e2e tests/predictions.spec.ts ``` ## Monitoring ### Prometheus Metrics ```bash # View metrics curl http://localhost:8000/metrics ``` Key metrics: - `ml_http_requests_total` - Total requests - `ml_http_request_duration_seconds` - Latency - `ml_prediction_accuracy` - Model accuracy - `ml_shap_computation_time` - SHAP performance ### Logging ```python # ML service logs tail -f ML/logs/ml_service.log # Check SHAP method usage grep "SHAP" ML/logs/ml_service.log | grep "method" ``` ## Troubleshooting ### SHAP Library Not Available If SHAP library is not installed, the system automatically falls back to weighted approximation: ``` INFO: SHAP library not installed; using approximation-based explainability. ``` To enable real SHAP: ```bash pip install shap ``` ### Model Not Found If trained models are not found, the system uses heuristic predictors: ``` WARNING: Trained model not found, using heuristic predictor ``` To train models: ```bash cd ML python train_all_models.py ``` ### Slow SHAP Computation For faster SHAP computation: 1. Use TreeExplainer (faster than KernelExplainer) 2. Enable caching in SHAPExplainer 3. Use batch processing for multiple predictions ### Memory Issues If running out of memory: 1. Reduce batch size in predictions 2. Use model quantization 3. Increase server memory allocation ## Performance Optimization ### Caching ```python # Enable Redis caching for predictions REDIS_URL = "redis://localhost:6379" cache = redis.Redis.from_url(REDIS_URL) @cache_result(ttl=3600) def get_prediction(task_id): return predict_task(task_id) ``` ### Async Processing ```python # Use async for non-blocking predictions @app.post("/ml/predict/async") async def predict_async(request: TaskPredictionRequest, background_tasks: BackgroundTasks): task_id = str(uuid.uuid4()) background_tasks.add_task(process_prediction, task_id, request) return {"task_id": task_id, "status": "processing"} ``` ### Batch Optimization ```python # Process multiple tasks in one request @app.post("/ml/predict/batch") async def predict_batch(request: BatchPredictionRequest): # Extract features for all tasks at once feature_matrix = extract_features_batch(request.tasks) # Predict in batch (much faster) predictions = model.predict_proba(feature_matrix) # Compute SHAP in batch shap_values = explainer.shap_values(feature_matrix) return format_batch_response(predictions, shap_values) ``` ## Security Considerations ### Input Validation All inputs are validated using Pydantic models: ```python class TaskPredictionRequest(BaseModel): title: str = Field(..., min_length=1, max_length=500) complexity: Optional[int] = Field(None, ge=1, le=5) estimated_duration: Optional[int] = Field(None, ge=1, le=1440) ``` ### Rate Limiting ```python from slowapi import Limiter from slowapi.util import get_remote_address limiter = Limiter(key_func=get_remote_address) @app.post("/ml/predict") @limiter.limit("100/minute") async def predict(request: TaskPredictionRequest): ... ``` ### Authentication Integrate with backend JWT authentication: ```python from fastapi.security import HTTPBearer security = HTTPBearer() @app.post("/ml/predict") async def predict(request: TaskPredictionRequest, token: str = Depends(security)): # Verify token with backend user = verify_token(token) ... ``` ## Next Steps 1. **Deploy to Production**: Use Docker Compose or Kubernetes 2. **Monitor Performance**: Set up Grafana dashboards 3. **Collect Feedback**: Implement user feedback collection 4. **Retrain Models**: Schedule periodic retraining 5. **A/B Testing**: Test different model versions 6. **Expand Features**: Add more cognitive theory models ## Support For issues or questions: - Check logs: `ML/logs/ml_service.log` - Review documentation: `COGNITIVE_THEORY_SHAP_IMPLEMENTATION.md` - Run tests: `pytest tests/ -v` - Check metrics: `http://localhost:8000/metrics`