Space: ASI-Engineer/oc_p5


ASI-Engineer committed on Dec 29, 2025

Commit 6f61606 (verified) · 1 Parent(s): 2d2a6e8

Upload folder using huggingface_hub

Files changed (13)

.env.example +5 -0
.huggingface/space.yml +3 -0
README.md +295 -77
README_HF.md +1 -1
api.py +424 -0
app.py +8 -394
db_models.py +19 -0
requirements.txt +27 -122
requirements_full.txt +123 -0
Dockerfile → src/Dockerfile +11 -4
src/config.py +5 -0
src/gradio_ui.py +35 -1
src/schemas.py +13 -17

.env.example CHANGED

@@ -1,6 +1,11 @@
 # Employee Turnover API configuration
 # Copy this file to .env and fill in the values
+# ===== DATABASE =====
+# PostgreSQL connection URL (with credentials)
+# Format: postgresql://username:password@host:port/database
+DATABASE_URL=postgresql://ml_user:your-password-here@localhost:5432/oc_p5_db
 # ===== SECURITY =====
 # API key protecting the /predict endpoint
 # Generate a strong key: python -c "import secrets; print(secrets.token_urlsafe(32))"
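
Reviewer note: the new `DATABASE_URL` entry follows the `postgresql://username:password@host:port/database` format. Below is a minimal sketch of how such a value could be read and sanity-checked at startup with the standard library only. The helper names `parse_database_url` and `load_database_config` are illustrative assumptions, not code from this commit.

```python
import os
from urllib.parse import urlsplit


def parse_database_url(url: str) -> dict:
    """Split a postgresql:// URL into its parts (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme != "postgresql":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }


def load_database_config(env=os.environ) -> dict:
    """Read DATABASE_URL from a mapping (the process environment by default)."""
    return parse_database_url(env["DATABASE_URL"])


# Demo with the placeholder value from .env.example instead of the real environment.
example_env = {
    "DATABASE_URL": "postgresql://ml_user:your-password-here@localhost:5432/oc_p5_db"
}
print(load_database_config(example_env)["database"])
```

Failing fast on a malformed URL at startup may be easier to diagnose than a connection error on the first `/predict` call.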

.huggingface/space.yml ADDED

	@@ -0,0 +1,3 @@

+sdk: docker
+dockerfile: src/Dockerfile
+build_context: .

README.md CHANGED

@@ -1,106 +1,324 @@
----
-title: Employee Turnover Prediction API
-emoji: 👔
-colorFrom: blue
-colorTo: purple
-sdk: docker
-pinned: true
-license: mit
-app_port: 7860
----
-# Employee Turnover Prediction API 🚀 (v3.2.1)
-Employee turnover prediction API (XGBoost + SMOTE) with batch endpoints, strict validation, and up-to-date documentation.
-## 🎯 Features
-- ✅ Turnover prediction (0 = stays, 1 = leaves)
 - 📦 Batch CSV endpoint (3 raw files)
-- 🎛️ Gradio sliders and Pydantic schemas aligned with the actual min/max values
-- 📊 Probabilities and risk level (Low/Medium/High)
-- 🔐 API key authentication (required)
-- 📝 Structured JSON logs
-- 🛡️ Rate limiting (20 req/min)
-- 📚 OpenAPI/Swagger documentation
-## 🔗 Endpoints
-| Endpoint | Description |
-|----------|-------------|
-| `/docs` | Interactive Swagger documentation |
-| `/health` | API status |
-| `/ui` | Interactive Gradio interface |
-| `/predict` | Single prediction (JSON, real constraints) |
-| `/predict/batch` | Batch prediction (3 raw CSV files) |
-## 🚀 Usage
-### Single prediction (all constraints applied)
 ```bash
-curl -X POST https://asi-engineer-oc-p5-dev.hf.space/predict \
   -H "Content-Type: application/json" \
-  -H "X-API-Key: your-key" \
-  -d '{
-    "nombre_participation_pee": 0,
-    "nb_formations_suivies": 2,
-    "nombre_employee_sous_responsabilite": 1,
-    "distance_domicile_travail": 15,
-    "niveau_education": 3,
-    "domaine_etude": "Infra & Cloud",
-    "ayant_enfants": "Y",
-    "frequence_deplacement": "Occasionnel",
-    "annees_depuis_la_derniere_promotion": 2,
-    "annes_sous_responsable_actuel": 5,
-    "satisfaction_employee_environnement": 3,
-    "note_evaluation_precedente": 4,
-    "niveau_hierarchique_poste": 2,
-    "satisfaction_employee_nature_travail": 3,
-    "satisfaction_employee_equipe": 3,
-    "satisfaction_employee_equilibre_pro_perso": 2,
-    "note_evaluation_actuelle": 4,
-    "heure_supplementaires": "Non",
-    "augementation_salaire_precedente": 5.5,
-    "age": 35,
-    "genre": "M",
-    "revenu_mensuel": 4500.0,
-    "statut_marital": "Marié(e)",
-    "departement": "Commercial",
-    "poste": "Manager",
-    "nombre_experiences_precedentes": 3,
-    "nombre_heures_travailless": 80,
-    "annee_experience_totale": 10,
-    "annees_dans_l_entreprise": 5,
-    "annees_dans_le_poste_actuel": 2
-  }'
 ```
-### Batch prediction (3 raw CSV files)
 ```bash
-curl -X POST https://asi-engineer-oc-p5-dev.hf.space/predict/batch \
-  -H "X-API-Key: your-key" \
-  -F "sondage_file=@extrait_sondage.csv" \
-  -F "eval_file=@extrait_eval.csv" \
-  -F "sirh_file=@extrait_sirh.csv"
 ```
-**Response:**
-```json
 {
   "total_employees": 1470,
-  "predictions": [...],
   "summary": {
     "total_stay": 1169,
     "total_leave": 301,
-    "high_risk_count": 222
   }
 }
 ```
-## 📚 Full documentation
-See [docs/API.md](docs/API.md) or the [GitHub Repository](https://github.com/chaton59/OC_P5) for the full documentation and the detailed constraints (min/max, enums, etc.).

+# 🚀 Employee Turnover Prediction API - v3.2.1
+## 📊 Overview
+REST API that predicts employee turnover with an XGBoost model trained with SMOTE.
+**✨ New in v3.2.1**:
+- 🎛️ Gradio sliders and Pydantic schemas aligned with the actual min/max values of the training data
 - 📦 Batch CSV endpoint (3 raw files)
+- 🔑 API key authentication (production)
+- 🔧 Preprocessing fix (scaling, column order)
+- 📝 Updated documentation and examples
+## 🏗️ Architecture
+```
+OC_P5/
+├── app.py                    # FastAPI entry point
+├── src/
+│   ├── auth.py              # API key authentication
+│   ├── config.py            # Centralized configuration
+│   ├── logger.py            # Structured logging (NEW)
+│   ├── models.py            # Model loading from the HF Hub
+│   ├── preprocessing.py     # Preprocessing pipeline
+│   ├── rate_limit.py        # Rate limiting (NEW)
+│   └── schemas.py           # Pydantic validation
+├── tests/                   # pytest suite (33 tests, 88% coverage)
+├── logs/                    # JSON logs (NEW)
+│   ├── api.log              # All logs
+│   └── error.log            # Errors only
+├── docs/                    # Documentation
+├── ml_model/                # Training scripts
+└── data/                    # Source data
+```
+## 🗄️ Database Schema (PostgreSQL)
+UML schema for ML traceability (based on the P5 employee turnover prediction project):
+![Database schema](docs/schema.png)
+- **dataset**: Original dataset (reference for tests and retraining). Columns match the turnover prediction model.
+- **ml_logs**: Input/output logs (JSON for flexibility, timestamp for audits).
+Design choice: a relational structure to handle data volume efficiently; security through a dedicated user (ml_user).
+Instructions: see create_db.py to create the database.
+📖 **Complete beginner's guide**: [docs/database_guide.md](docs/database_guide.md)
+### 💾 Inserting the Dataset
+```bash
+# Insert the full dataset (1470 employees)
+poetry run python scripts/insert_dataset.py
+# Verify the insertion
+psql -h localhost -U ml_user -d oc_p5_db -c "SELECT COUNT(*) FROM dataset;"
+```
+### Prerequisites
+- Python 3.12+
+- Poetry 1.7+
+- Git
+### Quick setup
+```bash
+# 1. Clone the repo
+git clone https://github.com/chaton59/OC_P5.git
+cd OC_P5
+# 2. Install the dependencies
+poetry install
+# 3. Configure the environment
+cp .env.example .env
+# Edit .env with your values
+# 4. Start the API
+poetry run uvicorn app:app --reload
+# 5. Open the documentation
+# http://localhost:8000/docs
+```
+## 📝 Configuration (.env)
 ```bash
+# Development mode (disables auth and enables detailed logs)
+DEBUG=true
+# API key (required in production)
+API_KEY=your-secret-key-here
+# Logging (DEBUG, INFO, WARNING, ERROR, CRITICAL)
+LOG_LEVEL=INFO
+# HuggingFace Model
+HF_MODEL_REPO=ASI-Engineer/employee-turnover-model
+MODEL_FILENAME=model/model.pkl
+```
+## 🔒 Authentication
+### DEBUG mode (development)
+```bash
+# The API key is NOT required
+curl http://localhost:8000/predict -H "Content-Type: application/json" -d '{...}'
+```
+### PRODUCTION mode
+```bash
+# The API key is REQUIRED
+curl http://localhost:8000/predict \
+  -H "X-API-Key: your-secret-key" \
   -H "Content-Type: application/json" \
+  -d '{...}'
 ```
+## 📡 Endpoints
+### 🏥 Health Check
 ```bash
+GET /health
+# Response
+{
+  "status": "healthy",
+  "model_loaded": true,
+  "model_type": "Pipeline",
+  "version": "3.2.1"
+}
 ```
+### 🔮 Single prediction
+```bash
+POST /predict
+Content-Type: application/json
+X-API-Key: your-key (in production)
+# Payload (example, real constraints applied)
+{
+  "nombre_participation_pee": 0,
+  "nb_formations_suivies": 2,
+  "nombre_employee_sous_responsabilite": 1,
+  "distance_domicile_travail": 15,
+  "niveau_education": 3,
+  "domaine_etude": "Infra & Cloud",
+  "ayant_enfants": "Y",
+  "frequence_deplacement": "Occasionnel",
+  "annees_depuis_la_derniere_promotion": 2,
+  "annes_sous_responsable_actuel": 5,
+  "satisfaction_employee_environnement": 3,
+  "note_evaluation_precedente": 4,
+  "niveau_hierarchique_poste": 2,
+  "satisfaction_employee_nature_travail": 3,
+  "satisfaction_employee_equipe": 3,
+  "satisfaction_employee_equilibre_pro_perso": 2,
+  "note_evaluation_actuelle": 4,
+  "heure_supplementaires": "Non",
+  "augementation_salaire_precedente": 5.5,
+  "age": 35,
+  "genre": "M",
+  "revenu_mensuel": 4500.0,
+  "statut_marital": "Marié(e)",
+  "departement": "Commercial",
+  "poste": "Manager",
+  "nombre_experiences_precedentes": 3,
+  "nombre_heures_travailless": 80,
+  "annee_experience_totale": 10,
+  "annees_dans_l_entreprise": 5,
+  "annees_dans_le_poste_actuel": 2
+}
+# Response
+{
+  "prediction": 0,                    # 0 = stays, 1 = leaves
+  "probability_0": 0.85,              # Probability of staying
+  "probability_1": 0.15,              # Probability of leaving
+  "risk_level": "Low"                 # Low, Medium, High
+}
+```
+### 📦 Batch prediction (CSV)
+```bash
+POST /predict/batch
+X-API-Key: your-key (in production)
+# Upload the 3 raw CSV files
+curl -X POST "http://localhost:8000/predict/batch" \
+  -H "X-API-Key: your-key" \
+  -F "sondage_file=@data/extrait_sondage.csv" \
+  -F "eval_file=@data/extrait_eval.csv" \
+  -F "sirh_file=@data/extrait_sirh.csv"
+# Response
 {
   "total_employees": 1470,
+  "predictions": [
+    {"employee_id": 1, "prediction": 1, "probability_leave": 0.84, "risk_level": "High"},
+    {"employee_id": 2, "prediction": 0, "probability_leave": 0.11, "risk_level": "Low"}
+  ],
   "summary": {
     "total_stay": 1169,
     "total_leave": 301,
+    "high_risk_count": 222,
+    "medium_risk_count": 233,
+    "low_risk_count": 1015
   }
 }
 ```
+## 📊 Logging
+### Structured JSON logs
+**Files**:
+- `logs/api.log`: All logs
+- `logs/error.log`: Errors only
+**Format**:
+```json
+{
+  "timestamp": "2025-12-26T10:30:45",
+  "level": "INFO",
+  "logger": "employee_turnover_api",
+  "message": "Request POST /predict",
+  "method": "POST",
+  "path": "/predict",
+  "status_code": 200,
+  "duration_ms": 23.45,
+  "client_host": "127.0.0.1"
+}
+```
+## 🛡️ Rate Limiting
+**Configuration**:
+- **Development**: Disabled (DEBUG=true)
+- **Production**: 20 requests/minute per IP or API key
+**When the limit is exceeded**:
+```json
+{
+  "error": "Rate limit exceeded",
+  "message": "20 per 1 minute"
+}
+```
+## ✅ Tests
+```bash
+# All tests
+poetry run pytest tests/ -v
+# With coverage
+poetry run pytest tests/ --cov --cov-report=html
+# Open the HTML report
+open htmlcov/index.html
+```
+**Results**:
+- ✅ 33 tests passed
+- 📊 88% overall coverage
+## 🚀 Deployment
+### Required environment variables
+```bash
+DEBUG=false
+API_KEY=<your-secure-key>
+LOG_LEVEL=INFO
+```
+### HuggingFace Spaces
+Ready to deploy with `app.py` and `requirements.txt`
+## 📚 Documentation
+- **Interactive API**: http://localhost:8000/docs
+- **ReDoc**: http://localhost:8000/redoc
+- **Full guide**: [docs/API_GUIDE.md](docs/API_GUIDE.md)
+- **Standards**: [docs/standards.md](docs/standards.md)
+- **Test coverage**: [docs/TEST_COVERAGE.md](docs/TEST_COVERAGE.md)
+## 📦 Main dependencies
+- **FastAPI** 0.115.14: Web framework
+- **Pydantic** 2.12.5: Data validation
+- **XGBoost** 2.1.3: ML model
+- **SlowAPI** 0.1.9: Rate limiting
+- **python-json-logger** 4.0.0: Structured logs
+- **pytest** 9.0.2: Tests
+## 🔄 Changelog
+### v3.2.1 (January 2026)
+- 🎛️ Gradio sliders and Pydantic schemas aligned with the actual min/max values of the training data
+- 📦 Batch CSV endpoint (3 raw files)
+- 🔑 API key authentication (production)
+- 🔧 Preprocessing fix (scaling, column order)
+- 📝 Updated documentation and examples
+### v2.2.0 (December 27, 2025)
+- 📦 New `/predict/batch` endpoint for direct CSV processing
+- 🔧 Preprocessing fix: added feature scaling
+- 🔧 Preprocessing fix: corrected column order
+- 📊 Improved prediction accuracy (~90%)
+### v2.1.0 (December 26, 2025)
+- ✨ Structured JSON logging system
+- 🛡️ Rate limiting with SlowAPI
+- ⚡ Improved error handling
+- 📊 Performance monitoring
+### v2.0.0 (December 26, 2025)
+- ✅ Complete test suite (36 tests)
+- 🔐 API key authentication
+- 📊 88% code coverage
+## 👥 Authors
+- **Project**: OpenClassrooms P5
+- **Repo** : [github.com/chaton59/OC_P5](https://github.com/chaton59/OC_P5)
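
Reviewer note: the batch response example in the new README partitions the same population twice. `total_stay + total_leave` and the three risk counters should each add up to `total_employees`. The sketch below checks this on the README's own figures; `check_batch_summary` is a hypothetical helper, not part of the API.

```python
import json


def check_batch_summary(response: dict) -> bool:
    """Return True when both partitions of the batch sum to total_employees."""
    total = response["total_employees"]
    summary = response["summary"]
    by_outcome = summary["total_stay"] + summary["total_leave"]
    by_risk = (
        summary["high_risk_count"]
        + summary["medium_risk_count"]
        + summary["low_risk_count"]
    )
    return by_outcome == total and by_risk == total


# Figures copied from the README example (the predictions list is omitted here).
readme_example = json.loads("""
{
  "total_employees": 1470,
  "summary": {
    "total_stay": 1169,
    "total_leave": 301,
    "high_risk_count": 222,
    "medium_risk_count": 233,
    "low_risk_count": 1015
  }
}
""")
print(check_batch_summary(readme_example))
```

A check of this kind could also serve as an assertion in a test for `/predict/batch`, assuming the response keeps the field names shown in the README.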

README_HF.md CHANGED

@@ -3,7 +3,7 @@ title: Employee Turnover Prediction API
 emoji: 👔
 colorFrom: blue
 colorTo: purple
-sdk: docker
 pinned: true
 license: mit
 app_port: 7860

 emoji: 👔
 colorFrom: blue
 colorTo: purple
+sdk: gradio
 pinned: true
 license: mit
 app_port: 7860

api.py ADDED

	@@ -0,0 +1,424 @@

+#!/usr/bin/env python3
+"""
+FastAPI API for the Employee Turnover model.
+This API exposes the employee attrition prediction model with:
+- Strict input validation via Pydantic
+- Automatic preprocessing
+- Health check for monitoring
+- Automatic OpenAPI/Swagger documentation
+- Gradio interface for interactive use
+- Batch endpoint for processing CSV files
+"""
+import io
+import time
+from contextlib import asynccontextmanager
+import gradio as gr
+import pandas as pd
+from fastapi import Depends, FastAPI, File, HTTPException, Request, UploadFile
+from fastapi.middleware.cors import CORSMiddleware
+from slowapi import _rate_limit_exceeded_handler
+from slowapi.errors import RateLimitExceeded
+from src.auth import verify_api_key
+from src.config import get_settings
+from src.gradio_ui import create_gradio_interface
+from src.logger import logger, log_model_load, log_request
+from src.models import get_model_info, load_model
+from src.preprocessing import (
+    merge_csv_dataframes,
+    preprocess_dataframe_for_prediction,
+    preprocess_for_prediction,
+)
+from src.rate_limit import limiter
+from src.schemas import (
+    BatchPredictionOutput,
+    EmployeeInput,
+    EmployeePrediction,
+    HealthCheck,
+    PredictionOutput,
+)
+# Load the configuration
+settings = get_settings()
+API_VERSION = settings.API_VERSION
+@asynccontextmanager
+async def lifespan(app: FastAPI):
+    """
+    Application lifecycle management.
+    Loads the model at startup and keeps it cached.
+    """
+    logger.info(
+        "🚀 Démarrage de l'API Employee Turnover...", extra={"version": API_VERSION}
+    )
+    start_time = time.time()
+    try:
+        # Preload the model at startup
+        model = load_model()
+        duration_ms = (time.time() - start_time) * 1000
+        model_type = type(model).__name__
+        log_model_load(model_type, duration_ms, True)
+        logger.info("✅ Modèle chargé avec succès")
+    except Exception as e:
+        duration_ms = (time.time() - start_time) * 1000
+        log_model_load("Unknown", duration_ms, False)
+        logger.error("Le modèle n'a pas pu être chargé", extra={"error": str(e)})
+    yield  # The application is running
+    logger.info("🛑 Arrêt de l'API")
+# Create the FastAPI application
+app = FastAPI(
+    title="Employee Turnover Prediction API",
+    description="API de prédiction du turnover des employés avec XGBoost + SMOTE",
+    version=API_VERSION,
+    lifespan=lifespan,
+    docs_url="/docs",
+    redoc_url="/redoc",
+)
+# Add rate limiting
+app.state.limiter = limiter
+app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
+# Configure CORS (allow all origins in dev)
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+# Request logging middleware
+@app.middleware("http")
+async def log_requests(request: Request, call_next):
+    """
+    Middleware that logs every HTTP request.
+    """
+    start_time = time.time()
+    # Process the request
+    response = await call_next(request)
+    # Compute the duration
+    duration_ms = (time.time() - start_time) * 1000
+    # Log
+    log_request(
+        method=request.method,
+        path=request.url.path,
+        status_code=response.status_code,
+        duration_ms=duration_ms,
+        client_host=request.client.host if request.client else None,
+    )
+    return response
+@app.get("/health", response_model=HealthCheck, tags=["Monitoring"])
+async def health_check():
+    """
+    Health check endpoint for monitoring.
+    Verifies that the API is operational and that the model is loaded.
+    Returns:
+        HealthCheck: Status of the API and the model.
+    Raises:
+        HTTPException: 503 if the model is not available.
+    """
+    try:
+        model_info = get_model_info()
+        return HealthCheck(
+            status="healthy",
+            model_loaded=model_info.get("cached", False),
+            model_type=model_info.get("model_type", "Unknown"),
+            version=API_VERSION,
+        )
+    except Exception as e:
+        raise HTTPException(
+            status_code=503,
+            detail={
+                "status": "unhealthy",
+                "error": "Model not available",
+                "message": str(e),
+            },
+        )
+@app.post(
+    "/predict",
+    response_model=PredictionOutput,
+    tags=["Prediction"],
+    dependencies=[Depends(verify_api_key)] if settings.is_api_key_required else [],
+)
+@limiter.limit("20/minute")
+async def predict(request: Request, employee: EmployeeInput):
+    """
+    Turnover prediction endpoint for a single employee.
+    **PROTECTED BY API KEY**: Requires the `X-API-Key` header in production.
+    Takes an employee's data as input, applies preprocessing,
+    and returns the prediction with its probabilities.
+    Args:
+        employee: Employee data validated by Pydantic.
+    Returns:
+        PredictionOutput: Prediction and probabilities.
+    Raises:
+        HTTPException: 401 if the API key is invalid or missing.
+        HTTPException: 500 if an error occurs during prediction.
+    Examples:
+        ```bash
+        # With authentication
+        curl -X POST http://localhost:8000/predict \\
+          -H "X-API-Key: your-secret-key" \\
+          -H "Content-Type: application/json" \\
+          -d '{...}'
+        ```
+    """
+    try:
+        # 1. Load the model
+        model = load_model()
+        # 2. Preprocessing
+        X = preprocess_for_prediction(employee)
+        # 3. Prediction
+        prediction = int(model.predict(X)[0])
+        # 4. Probabilities (if the model supports predict_proba)
+        try:
+            probabilities = model.predict_proba(X)[0]
+            prob_0 = float(probabilities[0])
+            prob_1 = float(probabilities[1])
+        except AttributeError:
+            # The model does not support predict_proba
+            prob_0 = 1.0 if prediction == 0 else 0.0
+            prob_1 = 1.0 if prediction == 1 else 0.0
+        # 5. Risk level
+        if prob_1 < 0.3:
+            risk_level = "Low"
+        elif prob_1 < 0.7:
+            risk_level = "Medium"
+        else:
+            risk_level = "High"
+        # 6. Log the prediction to the database
+        try:
+            from sqlalchemy import create_engine
+            from sqlalchemy.orm import sessionmaker
+            from db_models import MLLog
+            engine = create_engine(settings.DATABASE_URL)
+            Session = sessionmaker(bind=engine)
+            session = Session()
+            log_entry = MLLog(
+                input_json=employee.model_dump(),
+                prediction="Oui" if prediction == 1 else "Non",
+            )
+            session.add(log_entry)
+            session.commit()
+            session.close()
+            logger.info(f"Prediction logged to database: {prediction}")
+        except Exception as db_error:
+            logger.warning(f"Failed to log prediction to database: {db_error}")
+        return PredictionOutput(
+            prediction=prediction,
+            probability_0=prob_0,
+            probability_1=prob_1,
+            risk_level=risk_level,
+        )
+    except Exception:
+        logger.exception("Unexpected error during prediction")
+        raise HTTPException(
+            status_code=500,
+            detail={
+                "error": "Prediction failed",
+                "message": "An unexpected error occurred. Please contact support.",
+            },
+        )
+@app.post(
+    "/predict/batch",
+    response_model=BatchPredictionOutput,
+    tags=["Prediction"],
+    dependencies=[Depends(verify_api_key)] if settings.is_api_key_required else [],
+)
+@limiter.limit("5/minute")
+async def predict_batch(
+    request: Request,
+    sondage_file: UploadFile = File(..., description="Fichier CSV du sondage"),
+    eval_file: UploadFile = File(..., description="Fichier CSV des évaluations"),
+    sirh_file: UploadFile = File(..., description="Fichier CSV SIRH"),
+):
+    """
+    Batch prediction endpoint from CSV files.
+    **PROTECTED BY API KEY**: Requires the `X-API-Key` header in production.
+    Takes the 3 CSV files (survey, evaluation, HRIS) as input,
+    merges them, applies preprocessing, and returns the predictions
+    for all employees.
+    Args:
+        sondage_file: CSV file containing the survey data.
+        eval_file: CSV file containing the evaluation data.
+        sirh_file: CSV file containing the HRIS (SIRH) data.
+    Returns:
+        BatchPredictionOutput: Predictions for all employees.
+    Raises:
+        HTTPException: 400 if the files are invalid.
+        HTTPException: 500 if an error occurs during processing.
+    """
+    try:
+        # 1. Read the CSV files
+        sondage_content = await sondage_file.read()
+        eval_content = await eval_file.read()
+        sirh_content = await sirh_file.read()
+        sondage_df = pd.read_csv(io.BytesIO(sondage_content))
+        eval_df = pd.read_csv(io.BytesIO(eval_content))
+        sirh_df = pd.read_csv(io.BytesIO(sirh_content))
+        logger.info(
+            f"Fichiers CSV chargés: sondage={len(sondage_df)}, "
+            f"eval={len(eval_df)}, sirh={len(sirh_df)} lignes"
+        )
+        # 2. Merge the DataFrames
+        merged_df = merge_csv_dataframes(sondage_df, eval_df, sirh_df)
+        employee_ids = merged_df["original_employee_id"].tolist()
+        merged_df = merged_df.drop(columns=["original_employee_id"])
+        # Drop the target column if present
+        if "a_quitte_l_entreprise" in merged_df.columns:
+            merged_df = merged_df.drop(columns=["a_quitte_l_entreprise"])
+        logger.info(f"DataFrame fusionné: {len(merged_df)} employés")
+        # 3. Preprocessing
+        X = preprocess_dataframe_for_prediction(merged_df)
+        # 4. Load the model and predict
+        model = load_model()
+        predictions = model.predict(X.values)
+        probabilities = model.predict_proba(X.values)
+        # 5. Build the response
+        results = []
+        risk_counts = {"Low": 0, "Medium": 0, "High": 0}
+        leave_count = 0
+        for i, emp_id in enumerate(employee_ids):
+            prob_stay = float(probabilities[i][0])
+            prob_leave = float(probabilities[i][1])
+            pred = int(predictions[i])
+            if prob_leave < 0.3:
+                risk = "Low"
+            elif prob_leave < 0.7:
+                risk = "Medium"
+            else:
+                risk = "High"
+            risk_counts[risk] += 1
+            if pred == 1:
+                leave_count += 1
+            results.append(
+                EmployeePrediction(
+                    employee_id=int(emp_id),
+                    prediction=pred,
+                    probability_stay=prob_stay,
+                    probability_leave=prob_leave,
+                    risk_level=risk,
+                )
+            )
+        summary = {
+            "total_stay": len(results) - leave_count,
+            "total_leave": leave_count,
+            "high_risk_count": risk_counts["High"],
+            "medium_risk_count": risk_counts["Medium"],
+            "low_risk_count": risk_counts["Low"],
+        }
+        logger.info(f"Prédictions terminées: {summary}")
+        return BatchPredictionOutput(
+            total_employees=len(results),
+            predictions=results,
+            summary=summary,
+        )
+    except pd.errors.EmptyDataError:
+        raise HTTPException(
+            status_code=400,
+            detail={
+                "error": "Empty CSV file",
+                "message": "Un des fichiers CSV est vide.",
+            },
+        )
+    except KeyError as e:
+        raise HTTPException(
+            status_code=400,
+            detail={
+                "error": "Missing column",
+                "message": f"Colonne manquante dans les CSV: {e}",
+            },
+        )
+    except Exception as e:
+        logger.exception("Unexpected error during batch prediction")
+        raise HTTPException(
+            status_code=500,
+            detail={
+                "error": "Batch prediction failed",
+                "message": str(e),
+            },
+        )
+# Mount the Gradio interface on / (root path for HuggingFace Spaces)
+gradio_app = create_gradio_interface()
+app = gr.mount_gradio_app(app, gradio_app, path="/")
+if __name__ == "__main__":
+    import uvicorn
+    print("\U0001f680 Lancement de l'API en mode d\u00e9veloppement...")
+    print("\U0001f4d6 Documentation : http://localhost:8000/docs")
+    print("\U0001f3a8 Interface Gradio : http://localhost:8000/")
+    uvicorn.run(
+        "app:app",
+        host="0.0.0.0",
+        port=8000,
+        reload=True,
+        log_level="info",
+    )
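
Reviewer note: `predict` and `predict_batch` both repeat the same thresholding (below 0.3 is Low, below 0.7 is Medium, otherwise High). One possible refactor is a shared helper, sketched below. The name `risk_level_from_probability`, the two constants, and the range check are assumptions for illustration; only the thresholds come from the diff.

```python
from collections import Counter

LOW_THRESHOLD = 0.3   # below this value the label is "Low"
HIGH_THRESHOLD = 0.7  # at or above this value the label is "High"


def risk_level_from_probability(prob_leave: float) -> str:
    """Map a probability of leaving to the Low/Medium/High labels used in the diff."""
    if not 0.0 <= prob_leave <= 1.0:
        raise ValueError(f"probability out of range: {prob_leave}")
    if prob_leave < LOW_THRESHOLD:
        return "Low"
    if prob_leave < HIGH_THRESHOLD:
        return "Medium"
    return "High"


# Tally risk levels for a batch, mirroring the risk_counts dict in predict_batch.
probabilities = [0.84, 0.11, 0.45]
risk_counts = Counter(risk_level_from_probability(p) for p in probabilities)
print(dict(risk_counts))
```

Keeping the thresholds in one place would mean the single and batch endpoints cannot drift apart if the cut-offs are tuned later.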

app.py CHANGED

@@ -1,402 +1,16 @@
 #!/usr/bin/env python3
 """
-FastAPI API for the Employee Turnover model.
-This API exposes the employee attrition prediction model with:
-- Strict input validation via Pydantic
-- Automatic preprocessing
-- Health check for monitoring
-- Automatic OpenAPI/Swagger documentation
-- Gradio interface for interactive use
-- Batch endpoint for processing CSV files
 """
-import io
-import time
-from contextlib import asynccontextmanager
-import gradio as gr
-import pandas as pd
-from fastapi import Depends, FastAPI, File, HTTPException, Request, UploadFile
-from fastapi.middleware.cors import CORSMiddleware
-from slowapi import _rate_limit_exceeded_handler
-from slowapi.errors import RateLimitExceeded
-from src.auth import verify_api_key
-from src.config import get_settings
-from src.gradio_ui import create_gradio_interface
-from src.logger import logger, log_model_load, log_request
-from src.models import get_model_info, load_model
-from src.preprocessing import (
-    merge_csv_dataframes,
-    preprocess_dataframe_for_prediction,
-    preprocess_for_prediction,
-)
-from src.rate_limit import limiter
-from src.schemas import (
-    BatchPredictionOutput,
-    EmployeeInput,
-    EmployeePrediction,
-    HealthCheck,
-    PredictionOutput,
-)
-# Load the configuration
-settings = get_settings()
-API_VERSION = settings.API_VERSION
-@asynccontextmanager
-async def lifespan(app: FastAPI):
-    """
-    Application lifecycle management.
-    Loads the model at startup and keeps it cached.
-    """
-    logger.info(
-        "🚀 Démarrage de l'API Employee Turnover...", extra={"version": API_VERSION}
-    )
-    start_time = time.time()
-    try:
-        # Preload the model at startup
-        model = load_model()
-        duration_ms = (time.time() - start_time) * 1000
-        model_type = type(model).__name__
-        log_model_load(model_type, duration_ms, True)
-        logger.info("✅ Modèle chargé avec succès")
-    except Exception as e:
-        duration_ms = (time.time() - start_time) * 1000
-        log_model_load("Unknown", duration_ms, False)
-        logger.error("Le modèle n'a pas pu être chargé", extra={"error": str(e)})
-    yield  # The application is running
-    logger.info("🛑 Arrêt de l'API")
-# Create the FastAPI application
-app = FastAPI(
-    title="Employee Turnover Prediction API",
-    description="API de prédiction du turnover des employés avec XGBoost + SMOTE",
-    version=API_VERSION,
-    lifespan=lifespan,
-    docs_url="/docs",
-    redoc_url="/redoc",
-)
-# Add rate limiting
-app.state.limiter = limiter
-app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)
-# Configure CORS (allow all origins in dev)
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["*"],
-    allow_credentials=True,
-    allow_methods=["*"],
-    allow_headers=["*"],
-)
-# Request logging middleware
-@app.middleware("http")
-async def log_requests(request: Request, call_next):
-    """
-    Middleware that logs every HTTP request.
-    """
-    start_time = time.time()
-    # Process the request
-    response = await call_next(request)
-    # Compute the duration
-    duration_ms = (time.time() - start_time) * 1000
-    # Log
-    log_request(
-        method=request.method,
-        path=request.url.path,
-        status_code=response.status_code,
-        duration_ms=duration_ms,
-        client_host=request.client.host if request.client else None,
-    )
-    return response
-@app.get("/health", response_model=HealthCheck, tags=["Monitoring"])
-async def health_check():
-    """
-    Health check endpoint for monitoring.
-    Verifies that the API is operational and that the model is loaded.
-    Returns:
-        HealthCheck: Status of the API and the model.
-    Raises:
-        HTTPException: 503 if the model is not available.
-    """
-    try:
-        model_info = get_model_info()
-        return HealthCheck(
-            status="healthy",
-            model_loaded=model_info.get("cached", False),
-            model_type=model_info.get("model_type", "Unknown"),
-            version=API_VERSION,
-        )
-    except Exception as e:
-        raise HTTPException(
-            status_code=503,
-            detail={
-                "status": "unhealthy",
-                "error": "Model not available",
-                "message": str(e),
-            },
-        )
-@app.post(
-    "/predict",
-    response_model=PredictionOutput,
-    tags=["Prediction"],
-    dependencies=[Depends(verify_api_key)] if settings.is_api_key_required else [],
-)
-@limiter.limit("20/minute")
-async def predict(request: Request, employee: EmployeeInput):
-    """
-    Turnover prediction endpoint for a single employee.
-    **PROTECTED BY API KEY**: Requires the `X-API-Key` header in production.
-    Takes an employee's data as input, applies preprocessing,
-    and returns the prediction with its probabilities.
-    Args:
-        employee: Employee data validated by Pydantic.
-    Returns:
-        PredictionOutput: Prediction and probabilities.
-    Raises:
-        HTTPException: 401 if the API key is invalid or missing.
-        HTTPException: 500 if an error occurs during prediction.
-    Examples:
-        ```bash
-        # With authentication
-        curl -X POST http://localhost:8000/predict \\
-          -H "X-API-Key: your-secret-key" \\
-          -H "Content-Type: application/json" \\
-          -d '{...}'
-        ```
-    """
-    try:
-        # 1. Load the model
-        model = load_model()
-        # 2. Preprocessing
-        X = preprocess_for_prediction(employee)
-        # 3. Prediction
-        prediction = int(model.predict(X)[0])
-        # 4. Probabilities (if the model supports predict_proba)
-        try:
-            probabilities = model.predict_proba(X)[0]
-            prob_0 = float(probabilities[0])
-            prob_1 = float(probabilities[1])
-        except AttributeError:
-            # The model does not support predict_proba
-            prob_0 = 1.0 if prediction == 0 else 0.0
-            prob_1 = 1.0 if prediction == 1 else 0.0
-        # 5. Risk level
-        if prob_1 < 0.3:
-            risk_level = "Low"
-        elif prob_1 < 0.7:
-            risk_level = "Medium"
-        else:
-            risk_level = "High"
-        return PredictionOutput(
-            prediction=prediction,
-            probability_0=prob_0,
-            probability_1=prob_1,
-            risk_level=risk_level,
-        )
-    except Exception:
-        logger.exception("Unexpected error during prediction")
-        raise HTTPException(
-            status_code=500,
-            detail={
-                "error": "Prediction failed",
-                "message": "An unexpected error occurred. Please contact support.",
-            },
-        )
-@app.post(
-    "/predict/batch",
-    response_model=BatchPredictionOutput,
-    tags=["Prediction"],
-    dependencies=[Depends(verify_api_key)] if settings.is_api_key_required else [],
-)
-@limiter.limit("5/minute")
-async def predict_batch(
-    request: Request,
-    sondage_file: UploadFile = File(..., description="Fichier CSV du sondage"),
-    eval_file: UploadFile = File(..., description="Fichier CSV des évaluations"),
-    sirh_file: UploadFile = File(..., description="Fichier CSV SIRH"),
-):
-    """
-    Batch prediction endpoint for CSV files.
-    **PROTECTED BY API KEY**: Requires the `X-API-Key` header in production.
-    Takes the 3 CSV files (survey, evaluation, HRIS) as input,
-    merges them, applies the preprocessing, and returns the predictions
-    for all employees.
-    Args:
-        sondage_file: CSV file containing the survey data.
-        eval_file: CSV file containing the evaluation data.
-        sirh_file: CSV file containing the HRIS (SIRH) data.
-    Returns:
-        BatchPredictionOutput: Predictions for all employees.
-    Raises:
-        HTTPException: 400 if the files are invalid.
-        HTTPException: 500 if an error occurs during processing.
-    """
-    try:
-        # 1. Read the CSV files
-        sondage_content = await sondage_file.read()
-        eval_content = await eval_file.read()
-        sirh_content = await sirh_file.read()
-        sondage_df = pd.read_csv(io.BytesIO(sondage_content))
-        eval_df = pd.read_csv(io.BytesIO(eval_content))
-        sirh_df = pd.read_csv(io.BytesIO(sirh_content))
-        logger.info(
-            f"Fichiers CSV chargés: sondage={len(sondage_df)}, "
-            f"eval={len(eval_df)}, sirh={len(sirh_df)} lignes"
-        )
-        # 2. Merge the DataFrames
-        merged_df = merge_csv_dataframes(sondage_df, eval_df, sirh_df)
-        employee_ids = merged_df["original_employee_id"].tolist()
-        merged_df = merged_df.drop(columns=["original_employee_id"])
-        # Drop the target column if present
-        if "a_quitte_l_entreprise" in merged_df.columns:
-            merged_df = merged_df.drop(columns=["a_quitte_l_entreprise"])
-        logger.info(f"DataFrame fusionné: {len(merged_df)} employés")
-        # 3. Preprocessing
-        X = preprocess_dataframe_for_prediction(merged_df)
-        # 4. Load the model and predict
-        model = load_model()
-        predictions = model.predict(X.values)
-        probabilities = model.predict_proba(X.values)
-        # 5. Build the response
-        results = []
-        risk_counts = {"Low": 0, "Medium": 0, "High": 0}
-        leave_count = 0
-        for i, emp_id in enumerate(employee_ids):
-            prob_stay = float(probabilities[i][0])
-            prob_leave = float(probabilities[i][1])
-            pred = int(predictions[i])
-            if prob_leave < 0.3:
-                risk = "Low"
-            elif prob_leave < 0.7:
-                risk = "Medium"
-            else:
-                risk = "High"
-            risk_counts[risk] += 1
-            if pred == 1:
-                leave_count += 1
-            results.append(
-                EmployeePrediction(
-                    employee_id=int(emp_id),
-                    prediction=pred,
-                    probability_stay=prob_stay,
-                    probability_leave=prob_leave,
-                    risk_level=risk,
-                )
-            )
-        summary = {
-            "total_stay": len(results) - leave_count,
-            "total_leave": leave_count,
-            "high_risk_count": risk_counts["High"],
-            "medium_risk_count": risk_counts["Medium"],
-            "low_risk_count": risk_counts["Low"],
-        }
-        logger.info(f"Prédictions terminées: {summary}")
-        return BatchPredictionOutput(
-            total_employees=len(results),
-            predictions=results,
-            summary=summary,
-        )
-    except pd.errors.EmptyDataError:
-        raise HTTPException(
-            status_code=400,
-            detail={
-                "error": "Empty CSV file",
-                "message": "Un des fichiers CSV est vide.",
-            },
-        )
-    except KeyError as e:
-        raise HTTPException(
-            status_code=400,
-            detail={
-                "error": "Missing column",
-                "message": f"Colonne manquante dans les CSV: {e}",
-            },
-        )
-    except Exception as e:
-        logger.exception("Unexpected error during batch prediction")
-        raise HTTPException(
-            status_code=500,
-            detail={
-                "error": "Batch prediction failed",
-                "message": str(e),
-            },
-        )
-# Mount the Gradio interface on / (root path for Hugging Face Spaces)
-gradio_app = create_gradio_interface()
-app = gr.mount_gradio_app(app, gradio_app, path="/")
 if __name__ == "__main__":
-    import uvicorn
-    print("\U0001f680 Lancement de l'API en mode d\u00e9veloppement...")
-    print("\U0001f4d6 Documentation : http://localhost:8000/docs")
-    print("\U0001f3a8 Interface Gradio : http://localhost:8000/")
-    uvicorn.run(
-        "app:app",
-        host="0.0.0.0",
-        port=8000,
-        reload=True,
-        log_level="info",
-    )
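
The removed `/predict` and `/predict/batch` handlers both map the leave probability to a risk level with the same 0.3 and 0.7 cut-offs, and the batch handler also builds a summary dictionary. A minimal sketch of how that logic could live in one shared helper follows. The names `risk_level` and `summarize` are hypothetical and do not appear in the repository; only the thresholds and the summary keys are taken from the code above.

```python
from collections import Counter


def risk_level(prob_leave: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a leave probability to 'Low', 'Medium' or 'High' (hypothetical helper)."""
    if not 0.0 <= prob_leave <= 1.0:
        raise ValueError(f"probability out of range: {prob_leave}")
    if prob_leave < low:
        return "Low"
    if prob_leave < high:
        return "Medium"
    return "High"


def summarize(predictions: list[int], probs_leave: list[float]) -> dict[str, int]:
    """Build the same summary keys as the batch endpoint (hypothetical helper)."""
    counts = Counter(risk_level(p) for p in probs_leave)
    total_leave = sum(1 for pred in predictions if pred == 1)
    return {
        "total_stay": len(predictions) - total_leave,
        "total_leave": total_leave,
        "high_risk_count": counts["High"],
        "medium_risk_count": counts["Medium"],
        "low_risk_count": counts["Low"],
    }


summary = summarize([0, 1, 1], [0.1, 0.5, 0.9])
```

Keeping the thresholds in one place would avoid the two endpoints drifting apart if the cut-offs are ever tuned.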

 #!/usr/bin/env python3
 """
+Gradio app for Hugging Face Spaces.
+Launches the Gradio interface for turnover prediction.
 """
+import os
+import sys
+# Add the src directory to the path before importing from the project
+sys.path.insert(0, os.path.join(os.path.dirname(__file__), "src"))
+from src.gradio_ui import launch_standalone
 if __name__ == "__main__":
+    launch_standalone()

db_models.py ADDED

	@@ -0,0 +1,19 @@

+from sqlalchemy import Column, Integer, String, JSON, DateTime, func
+from sqlalchemy.orm import declarative_base
+Base = declarative_base()
+class Dataset(Base):
+    __tablename__ = "dataset"
+    id = Column(Integer, primary_key=True)
+    features_json = Column(JSON)  # Features from sondage, eval, sirh data
+    target = Column(String)  # Target: 'Oui' or 'Non' for turnover
+class MLLog(Base):
+    __tablename__ = "ml_logs"
+    id = Column(Integer, primary_key=True)
+    input_json = Column(JSON)  # Flexible inputs (JSON for variable features)
+    prediction = Column(String)  # ML output ('Oui' or 'Non')
+    created_at = Column(DateTime, default=func.now())  # Automatic timestamp for traceability
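
To make the shape of an `ml_logs` row concrete, here is a small stand-in that uses the standard-library `sqlite3` module instead of SQLAlchemy and PostgreSQL. It is only an illustration under that assumption: the helper names `create_ml_logs` and `log_prediction` are hypothetical, and the JSON column is stored as text because SQLite is used here.

```python
import json
import sqlite3


def create_ml_logs(conn: sqlite3.Connection) -> None:
    """Create a table mirroring the MLLog columns (SQLite stand-in)."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ml_logs (
            id INTEGER PRIMARY KEY,
            input_json TEXT,
            prediction TEXT,
            created_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def log_prediction(conn: sqlite3.Connection, features: dict, prediction: int) -> int:
    """Insert one log row and return its id."""
    label = "Oui" if prediction == 1 else "Non"
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO ml_logs (input_json, prediction) VALUES (?, ?)",
            (json.dumps(features), label),
        )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
create_ml_logs(conn)
row_id = log_prediction(conn, {"annees_dans_le_poste_actuel": 4}, 1)
row = conn.execute(
    "SELECT input_json, prediction, created_at FROM ml_logs WHERE id = ?",
    (row_id,),
).fetchone()
```

The timestamp is filled in by the database default, which plays the same role as `default=func.now()` in the model above.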

requirements.txt CHANGED

@@ -1,122 +1,27 @@
-aiofiles==24.1.0 ; python_version >= "3.12" and python_version < "4.0"
-alembic==1.17.2 ; python_version >= "3.12" and python_version < "4.0"
-annotated-doc==0.0.4 ; python_version >= "3.12" and python_version < "4.0"
-annotated-types==0.7.0 ; python_version >= "3.12" and python_version < "4.0"
-anyio==4.12.0 ; python_version >= "3.12" and python_version < "4.0"
-audioop-lts==0.2.2 ; python_version >= "3.13" and python_version < "4.0"
-blinker==1.9.0 ; python_version >= "3.12" and python_version < "4.0"
-brotli==1.2.0 ; python_version >= "3.12" and python_version < "4.0"
-cachetools==6.2.4 ; python_version >= "3.12" and python_version < "4.0"
-certifi==2025.11.12 ; python_version >= "3.12" and python_version < "4.0"
-cffi==2.0.0 ; python_version >= "3.12" and python_version < "4.0" and platform_python_implementation != "PyPy"
-charset-normalizer==3.4.4 ; python_version >= "3.12" and python_version < "4.0"
-click==8.3.1 ; python_version >= "3.12" and python_version < "4.0"
-cloudpickle==3.1.2 ; python_version >= "3.12" and python_version < "4.0"
-colorama==0.4.6 ; python_version >= "3.12" and python_version < "4.0" and (platform_system == "Windows" or sys_platform == "win32")
-contourpy==1.3.3 ; python_version >= "3.12" and python_version < "4.0"
-cryptography==46.0.3 ; python_version >= "3.12" and python_version < "4.0"
-cycler==0.12.1 ; python_version >= "3.12" and python_version < "4.0"
-databricks-sdk==0.76.0 ; python_version >= "3.12" and python_version < "4.0"
-deprecated==1.3.1 ; python_version >= "3.12" and python_version < "4.0"
-docker==7.1.0 ; python_version >= "3.12" and python_version < "4.0"
-fastapi==0.127.1 ; python_version >= "3.12" and python_version < "4.0"
-ffmpy==1.0.0 ; python_version >= "3.12" and python_version < "4.0"
-filelock==3.20.1 ; python_version >= "3.12" and python_version < "4.0"
-flask-cors==6.0.2 ; python_version >= "3.12" and python_version < "4.0"
-flask==3.1.2 ; python_version >= "3.12" and python_version < "4.0"
-fonttools==4.61.1 ; python_version >= "3.12" and python_version < "4.0"
-fsspec==2025.12.0 ; python_version >= "3.12" and python_version < "4.0"
-gitdb==4.0.12 ; python_version >= "3.12" and python_version < "4.0"
-gitpython==3.1.45 ; python_version >= "3.12" and python_version < "4.0"
-google-auth==2.45.0 ; python_version >= "3.12" and python_version < "4.0"
-gradio-client==2.0.2 ; python_version >= "3.12" and python_version < "4.0"
-gradio==6.2.0 ; python_version >= "3.12" and python_version < "4.0"
-graphene==3.4.3 ; python_version >= "3.12" and python_version < "4.0"
-graphql-core==3.2.7 ; python_version >= "3.12" and python_version < "4.0"
-graphql-relay==3.2.0 ; python_version >= "3.12" and python_version < "4.0"
-greenlet==3.3.0 ; python_version >= "3.12" and python_version < "4.0" and (platform_machine == "aarch64" or platform_machine == "ppc64le" or platform_machine == "x86_64" or platform_machine == "amd64" or platform_machine == "AMD64" or platform_machine == "win32" or platform_machine == "WIN32")
-groovy==0.1.2 ; python_version >= "3.12" and python_version < "4.0"
-gunicorn==23.0.0 ; python_version >= "3.12" and python_version < "4.0" and platform_system != "Windows"
-h11==0.16.0 ; python_version >= "3.12" and python_version < "4.0"
-hf-xet==1.2.0 ; python_version >= "3.12" and python_version < "4.0" and (platform_machine == "x86_64" or platform_machine == "amd64" or platform_machine == "AMD64" or platform_machine == "arm64" or platform_machine == "aarch64")
-httpcore==1.0.9 ; python_version >= "3.12" and python_version < "4.0"
-httptools==0.7.1 ; python_version >= "3.12" and python_version < "4.0"
-httpx==0.28.1 ; python_version >= "3.12" and python_version < "4.0"
-huey==2.5.5 ; python_version >= "3.12" and python_version < "4.0"
-huggingface-hub==1.2.3 ; python_version >= "3.12" and python_version < "4.0"
-idna==3.11 ; python_version >= "3.12" and python_version < "4.0"
-imbalanced-learn==0.13.0 ; python_version >= "3.12" and python_version < "4.0"
-importlib-metadata==8.7.1 ; python_version >= "3.12" and python_version < "4.0"
-itsdangerous==2.2.0 ; python_version >= "3.12" and python_version < "4.0"
-jinja2==3.1.6 ; python_version >= "3.12" and python_version < "4.0"
-joblib==1.5.3 ; python_version >= "3.12" and python_version < "4.0"
-kiwisolver==1.4.9 ; python_version >= "3.12" and python_version < "4.0"
-limits==5.6.0 ; python_version >= "3.12" and python_version < "4.0"
-mako==1.3.10 ; python_version >= "3.12" and python_version < "4.0"
-markdown-it-py==4.0.0 ; python_version >= "3.12" and python_version < "4.0"
-markupsafe==3.0.3 ; python_version >= "3.12" and python_version < "4.0"
-matplotlib==3.10.8 ; python_version >= "3.12" and python_version < "4.0"
-mdurl==0.1.2 ; python_version >= "3.12" and python_version < "4.0"
-mlflow-skinny==3.8.1 ; python_version >= "3.12" and python_version < "4.0"
-mlflow-tracing==3.8.1 ; python_version >= "3.12" and python_version < "4.0"
-mlflow==3.8.1 ; python_version >= "3.12" and python_version < "4.0"
-numpy==2.4.0 ; python_version >= "3.12" and python_version < "4.0"
-nvidia-nccl-cu12==2.28.9 ; python_version >= "3.12" and python_version < "4.0" and platform_system == "Linux" and platform_machine != "aarch64"
-opentelemetry-api==1.39.1 ; python_version >= "3.12" and python_version < "4.0"
-opentelemetry-proto==1.39.1 ; python_version >= "3.12" and python_version < "4.0"
-opentelemetry-sdk==1.39.1 ; python_version >= "3.12" and python_version < "4.0"
-opentelemetry-semantic-conventions==0.60b1 ; python_version >= "3.12" and python_version < "4.0"
-orjson==3.11.5 ; python_version >= "3.12" and python_version < "4.0"
-packaging==25.0 ; python_version >= "3.12" and python_version < "4.0"
-pandas==2.3.3 ; python_version >= "3.12" and python_version < "4.0"
-pillow==12.0.0 ; python_version >= "3.12" and python_version < "4.0"
-protobuf==6.33.2 ; python_version >= "3.12" and python_version < "4.0"
-pyarrow==22.0.0 ; python_version >= "3.12" and python_version < "4.0"
-pyasn1-modules==0.4.2 ; python_version >= "3.12" and python_version < "4.0"
-pyasn1==0.6.1 ; python_version >= "3.12" and python_version < "4.0"
-pycparser==2.23 ; python_version >= "3.12" and python_version < "4.0" and platform_python_implementation != "PyPy" and implementation_name != "PyPy"
-pydantic-core==2.41.5 ; python_version >= "3.12" and python_version < "4.0"
-pydantic==2.12.5 ; python_version >= "3.12" and python_version < "4.0"
-pydub==0.25.1 ; python_version >= "3.12" and python_version < "4.0"
-pygments==2.19.2 ; python_version >= "3.12" and python_version < "4.0"
-pyparsing==3.3.1 ; python_version >= "3.12" and python_version < "4.0"
-python-dateutil==2.9.0.post0 ; python_version >= "3.12" and python_version < "4.0"
-python-dotenv==1.2.1 ; python_version >= "3.12" and python_version < "4.0"
-python-json-logger==4.0.0 ; python_version >= "3.12" and python_version < "4.0"
-python-multipart==0.0.21 ; python_version >= "3.12" and python_version < "4.0"
-pytz==2025.2 ; python_version >= "3.12" and python_version < "4.0"
-pywin32==311 ; python_version >= "3.12" and python_version < "4.0" and sys_platform == "win32"
-pyyaml==6.0.3 ; python_version >= "3.12" and python_version < "4.0"
-requests==2.32.5 ; python_version >= "3.12" and python_version < "4.0"
-rich==14.2.0 ; python_version >= "3.12" and python_version < "4.0"
-rsa==4.9.1 ; python_version >= "3.12" and python_version < "4.0"
-safehttpx==0.1.7 ; python_version >= "3.12" and python_version < "4.0"
-scikit-learn==1.6.1 ; python_version >= "3.12" and python_version < "4.0"
-scipy==1.16.3 ; python_version >= "3.12" and python_version < "4.0"
-semantic-version==2.10.0 ; python_version >= "3.12" and python_version < "4.0"
-shellingham==1.5.4 ; python_version >= "3.12" and python_version < "4.0"
-six==1.17.0 ; python_version >= "3.12" and python_version < "4.0"
-sklearn-compat==0.1.5 ; python_version >= "3.12" and python_version < "4.0"
-slowapi==0.1.9 ; python_version >= "3.12" and python_version < "4.0"
-smmap==5.0.2 ; python_version >= "3.12" and python_version < "4.0"
-sqlalchemy==2.0.45 ; python_version >= "3.12" and python_version < "4.0"
-sqlparse==0.5.5 ; python_version >= "3.12" and python_version < "4.0"
-starlette==0.50.0 ; python_version >= "3.12" and python_version < "4.0"
-threadpoolctl==3.6.0 ; python_version >= "3.12" and python_version < "4.0"
-tomlkit==0.13.3 ; python_version >= "3.12" and python_version < "4.0"
-tqdm==4.67.1 ; python_version >= "3.12" and python_version < "4.0"
-typer-slim==0.21.0 ; python_version >= "3.12" and python_version < "4.0"
-typer==0.21.0 ; python_version >= "3.12" and python_version < "4.0"
-typing-extensions==4.15.0 ; python_version >= "3.12" and python_version < "4.0"
-typing-inspection==0.4.2 ; python_version >= "3.12" and python_version < "4.0"
-tzdata==2025.3 ; python_version >= "3.12" and python_version < "4.0"
-urllib3==2.6.2 ; python_version >= "3.12" and python_version < "4.0"
-uvicorn==0.32.1 ; python_version >= "3.12" and python_version < "4.0"
-uvloop==0.22.1 ; python_version >= "3.12" and python_version < "4.0" and sys_platform != "win32" and sys_platform != "cygwin" and platform_python_implementation != "PyPy"
-waitress==3.0.2 ; python_version >= "3.12" and python_version < "4.0" and platform_system == "Windows"
-watchfiles==1.1.1 ; python_version >= "3.12" and python_version < "4.0"
-websockets==15.0.1 ; python_version >= "3.12" and python_version < "4.0"
-werkzeug==3.1.4 ; python_version >= "3.12" and python_version < "4.0"
-wrapt==2.0.1 ; python_version >= "3.12" and python_version < "4.0"
-xgboost==2.1.4 ; python_version >= "3.12" and python_version < "4.0"
-zipp==3.23.0 ; python_version >= "3.12" and python_version < "4.0"

+# Requirements for Hugging Face Spaces (Gradio app)
+# Minimal dependencies needed for the Gradio interface
+# Generated from pyproject.toml with essential packages only
+# Core ML libraries
+scikit-learn>=1.6.0,<1.7.0
+xgboost>=2.1.0,<3.0.0
+numpy>=2.0.0,<3.0.0
+pandas>=2.2.0,<3.0.0
+joblib>=1.4.0,<2.0.0
+scipy>=1.14.0,<2.0.0
+# Gradio and web framework
+gradio>=6.2.0,<7.0.0
+fastapi>=0.127.0,<1.0.0
+uvicorn[standard]>=0.32.0,<1.0.0
+pydantic>=2.10.0,<3.0.0
+# Data processing
+imbalanced-learn>=0.13.0,<1.0.0
+# Hugging Face
+huggingface-hub>=1.2.0,<2.0.0
+# Utilities
+python-dotenv>=1.0.0,<2.0.0
+python-json-logger>=4.0.0,<5.0.0

requirements_full.txt ADDED

	@@ -0,0 +1,123 @@

+aiofiles==24.1.0 ; python_version >= "3.12" and python_version < "4.0"
+alembic==1.17.2 ; python_version >= "3.12" and python_version < "4.0"
+annotated-doc==0.0.4 ; python_version >= "3.12" and python_version < "4.0"
+annotated-types==0.7.0 ; python_version >= "3.12" and python_version < "4.0"
+anyio==4.12.0 ; python_version >= "3.12" and python_version < "4.0"
+audioop-lts==0.2.2 ; python_version >= "3.13" and python_version < "4.0"
+blinker==1.9.0 ; python_version >= "3.12" and python_version < "4.0"
+brotli==1.2.0 ; python_version >= "3.12" and python_version < "4.0"
+cachetools==6.2.4 ; python_version >= "3.12" and python_version < "4.0"
+certifi==2025.11.12 ; python_version >= "3.12" and python_version < "4.0"
+cffi==2.0.0 ; python_version >= "3.12" and python_version < "4.0" and platform_python_implementation != "PyPy"
+charset-normalizer==3.4.4 ; python_version >= "3.12" and python_version < "4.0"
+click==8.3.1 ; python_version >= "3.12" and python_version < "4.0"
+cloudpickle==3.1.2 ; python_version >= "3.12" and python_version < "4.0"
+colorama==0.4.6 ; python_version >= "3.12" and python_version < "4.0" and (platform_system == "Windows" or sys_platform == "win32")
+contourpy==1.3.3 ; python_version >= "3.12" and python_version < "4.0"
+cryptography==46.0.3 ; python_version >= "3.12" and python_version < "4.0"
+cycler==0.12.1 ; python_version >= "3.12" and python_version < "4.0"
+databricks-sdk==0.76.0 ; python_version >= "3.12" and python_version < "4.0"
+deprecated==1.3.1 ; python_version >= "3.12" and python_version < "4.0"
+docker==7.1.0 ; python_version >= "3.12" and python_version < "4.0"
+fastapi==0.127.1 ; python_version >= "3.12" and python_version < "4.0"
+ffmpy==1.0.0 ; python_version >= "3.12" and python_version < "4.0"
+filelock==3.20.1 ; python_version >= "3.12" and python_version < "4.0"
+flask-cors==6.0.2 ; python_version >= "3.12" and python_version < "4.0"
+flask==3.1.2 ; python_version >= "3.12" and python_version < "4.0"
+fonttools==4.61.1 ; python_version >= "3.12" and python_version < "4.0"
+fsspec==2025.12.0 ; python_version >= "3.12" and python_version < "4.0"
+gitdb==4.0.12 ; python_version >= "3.12" and python_version < "4.0"
+gitpython==3.1.45 ; python_version >= "3.12" and python_version < "4.0"
+google-auth==2.45.0 ; python_version >= "3.12" and python_version < "4.0"
+gradio-client==2.0.2 ; python_version >= "3.12" and python_version < "4.0"
+gradio==6.2.0 ; python_version >= "3.12" and python_version < "4.0"
+graphene==3.4.3 ; python_version >= "3.12" and python_version < "4.0"
+graphql-core==3.2.7 ; python_version >= "3.12" and python_version < "4.0"
+graphql-relay==3.2.0 ; python_version >= "3.12" and python_version < "4.0"
+greenlet==3.3.0 ; python_version >= "3.12" and python_version < "4.0" and (platform_machine == "aarch64" or platform_machine == "ppc64le" or platform_machine == "x86_64" or platform_machine == "amd64" or platform_machine == "AMD64" or platform_machine == "win32" or platform_machine == "WIN32")
+groovy==0.1.2 ; python_version >= "3.12" and python_version < "4.0"
+gunicorn==23.0.0 ; python_version >= "3.12" and python_version < "4.0" and platform_system != "Windows"
+h11==0.16.0 ; python_version >= "3.12" and python_version < "4.0"
+hf-xet==1.2.0 ; python_version >= "3.12" and python_version < "4.0" and (platform_machine == "x86_64" or platform_machine == "amd64" or platform_machine == "AMD64" or platform_machine == "arm64" or platform_machine == "aarch64")
+httpcore==1.0.9 ; python_version >= "3.12" and python_version < "4.0"
+httptools==0.7.1 ; python_version >= "3.12" and python_version < "4.0"
+httpx==0.28.1 ; python_version >= "3.12" and python_version < "4.0"
+huey==2.5.5 ; python_version >= "3.12" and python_version < "4.0"
+huggingface-hub==1.2.3 ; python_version >= "3.12" and python_version < "4.0"
+idna==3.11 ; python_version >= "3.12" and python_version < "4.0"
+imbalanced-learn==0.13.0 ; python_version >= "3.12" and python_version < "4.0"
+importlib-metadata==8.7.1 ; python_version >= "3.12" and python_version < "4.0"
+itsdangerous==2.2.0 ; python_version >= "3.12" and python_version < "4.0"
+jinja2==3.1.6 ; python_version >= "3.12" and python_version < "4.0"
+joblib==1.5.3 ; python_version >= "3.12" and python_version < "4.0"
+kiwisolver==1.4.9 ; python_version >= "3.12" and python_version < "4.0"
+limits==5.6.0 ; python_version >= "3.12" and python_version < "4.0"
+mako==1.3.10 ; python_version >= "3.12" and python_version < "4.0"
+markdown-it-py==4.0.0 ; python_version >= "3.12" and python_version < "4.0"
+markupsafe==3.0.3 ; python_version >= "3.12" and python_version < "4.0"
+matplotlib==3.10.8 ; python_version >= "3.12" and python_version < "4.0"
+mdurl==0.1.2 ; python_version >= "3.12" and python_version < "4.0"
+mlflow-skinny==3.8.1 ; python_version >= "3.12" and python_version < "4.0"
+mlflow-tracing==3.8.1 ; python_version >= "3.12" and python_version < "4.0"
+mlflow==3.8.1 ; python_version >= "3.12" and python_version < "4.0"
+numpy==2.4.0 ; python_version >= "3.12" and python_version < "4.0"
+nvidia-nccl-cu12==2.28.9 ; python_version >= "3.12" and python_version < "4.0" and platform_system == "Linux" and platform_machine != "aarch64"
+opentelemetry-api==1.39.1 ; python_version >= "3.12" and python_version < "4.0"
+opentelemetry-proto==1.39.1 ; python_version >= "3.12" and python_version < "4.0"
+opentelemetry-sdk==1.39.1 ; python_version >= "3.12" and python_version < "4.0"
+opentelemetry-semantic-conventions==0.60b1 ; python_version >= "3.12" and python_version < "4.0"
+orjson==3.11.5 ; python_version >= "3.12" and python_version < "4.0"
+packaging==25.0 ; python_version >= "3.12" and python_version < "4.0"
+pandas==2.3.3 ; python_version >= "3.12" and python_version < "4.0"
+pillow==12.0.0 ; python_version >= "3.12" and python_version < "4.0"
+protobuf==6.33.2 ; python_version >= "3.12" and python_version < "4.0"
+psycopg2-binary==2.9.9 ; python_version >= "3.12" and python_version < "4.0"
+pyarrow==22.0.0 ; python_version >= "3.12" and python_version < "4.0"
+pyasn1-modules==0.4.2 ; python_version >= "3.12" and python_version < "4.0"
+pyasn1==0.6.1 ; python_version >= "3.12" and python_version < "4.0"
+pycparser==2.23 ; python_version >= "3.12" and python_version < "4.0" and platform_python_implementation != "PyPy" and implementation_name != "PyPy"
+pydantic-core==2.41.5 ; python_version >= "3.12" and python_version < "4.0"
+pydantic==2.12.5 ; python_version >= "3.12" and python_version < "4.0"
+pydub==0.25.1 ; python_version >= "3.12" and python_version < "4.0"
+pygments==2.19.2 ; python_version >= "3.12" and python_version < "4.0"
+pyparsing==3.3.1 ; python_version >= "3.12" and python_version < "4.0"
+python-dateutil==2.9.0.post0 ; python_version >= "3.12" and python_version < "4.0"
+python-dotenv==1.0.0 ; python_version >= "3.12" and python_version < "4.0"
+python-json-logger==4.0.0 ; python_version >= "3.12" and python_version < "4.0"
+python-multipart==0.0.21 ; python_version >= "3.12" and python_version < "4.0"
+pytz==2025.2 ; python_version >= "3.12" and python_version < "4.0"
+pywin32==311 ; python_version >= "3.12" and python_version < "4.0" and sys_platform == "win32"
+pyyaml==6.0.3 ; python_version >= "3.12" and python_version < "4.0"
+requests==2.32.5 ; python_version >= "3.12" and python_version < "4.0"
+rich==14.2.0 ; python_version >= "3.12" and python_version < "4.0"
+rsa==4.9.1 ; python_version >= "3.12" and python_version < "4.0"
+safehttpx==0.1.7 ; python_version >= "3.12" and python_version < "4.0"
+scikit-learn==1.6.1 ; python_version >= "3.12" and python_version < "4.0"
+scipy==1.16.3 ; python_version >= "3.12" and python_version < "4.0"
+semantic-version==2.10.0 ; python_version >= "3.12" and python_version < "4.0"
+shellingham==1.5.4 ; python_version >= "3.12" and python_version < "4.0"
+six==1.17.0 ; python_version >= "3.12" and python_version < "4.0"
+sklearn-compat==0.1.5 ; python_version >= "3.12" and python_version < "4.0"
+slowapi==0.1.9 ; python_version >= "3.12" and python_version < "4.0"
+smmap==5.0.2 ; python_version >= "3.12" and python_version < "4.0"
+sqlalchemy==2.0.23 ; python_version >= "3.12" and python_version < "4.0"
+sqlparse==0.5.5 ; python_version >= "3.12" and python_version < "4.0"
+starlette==0.50.0 ; python_version >= "3.12" and python_version < "4.0"
+threadpoolctl==3.6.0 ; python_version >= "3.12" and python_version < "4.0"
+tomlkit==0.13.3 ; python_version >= "3.12" and python_version < "4.0"
+tqdm==4.67.1 ; python_version >= "3.12" and python_version < "4.0"
+typer-slim==0.21.0 ; python_version >= "3.12" and python_version < "4.0"
+typer==0.21.0 ; python_version >= "3.12" and python_version < "4.0"
+typing-extensions==4.15.0 ; python_version >= "3.12" and python_version < "4.0"
+typing-inspection==0.4.2 ; python_version >= "3.12" and python_version < "4.0"
+tzdata==2025.3 ; python_version >= "3.12" and python_version < "4.0"
+urllib3==2.6.2 ; python_version >= "3.12" and python_version < "4.0"
+uvicorn==0.32.1 ; python_version >= "3.12" and python_version < "4.0"
+uvloop==0.22.1 ; python_version >= "3.12" and python_version < "4.0" and sys_platform != "win32" and sys_platform != "cygwin" and platform_python_implementation != "PyPy"
+waitress==3.0.2 ; python_version >= "3.12" and python_version < "4.0" and platform_system == "Windows"
+watchfiles==1.1.1 ; python_version >= "3.12" and python_version < "4.0"
+websockets==15.0.1 ; python_version >= "3.12" and python_version < "4.0"
+werkzeug==3.1.4 ; python_version >= "3.12" and python_version < "4.0"
+wrapt==2.0.1 ; python_version >= "3.12" and python_version < "4.0"
+xgboost==2.1.4 ; python_version >= "3.12" and python_version < "4.0"
+zipp==3.23.0 ; python_version >= "3.12" and python_version < "4.0"

Dockerfile → src/Dockerfile RENAMED

@@ -2,19 +2,26 @@ FROM python:3.12-slim
 WORKDIR /app
 # Install system dependencies
 RUN apt-get update && apt-get install -y \
     curl \
     && rm -rf /var/lib/apt/lists/*
-# Copy the dependency files
-COPY requirements.txt .
-# Install the Python dependencies
-RUN pip install --no-cache-dir -r requirements.txt
 # Copy the application code
 COPY app.py .
 COPY src/ ./src/
 COPY .env.example .env

 WORKDIR /app
+# Install Poetry
+RUN pip install poetry
 # Install system dependencies
 RUN apt-get update && apt-get install -y \
     curl \
     && rm -rf /var/lib/apt/lists/*
+# Copy the Poetry dependency files
+COPY pyproject.toml poetry.lock ./
+# Configure Poetry not to create a virtual environment
+RUN poetry config virtualenvs.create false
+# Install the Python dependencies via Poetry (main dependency group only)
+RUN poetry install --only main --no-interaction --no-ansi
 # Copy the application code
 COPY app.py .
+COPY db_models.py .
 COPY src/ ./src/
 COPY .env.example .env

src/config.py CHANGED

@@ -40,6 +40,11 @@ class Settings:
     DEBUG: bool = os.getenv("DEBUG", "False").lower() == "true"
     LOG_LEVEL: str = os.getenv("LOG_LEVEL", "INFO")
     @property
     def is_api_key_required(self) -> bool:
         """

     DEBUG: bool = os.getenv("DEBUG", "False").lower() == "true"
     LOG_LEVEL: str = os.getenv("LOG_LEVEL", "INFO")
+    # ===== DATABASE =====
+    DATABASE_URL: str = os.getenv(
+        "DATABASE_URL", "postgresql://ml_user:your-password-here@localhost:5432/oc_p5_db"
+    )
     @property
     def is_api_key_required(self) -> bool:
         """

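The `DATABASE_URL` setting above carries credentials, so it may be worth reading it strictly from the environment and masking the password before it reaches any log line. The sketch below shows one way to do that with the standard library only. `get_database_url` and `mask_password` are hypothetical helpers, not part of `src/config.py`, and the example URL is the placeholder from `.env.example`.

```python
import os
from urllib.parse import urlsplit, urlunsplit


def get_database_url(env=None) -> str:
    """Return DATABASE_URL from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip()
    if not url:
        raise RuntimeError("DATABASE_URL is not set; copy .env.example to .env and fill it in")
    return url


def mask_password(url: str) -> str:
    """Replace the password part of a connection URL with '***' for logging."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    host = parts.hostname or ""
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    netloc = f"{parts.username}:***@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


example = "postgresql://ml_user:your-password-here@localhost:5432/oc_p5_db"
masked = mask_password(get_database_url({"DATABASE_URL": example}))
```

Failing at startup when the variable is absent avoids silently falling back to a default connection string.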
src/gradio_ui.py CHANGED

@@ -8,6 +8,7 @@ Cette interface permet de:
 - Comprendre les champs requis
 """
 import gradio as gr
 from src.models import get_model_info, load_model
 from src.preprocessing import preprocess_for_prediction
@@ -123,6 +124,36 @@ def predict_turnover(
         confidence = max(prob_0, prob_1) * 100
         result = f"""
 ## {risk_emoji}
@@ -132,6 +163,9 @@ def predict_turnover(
 - **Probabilité de départ**: {prob_1 * 100:.1f}%
 - **Probabilité de maintien**: {prob_0 * 100:.1f}%
 ### Interprétation
 {"⚠️ Cet employé présente des facteurs de risque de départ. Il est recommandé d'engager un dialogue pour comprendre ses attentes." if prediction == 1 else "✅ Cet employé semble stable. Continuez à maintenir un environnement de travail positif."}
 """
@@ -567,7 +601,7 @@ def launch_standalone():
     demo.launch(
         server_name="0.0.0.0",
         server_port=7860,
-        share=False,  # No Gradio tunnel on HF Spaces
         show_error=True,
     )

 - Comprendre les champs requis
 """
 import gradio as gr
+import os
 from src.models import get_model_info, load_model
 from src.preprocessing import preprocess_for_prediction
         confidence = max(prob_0, prob_1) * 100
+        # Log to the database (local runs only)
+        db_status = "ℹ️ DB désactivée sur HF Spaces"
+        try:
+            # Check whether we are on HF Spaces (environment variable)
+            if os.getenv("SPACE_ID") is None:  # Not on HF Spaces
+                from sqlalchemy import create_engine
+                from sqlalchemy.orm import sessionmaker
+                from src.config import get_settings
+                # Import the MLLog model
+                from db_models import MLLog
+                settings = get_settings()
+                engine = create_engine(settings.DATABASE_URL)
+                Session = sessionmaker(bind=engine)
+                # Create the log entry; the context manager closes the session even on error
+                with Session() as session:
+                    log_entry = MLLog(
+                        input_json=employee.model_dump(mode="json"),  # Pydantic v2 model to a JSON-safe dict
+                        prediction="Oui" if prediction == 1 else "Non",
+                    )
+                    session.add(log_entry)
+                    session.commit()
+                db_status = "✅ Enregistré en DB"
+        except Exception as db_error:
+            db_status = f"⚠️ Erreur DB: {str(db_error)}"
         result = f"""
 ## {risk_emoji}
 - **Probabilité de départ**: {prob_1 * 100:.1f}%
 - **Probabilité de maintien**: {prob_0 * 100:.1f}%
+### Base de données
+{db_status}
 ### Interprétation
 {"⚠️ Cet employé présente des facteurs de risque de départ. Il est recommandé d'engager un dialogue pour comprendre ses attentes." if prediction == 1 else "✅ Cet employé semble stable. Continuez à maintenir un environnement de travail positif."}
 """
     demo.launch(
         server_name="0.0.0.0",
         server_port=7860,
+        share=False,
         show_error=True,
     )
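
Review note on the `src/gradio_ui.py` change above: the added block writes a log row only when the `SPACE_ID` environment variable is unset, and turns any database failure into a status string instead of raising. The sketch below illustrates that gating pattern only. It is not the code from this commit: it uses the standard-library `sqlite3` module as a stand-in for the SQLAlchemy engine, and the table name `ml_logs` and the helpers `running_on_hf_spaces`, `log_prediction` and `make_demo_connection` are hypothetical names introduced for illustration.

```python
import os
import sqlite3


def running_on_hf_spaces(env=None):
    """Return True when SPACE_ID is set (the check used in the diff above)."""
    env = os.environ if env is None else env
    return env.get("SPACE_ID") is not None


def log_prediction(conn, payload_json, prediction, env=None):
    """Insert one log row unless running on HF Spaces; return a status string."""
    if running_on_hf_spaces(env):
        return "skipped"
    try:
        # With sqlite3, `with conn` commits on success and rolls back on error.
        with conn:
            conn.execute(
                # `ml_logs` is a hypothetical table name used for this sketch.
                "INSERT INTO ml_logs (input_json, prediction) VALUES (?, ?)",
                (payload_json, "Oui" if prediction == 1 else "Non"),
            )
        return "saved"
    except sqlite3.Error as exc:
        # Mirror the diff: report the failure, do not break the prediction.
        return f"error: {exc}"


def make_demo_connection():
    """Create an in-memory database with the hypothetical log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ml_logs (input_json TEXT, prediction TEXT)")
    return conn
```

Passing `env` explicitly keeps the check testable without touching the real process environment. Wrapping the insert in a transaction context means a failed write is rolled back, whereas the diff's `session.close()` is skipped if `commit()` raises.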

src/schemas.py CHANGED Viewed

@@ -8,7 +8,7 @@ permettant une validation stricte des inputs avec messages d'erreur clairs.
 from enum import Enum
 from typing import Literal
-from pydantic import BaseModel, Field, field_validator
 # Enums for categorical values
@@ -172,10 +172,8 @@ class EmployeeInput(BaseModel):
             v = float(v.replace(" %", "").replace("%", ""))
         return v
-    class Config:
-        """Pydantic configuration."""
-        json_schema_extra = {
             "example": {
                 # Example based on the first row of the CSV files
                 "nombre_participation_pee": 0,
@@ -210,6 +208,7 @@ class EmployeeInput(BaseModel):
                 "annees_dans_le_poste_actuel": 4,
             }
         }
 class PredictionOutput(BaseModel):
@@ -224,10 +223,8 @@ class PredictionOutput(BaseModel):
     )
     risk_level: str = Field(..., description="Niveau de risque (Low/Medium/High)")
-    class Config:
-        """Pydantic configuration."""
-        json_schema_extra = {
             "example": {
                 "prediction": 1,
                 "probability_0": 0.35,
@@ -235,6 +232,7 @@ class PredictionOutput(BaseModel):
                 "risk_level": "High",
             }
         }
 class HealthCheck(BaseModel):
@@ -245,10 +243,8 @@ class HealthCheck(BaseModel):
     model_type: str = Field(..., description="Type du modèle")
     version: str = Field(..., description="Version de l'API")
-    class Config:
-        """Pydantic configuration."""
-        json_schema_extra = {
             "example": {
                 "status": "healthy",
                 "model_loaded": True,
@@ -256,6 +252,7 @@ class HealthCheck(BaseModel):
                 "version": "1.0.0",
             }
         }
 class EmployeePrediction(BaseModel):
@@ -281,10 +278,8 @@ class BatchPredictionOutput(BaseModel):
     )
     summary: dict = Field(..., description="Résumé des prédictions")
-    class Config:
-        """Pydantic configuration."""
-        json_schema_extra = {
             "example": {
                 "total_employees": 100,
                 "predictions": [
@@ -305,3 +300,4 @@ class BatchPredictionOutput(BaseModel):
                 },
             }
         }

 from enum import Enum
 from typing import Literal
+from pydantic import BaseModel, Field, field_validator, ConfigDict
 # Enums for categorical values
             v = float(v.replace(" %", "").replace("%", ""))
         return v
+    model_config = ConfigDict(
+        json_schema_extra={
             "example": {
                 # Example based on the first row of the CSV files
                 "nombre_participation_pee": 0,
                 "annees_dans_le_poste_actuel": 4,
             }
         }
+    )
 class PredictionOutput(BaseModel):
     )
     risk_level: str = Field(..., description="Niveau de risque (Low/Medium/High)")
+    model_config = ConfigDict(
+        json_schema_extra={
             "example": {
                 "prediction": 1,
                 "probability_0": 0.35,
                 "risk_level": "High",
             }
         }
+    )
 class HealthCheck(BaseModel):
     model_type: str = Field(..., description="Type du modèle")
     version: str = Field(..., description="Version de l'API")
+    model_config = ConfigDict(
+        json_schema_extra={
             "example": {
                 "status": "healthy",
                 "model_loaded": True,
                 "version": "1.0.0",
             }
         }
+    )
 class EmployeePrediction(BaseModel):
     )
     summary: dict = Field(..., description="Résumé des prédictions")
+    model_config = ConfigDict(
+        json_schema_extra={
             "example": {
                 "total_employees": 100,
                 "predictions": [
                 },
             }
         }
+    )