---
title: Credit Scoring - Home Credit Default Risk
emoji: 📊
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "4.44.1"
python_version: "3.12"
app_file: app.py
pinned: false
---

# OC_P6 - Credit Scoring API (MLOps)

## 🚀 Live demo

- Production: https://huggingface.co/spaces/ASI-Engineer/OC_P8_prod
- Test: https://huggingface.co/spaces/ASI-Engineer/OC_P8_test

## Step 4 optimization results

- Latency gain: **15.7x** (0.64 ms -> 0.04 ms per request)
- Accuracy: 100% identical to the pre-optimization results
- See the full report: [reports/rapport_optimisation.md](reports/rapport_optimisation.md)

## Final architecture

- FastAPI/Gradio + Docker (entrypoint: [app.py](app.py))
- Monitoring: logs + Evidently (drift)
- Optimization: VectorizedPreprocessor (15.7x)

## Completed steps

- Step 2: API + Docker + CI/CD
- Step 3: Storage + production analysis
- Step 4: Performance optimization (complete)

## Project overview (quick audit)

- Raw data and features: [data/raw](data/raw), [data/processed](data/processed)
- Data/model pipeline: [src/load_data.py](src/load_data.py), [src/preprocessing.py](src/preprocessing.py)
- Experiments and artifacts: [mlruns](mlruns), [models](models)
- MLOps notebooks: [notebooks](notebooks)
- Production monitoring: [logs/predictions.jsonl](logs/predictions.jsonl), [reports](reports)
- Tests: [tests](tests)
- Containerization: [Dockerfile](Dockerfile)

## Project structure

```
OC_P6/
├── app.py
├── Dockerfile
├── pyproject.toml
├── requirements.txt
├── requirements-inference.txt
├── data/
│   ├── raw/
│   └── processed/
├── logs/
│   └── predictions.jsonl
├── mlruns/
├── models/
│   ├── export_model.py
│   ├── export_preprocessor.py
│   ├── lightgbm.txt
│   └── preprocessor.joblib
├── notebooks/
│   ├── 01_exploration.ipynb
│   ├── 02_preparation_features.ipynb
│   ├── 03_LGBM.ipynb
│   ├── 04_regression.ipynb
│   ├── 05_model_interpretation.ipynb
│   ├── 06_analyse_logs.ipynb
│   ├── 07_detect_data_drift.ipynb
│   ├── 08_analyze_logs_2.ipynb
│   ├── 09_profiling.ipynb
│   └── 10_optimisation.ipynb
├── reference/
│   ├── reference.csv
│   └── simulate_production_calls.py
├── reports/
│   ├── data_drift_report.html
│   ├── monitoring_study.md
│   └── plots/
├── src/
│   ├── __init__.py
│   ├── load_data.py
│   ├── mlflow_config.py
│   └── preprocessing.py
└── tests/
    ├── conftest.py
    ├── test_predict.py
    └── test_preprocessing.py
```

## Installation (uv recommended)

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
```

## Data

Source: Kaggle Home Credit Default Risk. Place the following files in [data/raw](data/raw):

- application_train.csv
- application_test.csv
- bureau.csv
- bureau_balance.csv
- credit_card_balance.csv
- installments_payments.csv
- POS_CASH_balance.csv
- previous_application.csv

## Notebooks (summary)

- Exploration: [notebooks/01_exploration.ipynb](notebooks/01_exploration.ipynb)
- Feature engineering: [notebooks/02_preparation_features.ipynb](notebooks/02_preparation_features.ipynb)
- LGBM modeling + MLflow: [notebooks/03_LGBM.ipynb](notebooks/03_LGBM.ipynb)
- Baseline regression: [notebooks/04_regression.ipynb](notebooks/04_regression.ipynb)
- Interpretation: [notebooks/05_model_interpretation.ipynb](notebooks/05_model_interpretation.ipynb)
- Monitoring and drift: [notebooks/06_analyse_logs.ipynb](notebooks/06_analyse_logs.ipynb), [notebooks/07_detect_data_drift.ipynb](notebooks/07_detect_data_drift.ipynb)
- Profiling and optimization: [notebooks/09_profiling.ipynb](notebooks/09_profiling.ipynb), [notebooks/10_optimisation.ipynb](notebooks/10_optimisation.ipynb)

## How to test locally

```bash
uv sync
uv run python app.py
```

Docker option:

```bash
docker build -t oc_p6:latest .
docker run --rm -it -p 7860:7860 oc_p6:latest
```

## API usage (local or HF Space)

Minimal JSON example:

```json
{"SK_ID_CURR": 100001, "AMT_INCOME_TOTAL": 202500.0, "AMT_CREDIT": 80000.0, "CODE_GENDER": "M", "DAYS_BIRTH": -12000}
```

Request to the production Space:

```bash
curl -s -X POST "https://huggingface.co/spaces/ASI-Engineer/OC_P8_prod/api/predict" \
  -H "Content-Type: application/json" \
  -d '{"data":["{\"SK_ID_CURR\":100001,\"AMT_INCOME_TOTAL\":202500.0,\"AMT_CREDIT\":80000.0,\"CODE_GENDER\":\"M\",\"DAYS_BIRTH\":-12000}"]}'
```

## Monitoring and data drift

- Monitoring report: [reports/monitoring_study.md](reports/monitoring_study.md)
- Evidently drift report: [reports/data_drift_report.html](reports/data_drift_report.html)
- Latency and score plots: [reports/plots](reports/plots)
- Production call simulation: [reference/simulate_production_calls.py](reference/simulate_production_calls.py)

## Tests

```bash
uv run pytest
```

**Date**: February 25, 2026

**Status**: Project complete, ready for defense
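
## Request body sketch (Python)

The curl request in the API usage section passes the client features as a JSON string nested inside a `data` list, which is easy to get wrong when escaping by hand. The sketch below builds the same body with the standard library only. `build_request_body` is a hypothetical helper name, not a function from this repository, and it assumes the endpoint expects exactly the double-encoded shape shown in the curl example.

```python
import json


def build_request_body(features: dict) -> str:
    """Return a body shaped like {"data": ["<features serialized as a JSON string>"]}.

    Hypothetical helper for illustration; the shape mirrors the curl example.
    """
    # First encoding: the feature dict becomes a compact JSON string.
    inner = json.dumps(features, separators=(",", ":"))
    # Second encoding: that string is the single element of the "data" list.
    return json.dumps({"data": [inner]})


# Same minimal example as in the API usage section.
features = {
    "SK_ID_CURR": 100001,
    "AMT_INCOME_TOTAL": 202500.0,
    "AMT_CREDIT": 80000.0,
    "CODE_GENDER": "M",
    "DAYS_BIRTH": -12000,
}

body = build_request_body(features)
print(body)
```

The printed string can be passed to curl with `-d`, or sent with any HTTP client, using the `Content-Type: application/json` header.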