---
title: Credit Scoring - Home Credit Default Risk
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "4.44.1"
python_version: "3.12"
app_file: app.py
pinned: false
---
# OC_P6 - Credit Scoring API (MLOps)
## Live demo
https://huggingface.co/spaces/ASI-Engineer/OC_P8_prod
https://huggingface.co/spaces/ASI-Engineer/OC_P8_test
## Step 4 optimization results
- Latency gain: **15.7x** (0.64 ms -> 0.04 ms per request)
- Accuracy: 100% identical
- See the full report: [reports/rapport_optimisation.md](reports/rapport_optimisation.md)
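
A speedup figure like the one above can be reproduced with a small timing harness. The sketch below is only an illustration, not the project's benchmark: `mean_latency_ms` and `speedup` are hypothetical helper names, and the function being timed is whatever callable you pass in.

```python
import time
from typing import Callable, Dict


def mean_latency_ms(fn: Callable[[Dict], object], payload: Dict, n_calls: int = 1000) -> float:
    """Return the mean wall-clock time of fn(payload) in milliseconds."""
    start = time.perf_counter()
    for _ in range(n_calls):
        fn(payload)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / n_calls


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline latency to optimized latency."""
    if optimized_ms <= 0:
        raise ValueError("optimized_ms must be positive")
    return baseline_ms / optimized_ms
```

With the rounded values quoted above, `speedup(0.64, 0.04)` gives about 16, close to the reported 15.7x, which was presumably computed from unrounded measurements.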
## Final architecture
- FastAPI/Gradio + Docker (entrypoint: [app.py](app.py))
- Monitoring: logs + Evidently (drift)
- Optimization: VectorizedPreprocessor (15.7x)
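
The monitoring logs mentioned above are stored as JSON Lines (`logs/predictions.jsonl`). As a minimal sketch of that pattern, assuming one JSON record per prediction: the field names `score` and `latency_ms` and both function names are hypothetical, and the actual schema is defined in the project code.

```python
import json
import time
from pathlib import Path
from typing import Dict, List


def log_prediction(log_path: Path, client_id: int, score: float, latency_ms: float) -> None:
    """Append one prediction record as a single JSON line."""
    record = {
        "timestamp": time.time(),
        "SK_ID_CURR": client_id,
        "score": score,            # hypothetical field name
        "latency_ms": latency_ms,  # hypothetical field name
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_predictions(log_path: Path) -> List[Dict]:
    """Load every non-empty line of the log back into a list of dicts."""
    with log_path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Appending one line per request keeps writes cheap and lets the analysis notebooks read the file incrementally.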
## Completed steps
- Step 2: API + Docker + CI/CD
- Step 3: Storage + production analysis
- Step 4: Performance optimization (completed)
## Project overview (quick audit)
- Raw data and features: [data/raw](data/raw), [data/processed](data/processed)
- Data/model pipeline: [src/load_data.py](src/load_data.py), [src/preprocessing.py](src/preprocessing.py)
- Experiments and artifacts: [mlruns](mlruns), [models](models)
- MLOps notebooks: [notebooks](notebooks)
- Production monitoring: [logs/predictions.jsonl](logs/predictions.jsonl), [reports](reports)
- Tests: [tests](tests)
- Containerization: [Dockerfile](Dockerfile)
## Project structure
```
OC_P6/
├── app.py
├── Dockerfile
├── pyproject.toml
├── requirements.txt
├── requirements-inference.txt
├── data/
│   ├── raw/
│   └── processed/
├── logs/
│   └── predictions.jsonl
├── mlruns/
├── models/
│   ├── export_model.py
│   ├── export_preprocessor.py
│   ├── lightgbm.txt
│   └── preprocessor.joblib
├── notebooks/
│   ├── 01_exploration.ipynb
│   ├── 02_preparation_features.ipynb
│   ├── 03_LGBM.ipynb
│   ├── 04_regression.ipynb
│   ├── 05_model_interpretation.ipynb
│   ├── 06_analyse_logs.ipynb
│   ├── 07_detect_data_drift.ipynb
│   ├── 08_analyze_logs_2.ipynb
│   ├── 09_profiling.ipynb
│   └── 10_optimisation.ipynb
├── reference/
│   ├── reference.csv
│   └── simulate_production_calls.py
├── reports/
│   ├── data_drift_report.html
│   ├── monitoring_study.md
│   └── plots/
├── src/
│   ├── __init__.py
│   ├── load_data.py
│   ├── mlflow_config.py
│   └── preprocessing.py
└── tests/
    ├── conftest.py
    ├── test_predict.py
    └── test_preprocessing.py
```
## Installation (uv recommended)
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
```
## Data
Source: Kaggle Home Credit Default Risk.
Place the files in [data/raw](data/raw):
- application_train.csv
- application_test.csv
- bureau.csv
- bureau_balance.csv
- credit_card_balance.csv
- installments_payments.csv
- POS_CASH_balance.csv
- previous_application.csv
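
Before running the pipeline, it can help to confirm that every file in the list above is in place. The following is a small sketch of such a check; `missing_raw_files` is a hypothetical helper and is not part of the repository.

```python
from pathlib import Path
from typing import List

# File names taken from the list above.
EXPECTED_FILES = [
    "application_train.csv",
    "application_test.csv",
    "bureau.csv",
    "bureau_balance.csv",
    "credit_card_balance.csv",
    "installments_payments.csv",
    "POS_CASH_balance.csv",
    "previous_application.csv",
]


def missing_raw_files(raw_dir: Path) -> List[str]:
    """Return the expected file names that are not present in raw_dir."""
    return [name for name in EXPECTED_FILES if not (raw_dir / name).is_file()]
```

Calling `missing_raw_files(Path("data/raw"))` from the project root returns an empty list when all eight files are present.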
## Notebooks (summary)
- Exploration: [notebooks/01_exploration.ipynb](notebooks/01_exploration.ipynb)
- Feature engineering: [notebooks/02_preparation_features.ipynb](notebooks/02_preparation_features.ipynb)
- LGBM modeling + MLflow: [notebooks/03_LGBM.ipynb](notebooks/03_LGBM.ipynb)
- Regression baseline: [notebooks/04_regression.ipynb](notebooks/04_regression.ipynb)
- Interpretation: [notebooks/05_model_interpretation.ipynb](notebooks/05_model_interpretation.ipynb)
- Monitoring and drift: [notebooks/06_analyse_logs.ipynb](notebooks/06_analyse_logs.ipynb), [notebooks/07_detect_data_drift.ipynb](notebooks/07_detect_data_drift.ipynb)
- Profiling and optimization: [notebooks/09_profiling.ipynb](notebooks/09_profiling.ipynb), [notebooks/10_optimisation.ipynb](notebooks/10_optimisation.ipynb)
## How to test locally
```bash
uv sync
uv run python app.py
```
Docker option:
```bash
docker build -t oc_p6:latest .
docker run --rm -it -p 7860:7860 oc_p6:latest
```
## API usage (local or HF Space)
Minimal JSON example:
```json
{"SK_ID_CURR": 100001, "AMT_INCOME_TOTAL": 202500.0, "AMT_CREDIT": 80000.0, "CODE_GENDER": "M", "DAYS_BIRTH": -12000}
```
Request to the production Space:
```bash
curl -s -X POST "https://huggingface.co/spaces/ASI-Engineer/OC_P8_prod/api/predict" \
-H "Content-Type: application/json" \
-d '{"data":["{\"SK_ID_CURR\":100001,\"AMT_INCOME_TOTAL\":202500.0,\"AMT_CREDIT\":80000.0,\"CODE_GENDER\":\"M\",\"DAYS_BIRTH\":-12000}"]}'
```
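
The request body in the curl command nests one JSON document inside another: the feature record is serialized to a string, and that string is the only element of the `data` list. Escaping this by hand is error-prone, so here is a sketch that builds the same body in Python. `build_gradio_payload` is a hypothetical helper name, and the body layout is inferred from the curl example rather than from the app's code.

```python
import json
from typing import Dict


def build_gradio_payload(features: Dict) -> str:
    """Wrap a feature dict as the JSON body used in the curl example.

    The features are first serialized to a JSON string, and that string
    becomes the single element of the "data" list.
    """
    inner = json.dumps(features, separators=(",", ":"))
    return json.dumps({"data": [inner]})


features = {
    "SK_ID_CURR": 100001,
    "AMT_INCOME_TOTAL": 202500.0,
    "AMT_CREDIT": 80000.0,
    "CODE_GENDER": "M",
    "DAYS_BIRTH": -12000,
}
body = build_gradio_payload(features)
```

The resulting `body` string can be passed to any HTTP client as the POST payload with the `Content-Type: application/json` header.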
## Monitoring and data drift
- Monitoring report: [reports/monitoring_study.md](reports/monitoring_study.md)
- Evidently drift report: [reports/data_drift_report.html](reports/data_drift_report.html)
- Latency and score plots: [reports/plots](reports/plots)
- Simulated production calls: [reference/simulate_production_calls.py](reference/simulate_production_calls.py)
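
To give an intuition for what a drift check measures, the sketch below computes a Population Stability Index (PSI) between a reference sample and a current sample of one numeric feature. This is not Evidently's API and not necessarily the statistic used in the project's report. PSI is swapped in here as a simple, commonly used drift measure, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def _bin_fractions(values: Sequence[float], edges: List[float]) -> List[float]:
    """Fraction of values falling in each bin defined by the inner edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        idx = sum(1 for e in edges if v > e)  # index of the bin containing v
        counts[idx] += 1
    total = len(values)
    eps = 1e-6  # avoids log(0) when a bin is empty
    return [max(c / total, eps) for c in counts]


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], n_bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over the reference range."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread")
    width = (hi - lo) / n_bins
    edges = [lo + width * i for i in range(1, n_bins)]
    ref = _bin_fractions(reference, edges)
    cur = _bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))
```

Identical samples give a PSI of 0, and the value grows as the current distribution moves away from the reference. Any alerting threshold is a convention to be chosen per project.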
## Tests
```bash
uv run pytest
```
**Date**: February 25, 2026
**Status**: Project complete (OK), ready for the defense