---
title: Credit Scoring - Home Credit Default Risk
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "4.44.1"
python_version: "3.12"
app_file: app.py
pinned: false
---
# OC_P6 - Credit Scoring API (MLOps)
## Live demo
- Production: https://huggingface.co/spaces/ASI-Engineer/OC_P8_prod
- Test: https://huggingface.co/spaces/ASI-Engineer/OC_P8_test
## Step 4 optimization results
- Latency gain: **15.7x** (0.64 ms -> 0.04 ms per request)
- Accuracy: 100% identical before and after optimization
- See the full report: [reports/rapport_optimisation.md](reports/rapport_optimisation.md)
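The latency gain above is a ratio of per-request timings. A minimal sketch of how such a figure could be measured, assuming two zero-argument callables that each run one request. The helper names `mean_latency_ms` and `speedup` are hypothetical and not taken from the project.

```python
import time
from typing import Callable


def mean_latency_ms(fn: Callable[[], object], n_calls: int = 1000) -> float:
    """Return the mean wall-clock time of one call to `fn`, in milliseconds."""
    start = time.perf_counter()
    for _ in range(n_calls):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / n_calls * 1000.0


def speedup(baseline_ms: float, optimized_ms: float) -> float:
    """Ratio of baseline latency to optimized latency (higher is better)."""
    if optimized_ms <= 0:
        raise ValueError("optimized_ms must be positive")
    return baseline_ms / optimized_ms


if __name__ == "__main__":
    # Rounded per-request latencies quoted in the results list.
    print(f"{speedup(0.64, 0.04):.1f}x")
```

With the rounded figures quoted above, the ratio comes out at about 16x. The reported 15.7x presumably reflects the unrounded measurements in the full report.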
## Final architecture
- FastAPI/Gradio + Docker (entry point: [app.py](app.py))
- Monitoring: logs + Evidently (drift)
- Optimization: VectorizedPreprocessor (15.7x latency gain)
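To illustrate the log-based monitoring listed above, here is a minimal sketch of appending one JSON line per prediction. This is an assumption about how such a log could be written, not the project's actual code. The function name `predict_and_log` and the record fields (`timestamp`, `input`, `score`, `latency_ms`) are hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Callable


def predict_and_log(
    predict: Callable[[dict], float],
    features: dict,
    log_path: Path,
) -> float:
    """Call `predict`, then append one JSON line with the score and latency."""
    start = time.perf_counter()
    score = predict(features)
    latency_ms = (time.perf_counter() - start) * 1000.0

    # Hypothetical record layout; the real log schema may differ.
    record = {
        "timestamp": time.time(),  # Unix epoch seconds
        "input": features,
        "score": score,
        "latency_ms": latency_ms,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return score
```

One record per line keeps the file append-only and easy to parse later for latency and drift analysis.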
## Completed steps
- Step 2: API + Docker + CI/CD
- Step 3: Storage + production analysis
- Step 4: Performance optimization (completed)
## Project overview (quick audit)
- Raw data and features: [data/raw](data/raw), [data/processed](data/processed)
- Data/model pipeline: [src/load_data.py](src/load_data.py), [src/preprocessing.py](src/preprocessing.py)
- Experiments and artifacts: [mlruns](mlruns), [models](models)
- MLOps notebooks: [notebooks](notebooks)
- Production monitoring: [logs/predictions.jsonl](logs/predictions.jsonl), [reports](reports)
- Tests: [tests](tests)
- Containerization: [Dockerfile](Dockerfile)
## Project structure
```
OC_P6/
├── app.py
├── Dockerfile
├── pyproject.toml
├── requirements.txt
├── requirements-inference.txt
├── data/
│   ├── raw/
│   └── processed/
├── logs/
│   └── predictions.jsonl
├── mlruns/
├── models/
│   ├── export_model.py
│   ├── export_preprocessor.py
│   ├── lightgbm.txt
│   └── preprocessor.joblib
├── notebooks/
│   ├── 01_exploration.ipynb
│   ├── 02_preparation_features.ipynb
│   ├── 03_LGBM.ipynb
│   ├── 04_regression.ipynb
│   ├── 05_model_interpretation.ipynb
│   ├── 06_analyse_logs.ipynb
│   ├── 07_detect_data_drift.ipynb
│   ├── 08_analyze_logs_2.ipynb
│   ├── 09_profiling.ipynb
│   └── 10_optimisation.ipynb
├── reference/
│   ├── reference.csv
│   └── simulate_production_calls.py
├── reports/
│   ├── data_drift_report.html
│   ├── monitoring_study.md
│   └── plots/
├── src/
│   ├── __init__.py
│   ├── load_data.py
│   ├── mlflow_config.py
│   └── preprocessing.py
└── tests/
    ├── conftest.py
    ├── test_predict.py
    └── test_preprocessing.py
```
## Installation (uv recommended)
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
```
## Data
Source: Kaggle Home Credit Default Risk.
Place the files in [data/raw](data/raw):
- application_train.csv
- application_test.csv
- bureau.csv
- bureau_balance.csv
- credit_card_balance.csv
- installments_payments.csv
- POS_CASH_balance.csv
- previous_application.csv
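Before running the notebooks, it can help to confirm that every file listed above is in place. The following is a small sketch, not part of the repository. The function name `missing_raw_files` is hypothetical, and the default directory is assumed to be `data/raw` relative to the project root.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "application_train.csv",
    "application_test.csv",
    "bureau.csv",
    "bureau_balance.csv",
    "credit_card_balance.csv",
    "installments_payments.csv",
    "POS_CASH_balance.csv",
    "previous_application.csv",
]


def missing_raw_files(raw_dir: Path = Path("data/raw")) -> list[str]:
    """Return the expected CSV names that are not present in `raw_dir`."""
    return [name for name in EXPECTED_FILES if not (raw_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_raw_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected files are present.")
```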
## Notebooks (summary)
- Exploration: [notebooks/01_exploration.ipynb](notebooks/01_exploration.ipynb)
- Feature engineering: [notebooks/02_preparation_features.ipynb](notebooks/02_preparation_features.ipynb)
- LGBM modeling + MLflow: [notebooks/03_LGBM.ipynb](notebooks/03_LGBM.ipynb)
- Baseline regression: [notebooks/04_regression.ipynb](notebooks/04_regression.ipynb)
- Interpretation: [notebooks/05_model_interpretation.ipynb](notebooks/05_model_interpretation.ipynb)
- Monitoring and drift: [notebooks/06_analyse_logs.ipynb](notebooks/06_analyse_logs.ipynb), [notebooks/07_detect_data_drift.ipynb](notebooks/07_detect_data_drift.ipynb)
- Profiling and optimization: [notebooks/09_profiling.ipynb](notebooks/09_profiling.ipynb), [notebooks/10_optimisation.ipynb](notebooks/10_optimisation.ipynb)
## How to test locally
```bash
uv sync
uv run python app.py
```
Docker option:
```bash
docker build -t oc_p6:latest .
docker run --rm -it -p 7860:7860 oc_p6:latest
```
## API usage (local or HF Space)
Minimal JSON example:
```json
{"SK_ID_CURR": 100001, "AMT_INCOME_TOTAL": 202500.0, "AMT_CREDIT": 80000.0, "CODE_GENDER": "M", "DAYS_BIRTH": -12000}
```
Request to the production Space:
```bash
curl -s -X POST "https://huggingface.co/spaces/ASI-Engineer/OC_P8_prod/api/predict" \
  -H "Content-Type: application/json" \
  -d '{"data":["{\"SK_ID_CURR\":100001,\"AMT_INCOME_TOTAL\":202500.0,\"AMT_CREDIT\":80000.0,\"CODE_GENDER\":\"M\",\"DAYS_BIRTH\":-12000}"]}'
```
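The curl request above sends the features as a JSON string nested inside a `{"data": [...]}` envelope, which is easy to get wrong when escaping by hand. Below is a sketch of the same request in Python using only the standard library. The helper names `build_payload` and `post_prediction` are hypothetical, the endpoint URL is supplied by the caller, and the shape of the response is not documented here, so it is returned as parsed JSON without further interpretation.

```python
import json
import urllib.request


def build_payload(features: dict) -> bytes:
    """Serialize the features to a JSON string and wrap it in {"data": [...]}."""
    return json.dumps({"data": [json.dumps(features)]}).encode("utf-8")


def post_prediction(url: str, features: dict, timeout: float = 30.0) -> dict:
    """POST the payload to `url` and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(features),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    example = {
        "SK_ID_CURR": 100001,
        "AMT_INCOME_TOTAL": 202500.0,
        "AMT_CREDIT": 80000.0,
        "CODE_GENDER": "M",
        "DAYS_BIRTH": -12000,
    }
    # Print the request body only; no network call is made here.
    print(build_payload(example).decode("utf-8"))
```

Calling `post_prediction(url, example)` with the endpoint from the curl command, or with a local instance on port 7860, would send the same body.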
## Monitoring and data drift
- Monitoring report: [reports/monitoring_study.md](reports/monitoring_study.md)
- Evidently drift report: [reports/data_drift_report.html](reports/data_drift_report.html)
- Latency and score plots: [reports/plots](reports/plots)
- Production call simulation: [reference/simulate_production_calls.py](reference/simulate_production_calls.py)
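As a complement to the reports above, here is a minimal sketch of summarizing latencies from a JSONL prediction log. It assumes each line is a JSON object with a numeric `latency_ms` field. That field name is an assumption and should be adjusted to the actual log schema. The function names are hypothetical.

```python
import json
import statistics
from pathlib import Path


def load_latencies_ms(log_path: Path, field: str = "latency_ms") -> list[float]:
    """Read one JSON object per line and collect the latency field when present."""
    latencies = []
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            if field in record:
                latencies.append(float(record[field]))
    return latencies


def summarize(latencies: list[float]) -> dict:
    """Return count, mean, median and nearest-rank 95th percentile."""
    if not latencies:
        return {"count": 0}
    ordered = sorted(latencies)
    # Nearest-rank percentile: rank = ceil(0.95 * n), computed with integers.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
    }


if __name__ == "__main__":
    path = Path("logs/predictions.jsonl")
    if path.is_file():
        print(summarize(load_latencies_ms(path)))
```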
## Tests
```bash
uv run pytest
```
**Date**: February 25, 2026
**Status**: Project complete, ready for the defense