GitHub Actions commited on
Commit
cdef3b2
·
1 Parent(s): 0496652

🚀 Auto-deploy from GitHub Actions

Browse files
This view is limited to 50 files because it contains too many changes.   See raw diff
Files changed (50) hide show
  1. .gitignore +3 -0
  2. hf_space/.gitignore +5 -0
  3. hf_space/README.md +7 -27
  4. hf_space/docs/docs/greeter.md +3 -0
  5. hf_space/docs/docs/index.md +2 -7
  6. hf_space/docs/mkdocs.yml +19 -3
  7. hf_space/hf_space/hf_space/.github/workflows/deploy.yml +2 -2
  8. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md +34 -18
  9. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/LICENSE +10 -0
  10. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/Makefile +85 -0
  11. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py +178 -4
  12. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/.gitkeep +0 -0
  13. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/README.md +12 -0
  14. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/docs/getting-started.md +6 -0
  15. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/docs/index.md +10 -0
  16. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/mkdocs.yml +4 -0
  17. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/deploy.yml +7 -4
  18. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore +191 -1
  19. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md +328 -0
  20. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app/__init__.py +0 -0
  21. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app/main.py +7 -0
  22. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/deploy.yml +37 -0
  23. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitattributes +35 -0
  24. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore +2 -0
  25. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md +12 -0
  26. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py +7 -0
  27. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/tests/test_app.py +17 -0
  28. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/notebooks/.gitkeep +0 -0
  29. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/notebooks/Manet_stephane_notebook_112025.ipynb +0 -0
  30. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/poetry.lock +0 -0
  31. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/poetry.toml +2 -0
  32. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/__init__.py +1 -0
  33. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/config.py +32 -0
  34. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/dataset.py +29 -0
  35. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/features.py +29 -0
  36. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/modeling/__init__.py +0 -0
  37. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/modeling/predict.py +30 -0
  38. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/modeling/train.py +30 -0
  39. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/plots.py +29 -0
  40. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/pyproject.toml +53 -0
  41. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/references/.gitkeep +0 -0
  42. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/reports/.gitkeep +0 -0
  43. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/reports/figures/.gitkeep +0 -0
  44. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/tests/test_data.py +5 -0
  45. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/notebooks/Manet_stephane_notebook_112025.ipynb +0 -0
  46. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/__init__.py +3 -0
  47. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/branding.py +52 -0
  48. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/dataset.py +188 -14
  49. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/explainability.py +102 -0
  50. hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/features.py +156 -14
.gitignore CHANGED
@@ -4,9 +4,12 @@
4
  # Mac OS-specific storage files
5
  .DS_Store
6
  *.code-workspace
 
 
7
  questions.md
8
  *.pdf
9
 
 
10
  # vim
11
  *.swp
12
  *.swo
 
4
  # Mac OS-specific storage files
5
  .DS_Store
6
  *.code-workspace
7
+ *.pdf
8
+ /output/
9
  questions.md
10
  *.pdf
11
 
12
+
13
  # vim
14
  *.swp
15
  *.swo
hf_space/.gitignore CHANGED
@@ -4,6 +4,11 @@
4
  # Mac OS-specific storage files
5
  .DS_Store
6
  *.code-workspace
 
 
 
 
 
7
 
8
  # vim
9
  *.swp
 
4
  # Mac OS-specific storage files
5
  .DS_Store
6
  *.code-workspace
7
+ *.pdf
8
+ /output/
9
+ questions.md
10
+ *.pdf
11
+
12
 
13
  # vim
14
  *.swp
hf_space/README.md CHANGED
@@ -1,5 +1,3 @@
1
- # projet_05
2
-
3
  ---
4
  title: OCR_Projet05
5
  emoji: 🔥
@@ -12,6 +10,8 @@ pinned: true
12
  short_description: Projet 05 formation Openclassrooms
13
  ---
14
 
 
 
15
  <a target="_blank" href="https://cookiecutter-data-science.drivendata.org/">
16
  <img src="https://img.shields.io/badge/CCDS-Project%20template-328F97?logo=cookiecutter" />
17
  </a>
@@ -76,17 +76,6 @@ Déployez un modèle de Machine Learning
76
 
77
  --------
78
 
79
- ---
80
- title: Projet 05
81
- emoji: 👀
82
- colorFrom: indigo
83
- colorTo: green
84
- sdk: gradio
85
- sdk_version: 5.49.1
86
- app_file: app.py
87
- pinned: false
88
- ---
89
-
90
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
91
 
92
  <!-- Improved compatibility of back to top link: See: https://github.com/othneildrew/Best-README-Template/pull/73 -->
@@ -99,8 +88,6 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
99
  *** Thanks again! Now go create something AMAZING! :D
100
  -->
101
 
102
-
103
-
104
  <!-- PROJECT SHIELDS -->
105
  <!--
106
  *** I'm using markdown "reference style" links for readability.
@@ -118,8 +105,6 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
118
  [![LinkedIn][linkedin-shield]][linkedin-url]
119
  ![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/:user/:repo/:workflow)
120
 
121
-
122
-
123
  <!-- PROJECT LOGO -->
124
  <br />
125
  <div align="center">
@@ -143,8 +128,6 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
143
  </p>
144
  </div>
145
 
146
-
147
-
148
  <!-- TABLE OF CONTENTS -->
149
  <details>
150
  <summary>Table of Contents</summary>
@@ -191,8 +174,6 @@ Here's a blank template to get started. To avoid retyping too much info, do a se
191
 
192
  <p align="right">(<a href="#readme-top">back to top</a>)</p>
193
 
194
-
195
-
196
  <!-- GETTING STARTED -->
197
  ## Getting Started
198
 
@@ -212,20 +193,19 @@ This is an example of how to list things you need to use the software and how to
212
  pip install -r requirements.txt
213
  uvicorn app.main:app --reload
214
 
215
- 1. Get a free API Key at [https://example.com](https://example.com)
216
- 2. Clone the repo
217
  ```sh
218
- git clone https://github.com/github_username/repo_name.git
219
  ```
220
- 3. Install NPM packages
221
  ```sh
222
  npm install
223
  ```
224
- 4. Enter your API in `config.js`
225
  ```js
226
  const API_KEY = 'ENTER YOUR API';
227
  ```
228
- 5. Change git remote url to avoid accidental pushes to base project
229
  ```sh
230
  git remote set-url origin github_username/repo_name
231
  git remote -v # confirm the changes
 
 
 
1
  ---
2
  title: OCR_Projet05
3
  emoji: 🔥
 
10
  short_description: Projet 05 formation Openclassrooms
11
  ---
12
 
13
+ # projet_05
14
+
15
  <a target="_blank" href="https://cookiecutter-data-science.drivendata.org/">
16
  <img src="https://img.shields.io/badge/CCDS-Project%20template-328F97?logo=cookiecutter" />
17
  </a>
 
76
 
77
  --------
78
 
 
 
 
 
 
 
 
 
 
 
 
79
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
80
 
81
  <!-- Improved compatibility of back to top link: See: https://github.com/othneildrew/Best-README-Template/pull/73 -->
 
88
  *** Thanks again! Now go create something AMAZING! :D
89
  -->
90
 
 
 
91
  <!-- PROJECT SHIELDS -->
92
  <!--
93
  *** I'm using markdown "reference style" links for readability.
 
105
  [![LinkedIn][linkedin-shield]][linkedin-url]
106
  ![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/:user/:repo/:workflow)
107
 
 
 
108
  <!-- PROJECT LOGO -->
109
  <br />
110
  <div align="center">
 
128
  </p>
129
  </div>
130
 
 
 
131
  <!-- TABLE OF CONTENTS -->
132
  <details>
133
  <summary>Table of Contents</summary>
 
174
 
175
  <p align="right">(<a href="#readme-top">back to top</a>)</p>
176
 
 
 
177
  <!-- GETTING STARTED -->
178
  ## Getting Started
179
 
 
193
  pip install -r requirements.txt
194
  uvicorn app.main:app --reload
195
 
196
+ 1. Clone the repo
 
197
  ```sh
198
+ git clone https://github.com/stephmnt/OCR_Projet05.git
199
  ```
200
+ 2. Install NPM packages
201
  ```sh
202
  npm install
203
  ```
204
+ 3. Enter your API in `config.js`
205
  ```js
206
  const API_KEY = 'ENTER YOUR API';
207
  ```
208
+ 4. Change git remote url to avoid accidental pushes to base project
209
  ```sh
210
  git remote set-url origin github_username/repo_name
211
  git remote -v # confirm the changes
hf_space/docs/docs/greeter.md ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ # Exemple de classe Greeter
2
+
3
+ ::: references.test.Greeter
hf_space/docs/docs/index.md CHANGED
@@ -1,10 +1,5 @@
1
- # projet_05 documentation!
2
 
3
  ## Description
4
 
5
- Déployez un modèle de Machine Learning
6
-
7
- ## Commands
8
-
9
- The Makefile contains the central entry points for common tasks related to this project.
10
-
 
1
+ # Déployez un modèle de Machine Learning
2
 
3
  ## Description
4
 
5
+ Cette documentation présente la réalisation du projet 05 du master Data scientist Machine Learning
 
 
 
 
 
hf_space/docs/mkdocs.yml CHANGED
@@ -1,4 +1,20 @@
1
- site_name: projet_05
2
- #
3
  site_author: Stéphane Manet
4
- #
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ site_name: Documentation du projet
 
2
  site_author: Stéphane Manet
3
+ theme:
4
+ name: mkdocs
5
+
6
+ plugins:
7
+ - search
8
+ - mkdocstrings:
9
+ handlers:
10
+ python:
11
+ options:
12
+ show_source: true
13
+ docstring_style: google
14
+ merge_init_into_class: true
15
+
16
+ nav:
17
+ - Accueil: index.md
18
+ - Guide de démarrage: getting-started.md
19
+ - Référence API:
20
+ - Greeter: greeter.md
hf_space/hf_space/hf_space/.github/workflows/deploy.yml CHANGED
@@ -33,8 +33,8 @@ jobs:
33
  git config --global user.email "actions@github.com"
34
  git config --global user.name "GitHub Actions"
35
  git clone https://huggingface.co/spaces/stephmnt/projet_05 hf_space
36
- rsync -av --exclude '.git' ./ hf_space/
37
  cd hf_space
38
  git add .
39
  git commit -m "🚀 Auto-deploy from GitHub Actions" || echo "No changes to commit"
40
- git push https://stephmnt:$HF_TOKEN@huggingface.co/spaces/stephmnt/projet_05 main
 
33
  git config --global user.email "actions@github.com"
34
  git config --global user.name "GitHub Actions"
35
  git clone https://huggingface.co/spaces/stephmnt/projet_05 hf_space
36
+ rsync -av --exclude '.git' --exclude 'output/' --exclude 'models/' ./ hf_space/
37
  cd hf_space
38
  git add .
39
  git commit -m "🚀 Auto-deploy from GitHub Actions" || echo "No changes to commit"
40
+ git push https://stephmnt:$HF_TOKEN@huggingface.co/spaces/stephmnt/projet_05 main
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md CHANGED
@@ -1,5 +1,17 @@
1
  # projet_05
2
 
 
 
 
 
 
 
 
 
 
 
 
 
3
  <a target="_blank" href="https://cookiecutter-data-science.drivendata.org/">
4
  <img src="https://img.shields.io/badge/CCDS-Project%20template-328F97?logo=cookiecutter" />
5
  </a>
@@ -57,6 +69,11 @@ Déployez un modèle de Machine Learning
57
  └── plots.py <- Code to create visualizations
58
  ```
59
 
 
 
 
 
 
60
  --------
61
 
62
  ---
@@ -93,6 +110,7 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
93
  *** https://www.markdownguide.org/basic-syntax/#reference-style-links
94
  -->
95
  [![Contributors][contributors-shield]][contributors-url]
 
96
  [![Forks][forks-shield]][forks-url]
97
  [![Stargazers][stars-shield]][stars-url]
98
  [![Issues][issues-shield]][issues-url]
@@ -236,7 +254,7 @@ _For more examples, please refer to the [Documentation](https://example.com)_
236
  - [ ] Feature 3
237
  - [ ] Nested Feature
238
 
239
- See the [open issues](https://github.com/github_username/repo_name/issues) for a full list of proposed features (and known issues).
240
 
241
  <p align="right">(<a href="#readme-top">back to top</a>)</p>
242
 
@@ -299,18 +317,18 @@ Project Link: [https://github.com/github_username/repo_name](https://github.com/
299
 
300
  <!-- MARKDOWN LINKS & IMAGES -->
301
  <!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
302
- [contributors-shield]: https://img.shields.io/github/contributors/github_username/repo_name.svg?style=for-the-badge
303
- [contributors-url]: https://github.com/github_username/repo_name/graphs/contributors
304
- [forks-shield]: https://img.shields.io/github/forks/github_username/repo_name.svg?style=for-the-badge
305
- [forks-url]: https://github.com/github_username/repo_name/network/members
306
- [stars-shield]: https://img.shields.io/github/stars/github_username/repo_name.svg?style=for-the-badge
307
- [stars-url]: https://github.com/github_username/repo_name/stargazers
308
- [issues-shield]: https://img.shields.io/github/issues/github_username/repo_name.svg?style=for-the-badge
309
- [issues-url]: https://github.com/github_username/repo_name/issues
310
- [license-shield]: https://img.shields.io/github/license/github_username/repo_name.svg?style=for-the-badge
311
- [license-url]: https://github.com/github_username/repo_name/blob/master/LICENSE.txt
312
  [linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
313
- [linkedin-url]: https://linkedin.com/in/linkedin_username
314
  [product-screenshot]: images/screenshot.png
315
  [Noobie]: https://img.shields.io/badge/Data%20Science%20for%20Beginners-84CC16?style=for-the-badge&labelColor=E5E7EB&color=84CC16
316
  <!-- Shields.io badges. You can a comprehensive list with many more badges at: https://github.com/inttter/md-badges -->
@@ -331,10 +349,8 @@ Project Link: [https://github.com/github_username/repo_name](https://github.com/
331
  [JQuery.com]: https://img.shields.io/badge/jQuery-0769AD?style=for-the-badge&logo=jquery&logoColor=white
332
  [JQuery-url]: https://jquery.com
333
  <!-- TODO: -->
334
- [![Postgres](https://img.shields.io/badge/Postgres-%23316192.svg?logo=postgresql&logoColor=white)](#)
335
- [![Python](https://img.shields.io/badge/Python-3776AB?logo=python&logoColor=fff)](#)
336
- [![Sphinx](https://img.shields.io/badge/Sphinx-000?logo=sphinx&logoColor=fff)](#)
337
- [![MkDocs](https://img.shields.io/badge/MkDocs-526CFE?logo=materialformkdocs&logoColor=fff)](#)
338
- [![NumPy](https://img.shields.io/badge/NumPy-4DABCF?logo=numpy&logoColor=fff)](#)
339
  [![Pandas](https://img.shields.io/badge/Pandas-150458?logo=pandas&logoColor=fff)](#)
340
- [![Slack](https://img.shields.io/badge/Slack-4A154B?logo=slack&logoColor=fff)](#)[text](../projet_04/.gitignore)
 
1
  # projet_05
2
 
3
+ ---
4
+ title: OCR_Projet05
5
+ emoji: 🔥
6
+ colorFrom: purple
7
+ colorTo: purple
8
+ sdk: gradio
9
+ sdk_version: 5.49.1
10
+ app_file: app.py
11
+ pinned: true
12
+ short_description: Projet 05 formation Openclassrooms
13
+ ---
14
+
15
  <a target="_blank" href="https://cookiecutter-data-science.drivendata.org/">
16
  <img src="https://img.shields.io/badge/CCDS-Project%20template-328F97?logo=cookiecutter" />
17
  </a>
 
69
  └── plots.py <- Code to create visualizations
70
  ```
71
 
72
+ ## Code hérité réutilisé
73
+
74
+ - `scripts_projet04/brand` : charte graphique OpenClassrooms (classe `Theme`, palettes, YAML). Le module `projet_05/branding.py` en est la porte d'entrée et applique automatiquement le thème.
75
+ - `scripts_projet04/manet_projet04/shap_generator.py` : fonctions `shap_global` / `shap_local` utilisées par `projet_05/modeling/train.py` pour reproduire les visualisations SHAP.
76
+
77
  --------
78
 
79
  ---
 
110
  *** https://www.markdownguide.org/basic-syntax/#reference-style-links
111
  -->
112
  [![Contributors][contributors-shield]][contributors-url]
113
+ [![Python][python]][python]
114
  [![Forks][forks-shield]][forks-url]
115
  [![Stargazers][stars-shield]][stars-url]
116
  [![Issues][issues-shield]][issues-url]
 
254
  - [ ] Feature 3
255
  - [ ] Nested Feature
256
 
257
+ See the [open issues](https://github.com/stephmnt/OCR_projet05/issues) for a full list of proposed features (and known issues).
258
 
259
  <p align="right">(<a href="#readme-top">back to top</a>)</p>
260
 
 
317
 
318
  <!-- MARKDOWN LINKS & IMAGES -->
319
  <!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
320
+ [contributors-shield]: https://img.shields.io/github/contributors/stephmnt/OCR_projet05.svg?style=for-the-badge
321
+ [contributors-url]: https://github.com/stephmnt/OCR_projet05/graphs/contributors
322
+ [forks-shield]: https://img.shields.io/github/forks/stephmnt/OCR_projet05.svg?style=for-the-badge
323
+ [forks-url]: https://github.com/stephmnt/OCR_projet05/network/members
324
+ [stars-shield]: https://img.shields.io/github/stars/stephmnt/OCR_projet05.svg?style=for-the-badge
325
+ [stars-url]: https://github.com/stephmnt/OCR_projet05/stargazers
326
+ [issues-shield]: https://img.shields.io/github/issues/stephmnt/OCR_projet05.svg?style=for-the-badge
327
+ [issues-url]: https://github.com/stephmnt/OCR_projet05/issues
328
+ [license-shield]: https://img.shields.io/github/license/stephmnt/OCR_projet05.svg?style=for-the-badge
329
+ [license-url]: https://github.com/stephmnt/OCR_projet05/blob/master/LICENSE.txt
330
  [linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
331
+ [linkedin-url]: https://linkedin.com/in/stephanemanet
332
  [product-screenshot]: images/screenshot.png
333
  [Noobie]: https://img.shields.io/badge/Data%20Science%20for%20Beginners-84CC16?style=for-the-badge&labelColor=E5E7EB&color=84CC16
334
  <!-- Shields.io badges. You can a comprehensive list with many more badges at: https://github.com/inttter/md-badges -->
 
349
  [JQuery.com]: https://img.shields.io/badge/jQuery-0769AD?style=for-the-badge&logo=jquery&logoColor=white
350
  [JQuery-url]: https://jquery.com
351
  <!-- TODO: -->
352
+ [Postgres]: https://img.shields.io/badge/Postgres-%23316192.svg?logo=postgresql&logoColor=white
353
+ [Python]: https://img.shields.io/badge/Python-3776AB?logo=python&logoColor=fff)
354
+ [MkDocs]: https://img.shields.io/badge/MkDocs-526CFE?logo=materialformkdocs&logoColor=fff
355
+ [NumPy]: https://img.shields.io/badge/NumPy-4DABCF?logo=numpy&logoColor=fff
 
356
  [![Pandas](https://img.shields.io/badge/Pandas-150458?logo=pandas&logoColor=fff)](#)
 
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/LICENSE ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+ The MIT License (MIT)
3
+ Copyright (c) 2025, Stéphane Manet
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6
+
7
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
8
+
9
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
10
+
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/Makefile ADDED
@@ -0,0 +1,85 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ #################################################################################
2
+ # GLOBALS #
3
+ #################################################################################
4
+
5
+ PROJECT_NAME = OCR_projet05
6
+ PYTHON_VERSION = 3.10
7
+ PYTHON_INTERPRETER = python
8
+
9
+ #################################################################################
10
+ # COMMANDS #
11
+ #################################################################################
12
+
13
+
14
+ ## Install Python dependencies
15
+ .PHONY: requirements
16
+ requirements:
17
+ pip install -e .
18
+
19
+
20
+
21
+
22
+ ## Delete all compiled Python files
23
+ .PHONY: clean
24
+ clean:
25
+ find . -type f -name "*.py[co]" -delete
26
+ find . -type d -name "__pycache__" -delete
27
+
28
+
29
+ ## Lint using ruff (use `make format` to do formatting)
30
+ .PHONY: lint
31
+ lint:
32
+ ruff format --check
33
+ ruff check
34
+
35
+ ## Format source code with ruff
36
+ .PHONY: format
37
+ format:
38
+ ruff check --fix
39
+ ruff format
40
+
41
+
42
+
43
+ ## Run tests
44
+ .PHONY: test
45
+ test:
46
+ python -m pytest tests
47
+
48
+
49
+ ## Set up Python interpreter environment
50
+ .PHONY: create_environment
51
+ create_environment:
52
+ @bash -c "if [ ! -z `which virtualenvwrapper.sh` ]; then source `which virtualenvwrapper.sh`; mkvirtualenv $(PROJECT_NAME) --python=$(PYTHON_INTERPRETER); else mkvirtualenv.bat $(PROJECT_NAME) --python=$(PYTHON_INTERPRETER); fi"
53
+ @echo ">>> New virtualenv created. Activate with:\nworkon $(PROJECT_NAME)"
54
+
55
+
56
+
57
+
58
+ #################################################################################
59
+ # PROJECT RULES #
60
+ #################################################################################
61
+
62
+
63
+ ## Make dataset
64
+ .PHONY: data
65
+ data: requirements
66
+ $(PYTHON_INTERPRETER) projet_05/dataset.py
67
+
68
+
69
+ #################################################################################
70
+ # Self Documenting Commands #
71
+ #################################################################################
72
+
73
+ .DEFAULT_GOAL := help
74
+
75
+ define PRINT_HELP_PYSCRIPT
76
+ import re, sys; \
77
+ lines = '\n'.join([line for line in sys.stdin]); \
78
+ matches = re.findall(r'\n## (.*)\n[\s\S]+?\n([a-zA-Z_-]+):', lines); \
79
+ print('Available rules:\n'); \
80
+ print('\n'.join(['{:25}{}'.format(*reversed(match)) for match in matches]))
81
+ endef
82
+ export PRINT_HELP_PYSCRIPT
83
+
84
+ help:
85
+ @$(PYTHON_INTERPRETER) -c "${PRINT_HELP_PYSCRIPT}" < $(MAKEFILE_LIST)
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py CHANGED
@@ -1,7 +1,181 @@
 
 
 
 
 
 
1
  import gradio as gr
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2
 
3
- def greet(name):
4
- return "Hello " + name + "!!"
5
 
6
- demo = gr.Interface(fn=greet, inputs="text", outputs="text")
7
- demo.launch()
 
1
+ from __future__ import annotations
2
+
3
+ import json
4
+ from pathlib import Path
5
+ from typing import Any
6
+
7
  import gradio as gr
8
+ import pandas as pd
9
+ from loguru import logger
10
+
11
+ from projet_05.branding import apply_brand_theme
12
+ from projet_05.modeling.predict import load_metadata, load_pipeline, run_inference
13
+
14
+ MODEL_PATH = Path("models/best_model.joblib")
15
+ METADATA_PATH = Path("models/best_model_meta.json")
16
+ SCHEMA_PATH = Path("data/processed/schema.json")
17
+
18
+
19
+ def _load_schema(path: Path) -> dict[str, Any]:
20
+ if not path.exists():
21
+ return {}
22
+ return json.loads(path.read_text(encoding="utf-8"))
23
+
24
+
25
+ def _infer_features(metadata: dict, schema: dict, pipeline) -> list[str]:
26
+ if schema:
27
+ candidates = schema.get("numerical_features", []) + schema.get("categorical_features", [])
28
+ if candidates:
29
+ return candidates
30
+ features = metadata.get("features", {})
31
+ explicit = (features.get("numerical") or []) + (features.get("categorical") or [])
32
+ if explicit:
33
+ return explicit
34
+ if pipeline is not None and hasattr(pipeline, "feature_names_in_"):
35
+ return list(pipeline.feature_names_in_)
36
+ return []
37
+
38
+
39
+ def _convert_input(payload: Any, headers: list[str]) -> pd.DataFrame:
40
+ if isinstance(payload, pd.DataFrame):
41
+ df = payload.copy()
42
+ elif payload is None:
43
+ df = pd.DataFrame(columns=headers)
44
+ else:
45
+ df = pd.DataFrame(payload, columns=headers if headers else None)
46
+ df = df.dropna(how="all")
47
+ if df.empty:
48
+ raise gr.Error("Merci de saisir au moins une ligne complète.")
49
+ return df
50
+
51
+
52
+ def _ensure_model():
53
+ if PIPELINE is None:
54
+ raise gr.Error(
55
+ "Aucun modèle entrainé n'a été trouvé. Lancez `python projet_05/modeling/train.py` puis relancez l'application."
56
+ )
57
+
58
+
59
+ def score_table(table):
60
+ _ensure_model()
61
+ df = _convert_input(table, FEATURE_ORDER)
62
+ drop_cols = [TARGET_COLUMN] if TARGET_COLUMN else None
63
+ return run_inference(
64
+ df,
65
+ PIPELINE,
66
+ THRESHOLD,
67
+ drop_columns=drop_cols,
68
+ required_features=FEATURE_ORDER or None,
69
+ )
70
+
71
+
72
+ def score_csv(upload):
73
+ _ensure_model()
74
+ if upload is None:
75
+ raise gr.Error("Veuillez déposer un fichier CSV.")
76
+ df = pd.read_csv(upload.name)
77
+ drop_cols = [TARGET_COLUMN] if TARGET_COLUMN else None
78
+ return run_inference(
79
+ df,
80
+ PIPELINE,
81
+ THRESHOLD,
82
+ drop_columns=drop_cols,
83
+ required_features=FEATURE_ORDER or None,
84
+ )
85
+
86
+
87
+ def predict_from_form(*values):
88
+ _ensure_model()
89
+ if not FEATURE_ORDER:
90
+ raise gr.Error("Impossible de générer le formulaire sans configuration des features.")
91
+ payload = {feature: value for feature, value in zip(FEATURE_ORDER, values)}
92
+ df = pd.DataFrame([payload])
93
+ scored = run_inference(
94
+ df,
95
+ PIPELINE,
96
+ THRESHOLD,
97
+ required_features=FEATURE_ORDER or None,
98
+ )
99
+ row = scored.iloc[0]
100
+ label = "Risque de départ" if int(row["prediction"]) == 1 else "Reste probable"
101
+ return {
102
+ "probability": round(float(row["proba_depart"]), 4),
103
+ "decision": label,
104
+ "threshold": THRESHOLD,
105
+ }
106
+
107
+
108
+ # Chargement des artéfacts
109
+ apply_brand_theme()
110
+
111
+ PIPELINE = None
112
+ METADATA: dict[str, Any] = {}
113
+ THRESHOLD = 0.5
114
+ TARGET_COLUMN: str | None = None
115
+ SCHEMA = _load_schema(SCHEMA_PATH)
116
+
117
+ try:
118
+ PIPELINE = load_pipeline(MODEL_PATH)
119
+ METADATA = load_metadata(METADATA_PATH)
120
+ THRESHOLD = float(METADATA.get("best_threshold", THRESHOLD))
121
+ TARGET_COLUMN = METADATA.get("target")
122
+ except FileNotFoundError as exc:
123
+ logger.warning("Artéfact manquant: {}", exc)
124
+
125
+ FEATURE_ORDER = _infer_features(METADATA, SCHEMA, PIPELINE)
126
+
127
+ with gr.Blocks(title="Prédicteur d'attrition") as demo:
128
+ gr.Markdown("# API Gradio – Prédiction de départ employé")
129
+ gr.Markdown(
130
+ "Le modèle applique le pipeline entraîné hors-notebook pour fournir une probabilité de départ ainsi qu'une décision binaire."
131
+ )
132
+
133
+ if PIPELINE is None:
134
+ gr.Markdown(
135
+ "⚠️ **Aucun modèle disponible.** Lancez les scripts `dataset.py`, `features.py` puis `modeling/train.py`."
136
+ )
137
+ else:
138
+ gr.Markdown(f"Seuil de décision actuel : **{THRESHOLD:.2f}**")
139
+
140
+ with gr.Tab("Formulaire unitaire"):
141
+ if not FEATURE_ORDER:
142
+ gr.Markdown("Aucune configuration de features détectée. Utilisez l'onglet CSV pour scorer vos données.")
143
+ else:
144
+ form_inputs: list[gr.components.Component] = [] # type: ignore
145
+ for feature in FEATURE_ORDER:
146
+ form_inputs.append(
147
+ gr.Textbox(label=feature, placeholder=f"Saisir {feature.replace('_', ' ')}")
148
+ )
149
+ form_output = gr.JSON(label="Résultat")
150
+ gr.Button("Prédire").click(
151
+ fn=predict_from_form,
152
+ inputs=form_inputs,
153
+ outputs=form_output,
154
+ )
155
+
156
+ with gr.Tab("Tableau interactif"):
157
+ table_input = gr.Dataframe(
158
+ headers=FEATURE_ORDER if FEATURE_ORDER else None,
159
+ row_count=(1, "dynamic"),
160
+ col_count=(len(FEATURE_ORDER), "dynamic") if FEATURE_ORDER else (5, "dynamic"),
161
+ type="pandas",
162
+ )
163
+ table_output = gr.Dataframe(label="Prédictions", type="pandas")
164
+ gr.Button("Scorer les lignes").click(
165
+ fn=score_table,
166
+ inputs=table_input,
167
+ outputs=table_output,
168
+ )
169
+
170
+ with gr.Tab("Fichier CSV"):
171
+ file_input = gr.File(file_types=[".csv"], label="Déposez votre fichier CSV")
172
+ file_output = gr.Dataframe(label="Résultats CSV", type="pandas")
173
+ gr.Button("Scorer le fichier").click(
174
+ fn=score_csv,
175
+ inputs=file_input,
176
+ outputs=file_output,
177
+ )
178
 
 
 
179
 
180
+ if __name__ == "__main__":
181
+ demo.launch()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/.gitkeep ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/README.md ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ Generating the docs
2
+ ----------
3
+
4
+ Use [mkdocs](http://www.mkdocs.org/) structure to update the documentation.
5
+
6
+ Build locally with:
7
+
8
+ mkdocs build
9
+
10
+ Serve locally with:
11
+
12
+ mkdocs serve
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/docs/getting-started.md ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ Getting started
2
+ ===============
3
+
4
+ This is where you describe how to get set up on a clean install, including the
5
+ commands necessary to get the raw data (using the `sync_data_from_s3` command,
6
+ for example), and then how to make the cleaned, final data sets.
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/docs/index.md ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ # projet_05 documentation!
2
+
3
+ ## Description
4
+
5
+ Déployez un modèle de Machine Learning
6
+
7
+ ## Commands
8
+
9
+ The Makefile contains the central entry points for common tasks related to this project.
10
+
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/docs/mkdocs.yml ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ site_name: projet_05
2
+ #
3
+ site_author: Stéphane Manet
4
+ #
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/deploy.yml CHANGED
@@ -1,10 +1,13 @@
1
- name: Déployer vers Hugging Face Spaces
2
 
3
  on:
4
  push:
5
  branches:
6
  - main
7
 
 
 
 
8
  jobs:
9
  deploy:
10
  runs-on: ubuntu-latest
@@ -23,7 +26,7 @@ jobs:
23
  python -m pip install --upgrade pip
24
  if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
25
 
26
- - name: Push to Hugging Face Space
27
  env:
28
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
29
  run: |
@@ -33,5 +36,5 @@ jobs:
33
  rsync -av --exclude '.git' ./ hf_space/
34
  cd hf_space
35
  git add .
36
- git commit -m "🚀 Auto-deploy from GitHub Actions"
37
- git push https://stephmnt:$HF_TOKEN@huggingface.co/spaces/stephmnt/projet_05 main
 
1
+ name: Deploy to Hugging Face Spaces
2
 
3
  on:
4
  push:
5
  branches:
6
  - main
7
 
8
+ permissions:
9
+ contents: write
10
+
11
  jobs:
12
  deploy:
13
  runs-on: ubuntu-latest
 
26
  python -m pip install --upgrade pip
27
  if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
28
 
29
+ - name: Deploy to Hugging Face Space
30
  env:
31
  HF_TOKEN: ${{ secrets.HF_TOKEN }}
32
  run: |
 
36
  rsync -av --exclude '.git' ./ hf_space/
37
  cd hf_space
38
  git add .
39
+ git commit -m "🚀 Auto-deploy from GitHub Actions" || echo "No changes to commit"
40
+ git push https://stephmnt:$HF_TOKEN@huggingface.co/spaces/stephmnt/projet_05 main
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore CHANGED
@@ -1,2 +1,192 @@
 
 
 
 
 
1
  *.code-workspace
2
- .venv/
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Data
2
+ /data/
3
+
4
+ # Mac OS-specific storage files
5
+ .DS_Store
6
  *.code-workspace
7
+
8
+ # vim
9
+ *.swp
10
+ *.swo
11
+
12
+ ## https://github.com/github/gitignore/blob/e8554d85bf62e38d6db966a50d2064ac025fd82a/Python.gitignore
13
+
14
+ # Byte-compiled / optimized / DLL files
15
+ __pycache__/
16
+ *.py[cod]
17
+ *$py.class
18
+
19
+ # C extensions
20
+ *.so
21
+
22
+ # Distribution / packaging
23
+ .Python
24
+ build/
25
+ develop-eggs/
26
+ dist/
27
+ downloads/
28
+ eggs/
29
+ .eggs/
30
+ lib/
31
+ lib64/
32
+ parts/
33
+ sdist/
34
+ var/
35
+ wheels/
36
+ share/python-wheels/
37
+ *.egg-info/
38
+ .installed.cfg
39
+ *.egg
40
+ MANIFEST
41
+
42
+ # PyInstaller
43
+ # Usually these files are written by a python script from a template
44
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
45
+ *.manifest
46
+ *.spec
47
+
48
+ # Installer logs
49
+ pip-log.txt
50
+ pip-delete-this-directory.txt
51
+
52
+ # Unit test / coverage reports
53
+ htmlcov/
54
+ .tox/
55
+ .nox/
56
+ .coverage
57
+ .coverage.*
58
+ .cache
59
+ nosetests.xml
60
+ coverage.xml
61
+ *.cover
62
+ *.py,cover
63
+ .hypothesis/
64
+ .pytest_cache/
65
+ cover/
66
+
67
+ # Translations
68
+ *.mo
69
+ *.pot
70
+
71
+ # Django stuff:
72
+ *.log
73
+ local_settings.py
74
+ db.sqlite3
75
+ db.sqlite3-journal
76
+
77
+ # Flask stuff:
78
+ instance/
79
+ .webassets-cache
80
+
81
+ # Scrapy stuff:
82
+ .scrapy
83
+
84
+ # MkDocs documentation
85
+ docs/site/
86
+
87
+ # PyBuilder
88
+ .pybuilder/
89
+ target/
90
+
91
+ # Jupyter Notebook
92
+ .ipynb_checkpoints
93
+
94
+ # IPython
95
+ profile_default/
96
+ ipython_config.py
97
+
98
+ # pyenv
99
+ # For a library or package, you might want to ignore these files since the code is
100
+ # intended to run in multiple environments; otherwise, check them in:
101
+ # .python-version
102
+
103
+ # pipenv
104
+ # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
105
+ # However, in case of collaboration, if having platform-specific dependencies or dependencies
106
+ # having no cross-platform support, pipenv may install dependencies that don't work, or not
107
+ # install all needed dependencies.
108
+ #Pipfile.lock
109
+
110
+ # UV
111
+ # Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
112
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
113
+ # commonly ignored for libraries.
114
+ #uv.lock
115
+
116
+ # poetry
117
+ # Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
118
+ # This is especially recommended for binary packages to ensure reproducibility, and is more
119
+ # commonly ignored for libraries.
120
+ # https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
121
+ #poetry.lock
122
+
123
+ # pdm
124
+ # Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
125
+ #pdm.lock
126
+ # pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
127
+ # in version control.
128
+ # https://pdm.fming.dev/latest/usage/project/#working-with-version-control
129
+ .pdm.toml
130
+ .pdm-python
131
+ .pdm-build/
132
+
133
+ # pixi
134
+ # pixi.lock should be committed to version control for reproducibility
135
+ # .pixi/ contains the environments and should not be committed
136
+ .pixi/
137
+
138
+ # PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
139
+ __pypackages__/
140
+
141
+ # Celery stuff
142
+ celerybeat-schedule
143
+ celerybeat.pid
144
+
145
+ # SageMath parsed files
146
+ *.sage.py
147
+
148
+ # Environments
149
+ .env
150
+ .venv
151
+ env/
152
+ venv/
153
+ ENV/
154
+ env.bak/
155
+ venv.bak/
156
+
157
+ # Spyder project settings
158
+ .spyderproject
159
+ .spyproject
160
+
161
+ # Rope project settings
162
+ .ropeproject
163
+
164
+ # mkdocs documentation
165
+ /site
166
+
167
+ # mypy
168
+ .mypy_cache/
169
+ .dmypy.json
170
+ dmypy.json
171
+
172
+ # Pyre type checker
173
+ .pyre/
174
+
175
+ # pytype static type analyzer
176
+ .pytype/
177
+
178
+ # Cython debug symbols
179
+ cython_debug/
180
+
181
+ # PyCharm
182
+ # JetBrains specific template is maintained in a separate JetBrains.gitignore that can
183
+ # be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
184
+ # and can be added to the global gitignore or merged into this file. For a more nuclear
185
+ # option (not recommended) you can uncomment the following to ignore the entire idea folder.
186
+ #.idea/
187
+
188
+ # Ruff stuff:
189
+ .ruff_cache/
190
+
191
+ # PyPI configuration file
192
+ .pypirc
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md CHANGED
@@ -1,3 +1,64 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  ---
2
  title: Projet 05
3
  emoji: 👀
@@ -10,3 +71,270 @@ pinned: false
10
  ---
11
 
12
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # projet_05
2
+
3
+ <a target="_blank" href="https://cookiecutter-data-science.drivendata.org/">
4
+ <img src="https://img.shields.io/badge/CCDS-Project%20template-328F97?logo=cookiecutter" />
5
+ </a>
6
+
7
+ Déployez un modèle de Machine Learning
8
+
9
+ ## Organisation du projet
10
+
11
+ ```
12
+ ├── LICENSE <- Open-source license if one is chosen
13
+ ├── Makefile <- Makefile with convenience commands like `make data` or `make train`
14
+ ├── README.md <- The top-level README for developers using this project.
15
+ ├── data
16
+ │ ├── external <- Data from third party sources.
17
+ │ ├── interim <- Intermediate data that has been transformed.
18
+ │ ├── processed <- The final, canonical data sets for modeling.
19
+ │ └── raw <- The original, immutable data dump.
20
+
21
+ ├── docs <- A default mkdocs project; see www.mkdocs.org for details
22
+
23
+ ├── models <- Trained and serialized models, model predictions, or model summaries
24
+
25
+ ├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering),
26
+ │ the creator's initials, and a short `-` delimited description, e.g.
27
+ │ `1.0-jqp-initial-data-exploration`.
28
+
29
+ ├── pyproject.toml <- Project configuration file with package metadata for
30
+ │ projet_05 and configuration for tools like black
31
+
32
+ ├── references <- Data dictionaries, manuals, and all other explanatory materials.
33
+
34
+ ├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
35
+ │ └── figures <- Generated graphics and figures to be used in reporting
36
+
37
+ ├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g.
38
+ │ generated with `pip freeze > requirements.txt`
39
+
40
+ ├── setup.cfg <- Configuration file for flake8
41
+
42
+ └── projet_05 <- Source code for use in this project.
43
+
44
+ ├── __init__.py <- Makes projet_05 a Python module
45
+
46
+ ├── config.py <- Store useful variables and configuration
47
+
48
+ ├── dataset.py <- Scripts to download or generate data
49
+
50
+ ├── features.py <- Code to create features for modeling
51
+
52
+ ├── modeling
53
+ │ ├── __init__.py
54
+ │ ├── predict.py <- Code to run model inference with trained models
55
+ │ └── train.py <- Code to train models
56
+
57
+ └── plots.py <- Code to create visualizations
58
+ ```
59
+
60
+ --------
61
+
62
  ---
63
  title: Projet 05
64
  emoji: 👀
 
71
  ---
72
 
73
  Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
74
+
75
+ <!-- Improved compatibility of back to top link: See: https://github.com/othneildrew/Best-README-Template/pull/73 -->
76
+ <a id="readme-top"></a>
77
+ <!--
78
+ *** Thanks for checking out the Best-README-Template. If you have a suggestion
79
+ *** that would make this better, please fork the repo and create a pull request
80
+ *** or simply open an issue with the tag "enhancement".
81
+ *** Don't forget to give the project a star!
82
+ *** Thanks again! Now go create something AMAZING! :D
83
+ -->
84
+
85
+
86
+
87
+ <!-- PROJECT SHIELDS -->
88
+ <!--
89
+ *** I'm using markdown "reference style" links for readability.
90
+ *** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
91
+ *** See the bottom of this document for the declaration of the reference variables
92
+ *** for contributors-url, forks-url, etc. This is an optional, concise syntax you may use.
93
+ *** https://www.markdownguide.org/basic-syntax/#reference-style-links
94
+ -->
95
+ [![Contributors][contributors-shield]][contributors-url]
96
+ [![Forks][forks-shield]][forks-url]
97
+ [![Stargazers][stars-shield]][stars-url]
98
+ [![Issues][issues-shield]][issues-url]
99
+ [![project_license][license-shield]][license-url]
100
+ [![LinkedIn][linkedin-shield]][linkedin-url]
101
+ ![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/stephmnt/projet_05/deploy.yml)
102
+
103
+
104
+
105
+ <!-- PROJECT LOGO -->
106
+ <br />
107
+ <div align="center">
108
+ <a href="https://github.com/github_username/repo_name">
109
+ <img src="images/logo.png" alt="Logo" width="80" height="80">
110
+ </a>
111
+
112
+ <h3 align="center">project_title</h3>
113
+
114
+ <p align="center">
115
+ project_description
116
+ <br />
117
+ <a href="https://github.com/github_username/repo_name"><strong>Explore the docs »</strong></a>
118
+ <br />
119
+ <br />
120
+ <a href="https://github.com/github_username/repo_name">View Demo</a>
121
+ &middot;
122
+ <a href="https://github.com/github_username/repo_name/issues/new?labels=bug&template=bug-report---.md">Report Bug</a>
123
+ &middot;
124
+ <a href="https://github.com/github_username/repo_name/issues/new?labels=enhancement&template=feature-request---.md">Request Feature</a>
125
+ </p>
126
+ </div>
127
+
128
+
129
+
130
+ <!-- TABLE OF CONTENTS -->
131
+ <details>
132
+ <summary>Table of Contents</summary>
133
+ <ol>
134
+ <li>
135
+ <a href="#about-the-project">About The Project</a>
136
+ <ul>
137
+ <li><a href="#built-with">Built With</a></li>
138
+ </ul>
139
+ </li>
140
+ <li>
141
+ <a href="#getting-started">Getting Started</a>
142
+ <ul>
143
+ <li><a href="#prerequisites">Prerequisites</a></li>
144
+ <li><a href="#installation">Installation</a></li>
145
+ </ul>
146
+ </li>
147
+ <li><a href="#usage">Usage</a></li>
148
+ <li><a href="#roadmap">Roadmap</a></li>
149
+ <li><a href="#contributing">Contributing</a></li>
150
+ <li><a href="#license">License</a></li>
151
+ <li><a href="#contact">Contact</a></li>
152
+ <li><a href="#acknowledgments">Acknowledgments</a></li>
153
+ </ol>
154
+ </details>
155
+
156
+
157
+
158
+ <!-- ABOUT THE PROJECT -->
159
+ ## About The Project
160
+
161
+ [![Product Name Screen Shot][product-screenshot]](https://example.com)
162
+
163
+ Here's a blank template to get started. To avoid retyping too much info, do a search and replace with your text editor for the following: `github_username`, `repo_name`, `twitter_handle`, `linkedin_username`, `email_client`, `email`, `project_title`, `project_description`, `project_license`
164
+
165
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
166
+
167
+
168
+
169
+ ### Built With
170
+
171
+ * [![Python][Python]][Python-url]
172
+ * [![SQL][SQL]][SQL-url]
173
+
174
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
175
+
176
+
177
+
178
+ <!-- GETTING STARTED -->
179
+ ## Getting Started
180
+
181
+ This is an example of how you may give instructions on setting up your project locally.
182
+ To get a local copy up and running follow these simple example steps.
183
+
184
+ ### Prerequisites
185
+
186
+ This is an example of how to list things you need to use the software and how to install them.
187
+ * npm
188
+ ```sh
189
+ npm install npm@latest -g
190
+ ```
191
+
192
+ ### Installation
193
+
194
+ pip install -r requirements.txt
195
+ uvicorn app.main:app --reload
196
+
197
+ 1. Get a free API Key at [https://example.com](https://example.com)
198
+ 2. Clone the repo
199
+ ```sh
200
+ git clone https://github.com/github_username/repo_name.git
201
+ ```
202
+ 3. Install NPM packages
203
+ ```sh
204
+ npm install
205
+ ```
206
+ 4. Enter your API in `config.js`
207
+ ```js
208
+ const API_KEY = 'ENTER YOUR API';
209
+ ```
210
+ 5. Change git remote url to avoid accidental pushes to base project
211
+ ```sh
212
+ git remote set-url origin github_username/repo_name
213
+ git remote -v # confirm the changes
214
+ ```
215
+
216
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
217
+
218
+
219
+
220
+ <!-- USAGE EXAMPLES -->
221
+ ## Usage
222
+
223
+ Use this space to show useful examples of how a project can be used. Additional screenshots, code examples and demos work well in this space. You may also link to more resources.
224
+
225
+ _For more examples, please refer to the [Documentation](https://example.com)_
226
+
227
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
228
+
229
+
230
+
231
+ <!-- ROADMAP -->
232
+ ## Roadmap
233
+
234
+ - [ ] Feature 1
235
+ - [ ] Feature 2
236
+ - [ ] Feature 3
237
+ - [ ] Nested Feature
238
+
239
+ See the [open issues](https://github.com/github_username/repo_name/issues) for a full list of proposed features (and known issues).
240
+
241
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
242
+
243
+
244
+
245
+ <!-- CONTRIBUTING -->
246
+ ## Contributing
247
+
248
+ Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
249
+
250
+ If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement".
251
+ Don't forget to give the project a star! Thanks again!
252
+
253
+ 1. Fork the Project
254
+ 2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
255
+ 3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
256
+ 4. Push to the Branch (`git push origin feature/AmazingFeature`)
257
+ 5. Open a Pull Request
258
+
259
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
260
+
261
+ ### Top contributors:
262
+
263
+ <a href="https://github.com/github_username/repo_name/graphs/contributors">
264
+ <img src="https://contrib.rocks/image?repo=github_username/repo_name" alt="contrib.rocks image" />
265
+ </a>
266
+
267
+
268
+
269
+ <!-- LICENSE -->
270
+ ## License
271
+
272
+ Distributed under the project_license. See `LICENSE.txt` for more information.
273
+
274
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
275
+
276
+
277
+
278
+ <!-- CONTACT -->
279
+ ## Contact
280
+
281
+ Your Name - [@twitter_handle](https://twitter.com/twitter_handle) - email@email_client.com
282
+
283
+ Project Link: [https://github.com/github_username/repo_name](https://github.com/github_username/repo_name)
284
+
285
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
286
+
287
+
288
+
289
+ <!-- ACKNOWLEDGMENTS -->
290
+ ## Acknowledgments
291
+
292
+ * []()
293
+ * []()
294
+ * []()
295
+
296
+ <p align="right">(<a href="#readme-top">back to top</a>)</p>
297
+
298
+
299
+
300
+ <!-- MARKDOWN LINKS & IMAGES -->
301
+ <!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
302
+ [contributors-shield]: https://img.shields.io/github/contributors/github_username/repo_name.svg?style=for-the-badge
303
+ [contributors-url]: https://github.com/github_username/repo_name/graphs/contributors
304
+ [forks-shield]: https://img.shields.io/github/forks/github_username/repo_name.svg?style=for-the-badge
305
+ [forks-url]: https://github.com/github_username/repo_name/network/members
306
+ [stars-shield]: https://img.shields.io/github/stars/github_username/repo_name.svg?style=for-the-badge
307
+ [stars-url]: https://github.com/github_username/repo_name/stargazers
308
+ [issues-shield]: https://img.shields.io/github/issues/github_username/repo_name.svg?style=for-the-badge
309
+ [issues-url]: https://github.com/github_username/repo_name/issues
310
+ [license-shield]: https://img.shields.io/github/license/github_username/repo_name.svg?style=for-the-badge
311
+ [license-url]: https://github.com/github_username/repo_name/blob/master/LICENSE.txt
312
+ [linkedin-shield]: https://img.shields.io/badge/-LinkedIn-black.svg?style=for-the-badge&logo=linkedin&colorB=555
313
+ [linkedin-url]: https://linkedin.com/in/linkedin_username
314
+ [product-screenshot]: images/screenshot.png
315
+ [Noobie]: https://img.shields.io/badge/Data%20Science%20for%20Beginners-84CC16?style=for-the-badge&labelColor=E5E7EB&color=84CC16
316
+ <!-- Shields.io badges. You can a comprehensive list with many more badges at: https://github.com/inttter/md-badges -->
317
+ [Next.js]: https://img.shields.io/badge/next.js-000000?style=for-the-badge&logo=nextdotjs&logoColor=white
318
+ [Next-url]: https://nextjs.org/
319
+ [React.js]: https://img.shields.io/badge/React-20232A?style=for-the-badge&logo=react&logoColor=61DAFB
320
+ [React-url]: https://reactjs.org/
321
+ [Vue.js]: https://img.shields.io/badge/Vue.js-35495E?style=for-the-badge&logo=vuedotjs&logoColor=4FC08D
322
+ [Vue-url]: https://vuejs.org/
323
+ [Angular.io]: https://img.shields.io/badge/Angular-DD0031?style=for-the-badge&logo=angular&logoColor=white
324
+ [Angular-url]: https://angular.io/
325
+ [Svelte.dev]: https://img.shields.io/badge/Svelte-4A4A55?style=for-the-badge&logo=svelte&logoColor=FF3E00
326
+ [Svelte-url]: https://svelte.dev/
327
+ [Laravel.com]: https://img.shields.io/badge/Laravel-FF2D20?style=for-the-badge&logo=laravel&logoColor=white
328
+ [Laravel-url]: https://laravel.com
329
+ [Bootstrap.com]: https://img.shields.io/badge/Bootstrap-563D7C?style=for-the-badge&logo=bootstrap&logoColor=white
330
+ [Bootstrap-url]: https://getbootstrap.com
331
+ [JQuery.com]: https://img.shields.io/badge/jQuery-0769AD?style=for-the-badge&logo=jquery&logoColor=white
332
+ [JQuery-url]: https://jquery.com
333
+ <!-- TODO: -->
334
+ [![Postgres](https://img.shields.io/badge/Postgres-%23316192.svg?logo=postgresql&logoColor=white)](#)
335
+ [![Python](https://img.shields.io/badge/Python-3776AB?logo=python&logoColor=fff)](#)
336
+ [![Sphinx](https://img.shields.io/badge/Sphinx-000?logo=sphinx&logoColor=fff)](#)
337
+ [![MkDocs](https://img.shields.io/badge/MkDocs-526CFE?logo=materialformkdocs&logoColor=fff)](#)
338
+ [![NumPy](https://img.shields.io/badge/NumPy-4DABCF?logo=numpy&logoColor=fff)](#)
339
+ [![Pandas](https://img.shields.io/badge/Pandas-150458?logo=pandas&logoColor=fff)](#)
340
+ [![Slack](https://img.shields.io/badge/Slack-4A154B?logo=slack&logoColor=fff)](#)
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app/__init__.py ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app/main.py ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
import gradio as gr


def greet(name):
    """Return the greeting shown by the demo, e.g. "Hello Alice!!"."""
    return "Hello " + name + "!!"


# Simple text-in / text-out Gradio interface around `greet`.
demo = gr.Interface(fn=greet, inputs="text", outputs="text")

# Launch only when executed as a script: importing this module (e.g. from
# tests/test_app.py, which does `from app.main import greet`) must not start
# the Gradio server.
if __name__ == "__main__":
    demo.launch()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.github/workflows/deploy.yml ADDED
@@ -0,0 +1,37 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
# Deploys the repository to the Hugging Face Space on every push to main.
name: Deploy to Hugging Face Spaces

on:
  push:
    branches:
      - main

# Checkout needs read access; the job only writes to the HF remote,
# but declaring minimal permissions is good hygiene.
permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.10"

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          if [ -f requirements.txt ]; then pip install -r requirements.txt; fi

      - name: Deploy to Hugging Face Space
        env:
          HF_TOKEN: ${{ secrets.HF_TOKEN }}
        run: |
          git config --global user.email "actions@github.com"
          git config --global user.name "GitHub Actions"
          git clone https://huggingface.co/spaces/stephmnt/projet_05 hf_space
          rsync -av --exclude '.git' ./ hf_space/
          cd hf_space
          git add .
          # `git commit` exits non-zero when the rsync produced no changes;
          # don't fail the whole workflow in that case.
          git commit -m "🚀 Auto-deploy from GitHub Actions" || echo "No changes to commit"
          git push https://stephmnt:$HF_TOKEN@huggingface.co/spaces/stephmnt/projet_05 main
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitattributes ADDED
@@ -0,0 +1,35 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ *.7z filter=lfs diff=lfs merge=lfs -text
2
+ *.arrow filter=lfs diff=lfs merge=lfs -text
3
+ *.bin filter=lfs diff=lfs merge=lfs -text
4
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
5
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
6
+ *.ftz filter=lfs diff=lfs merge=lfs -text
7
+ *.gz filter=lfs diff=lfs merge=lfs -text
8
+ *.h5 filter=lfs diff=lfs merge=lfs -text
9
+ *.joblib filter=lfs diff=lfs merge=lfs -text
10
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
11
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
12
+ *.model filter=lfs diff=lfs merge=lfs -text
13
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
14
+ *.npy filter=lfs diff=lfs merge=lfs -text
15
+ *.npz filter=lfs diff=lfs merge=lfs -text
16
+ *.onnx filter=lfs diff=lfs merge=lfs -text
17
+ *.ot filter=lfs diff=lfs merge=lfs -text
18
+ *.parquet filter=lfs diff=lfs merge=lfs -text
19
+ *.pb filter=lfs diff=lfs merge=lfs -text
20
+ *.pickle filter=lfs diff=lfs merge=lfs -text
21
+ *.pkl filter=lfs diff=lfs merge=lfs -text
22
+ *.pt filter=lfs diff=lfs merge=lfs -text
23
+ *.pth filter=lfs diff=lfs merge=lfs -text
24
+ *.rar filter=lfs diff=lfs merge=lfs -text
25
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
26
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
27
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
28
+ *.tar filter=lfs diff=lfs merge=lfs -text
29
+ *.tflite filter=lfs diff=lfs merge=lfs -text
30
+ *.tgz filter=lfs diff=lfs merge=lfs -text
31
+ *.wasm filter=lfs diff=lfs merge=lfs -text
32
+ *.xz filter=lfs diff=lfs merge=lfs -text
33
+ *.zip filter=lfs diff=lfs merge=lfs -text
34
+ *.zst filter=lfs diff=lfs merge=lfs -text
35
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/.gitignore ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ *.code-workspace
2
+ .venv/
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/README.md ADDED
@@ -0,0 +1,12 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ title: Projet 05
3
+ emoji: 👀
4
+ colorFrom: indigo
5
+ colorTo: green
6
+ sdk: gradio
7
+ sdk_version: 5.49.1
8
+ app_file: app.py
9
+ pinned: false
10
+ ---
11
+
12
+ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/app.py ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ import gradio as gr
2
+
3
def greet(name):
    """Return the demo greeting for *name* (e.g. "Hello Alice!!")."""
    parts = ["Hello ", name, "!!"]
    return "".join(parts)
5
+
6
+ demo = gr.Interface(fn=greet, inputs="text", outputs="text")
7
+ demo.launch()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/tests/test_app.py ADDED
@@ -0,0 +1,17 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
import pytest
from app.main import greet


def test_greet_returns_string():
    """The greeting must be a plain string."""
    greeting = greet("Alice")
    assert isinstance(greeting, str), "Le résultat doit être une chaîne de caractères."


def test_greet_output_content():
    """The greeting must match the expected sentence exactly."""
    result = greet("Bob")
    assert result == "Hello Bob!!", f"Résultat inattendu : {result}"


def test_greet_with_empty_string():
    """An empty name still produces a well-formed greeting."""
    result = greet("")
    assert result == "Hello !!", "Le résultat doit gérer les entrées vides."
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/notebooks/.gitkeep ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/notebooks/Manet_stephane_notebook_112025.ipynb ADDED
The diff for this file is too large to render. See raw diff
 
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/poetry.lock ADDED
The diff for this file is too large to render. See raw diff
 
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/poetry.toml ADDED
@@ -0,0 +1,2 @@
 
 
 
1
+ [virtualenvs]
2
+ in-project = true
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/__init__.py ADDED
@@ -0,0 +1 @@
 
 
1
+ from projet_05 import config # noqa: F401
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/config.py ADDED
@@ -0,0 +1,32 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
"""Project-wide configuration for projet_05: canonical paths and logging setup.

Importing this module has two side effects, in order: environment variables
are loaded from a `.env` file (if present), and loguru is re-routed through
tqdm when tqdm is installed so progress bars and log lines don't clobber
each other.
"""

from pathlib import Path

from dotenv import load_dotenv
from loguru import logger

# Load environment variables from .env file if it exists
load_dotenv()

# Paths
# parents[1] is the repository root, since this file lives in <root>/projet_05/.
PROJ_ROOT = Path(__file__).resolve().parents[1]
logger.info(f"PROJ_ROOT path is: {PROJ_ROOT}")

# Cookiecutter-data-science data layout: raw -> interim -> processed.
DATA_DIR = PROJ_ROOT / "data"
RAW_DATA_DIR = DATA_DIR / "raw"
INTERIM_DATA_DIR = DATA_DIR / "interim"
PROCESSED_DATA_DIR = DATA_DIR / "processed"
EXTERNAL_DATA_DIR = DATA_DIR / "external"

# Serialized/trained model artifacts.
MODELS_DIR = PROJ_ROOT / "models"

# Generated analysis outputs.
REPORTS_DIR = PROJ_ROOT / "reports"
FIGURES_DIR = REPORTS_DIR / "figures"

# If tqdm is installed, configure loguru with tqdm.write
# https://github.com/Delgan/loguru/issues/135
try:
    from tqdm import tqdm

    # Handler 0 is loguru's default stderr sink; replace it with tqdm.write
    # so log lines play nicely with active progress bars.
    logger.remove(0)
    logger.add(lambda msg: tqdm.write(msg, end=""), colorize=True)
except ModuleNotFoundError:
    # tqdm is optional: without it, keep loguru's default handler.
    pass
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/dataset.py ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from pathlib import Path

from loguru import logger
from tqdm import tqdm
import typer

from projet_05.config import PROCESSED_DATA_DIR, RAW_DATA_DIR

app = typer.Typer()


@app.command()
def main(
    # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
    input_path: Path = RAW_DATA_DIR / "dataset.csv",
    output_path: Path = PROCESSED_DATA_DIR / "dataset.csv",
    # ----------------------------------------------
):
    """Template CLI step: turn the raw dataset into the processed dataset.

    The body below is placeholder work (a ten-step progress loop) meant to be
    replaced by real processing from *input_path* to *output_path*.
    """
    # ---- REPLACE THIS WITH YOUR OWN CODE ----
    logger.info("Processing dataset...")
    for step in tqdm(range(10), total=10):
        if step == 5:
            logger.info("Something happened for iteration 5.")
    logger.success("Processing dataset complete.")
    # -----------------------------------------


if __name__ == "__main__":
    app()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/features.py ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from pathlib import Path

from loguru import logger
from tqdm import tqdm
import typer

from projet_05.config import PROCESSED_DATA_DIR

app = typer.Typer()


@app.command()
def main(
    # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
    input_path: Path = PROCESSED_DATA_DIR / "dataset.csv",
    output_path: Path = PROCESSED_DATA_DIR / "features.csv",
    # -----------------------------------------
):
    """Template CLI step: derive model features from the processed dataset.

    Placeholder implementation — replace the demo loop with real feature
    engineering reading *input_path* and writing *output_path*.
    """
    # ---- REPLACE THIS WITH YOUR OWN CODE ----
    logger.info("Generating features from dataset...")
    for iteration in tqdm(range(10), total=10):
        if iteration == 5:
            logger.info("Something happened for iteration 5.")
    logger.success("Features generation complete.")
    # -----------------------------------------


if __name__ == "__main__":
    app()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/modeling/__init__.py ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/modeling/predict.py ADDED
@@ -0,0 +1,30 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from pathlib import Path

from loguru import logger
from tqdm import tqdm
import typer

from projet_05.config import MODELS_DIR, PROCESSED_DATA_DIR

app = typer.Typer()


@app.command()
def main(
    # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
    features_path: Path = PROCESSED_DATA_DIR / "test_features.csv",
    model_path: Path = MODELS_DIR / "model.pkl",
    predictions_path: Path = PROCESSED_DATA_DIR / "test_predictions.csv",
    # -----------------------------------------
):
    """Template CLI step: run inference with a trained model.

    Placeholder implementation — replace the demo loop with code that loads
    *model_path*, scores *features_path*, and writes *predictions_path*.
    """
    # ---- REPLACE THIS WITH YOUR OWN CODE ----
    logger.info("Performing inference for model...")
    for tick in tqdm(range(10), total=10):
        if tick == 5:
            logger.info("Something happened for iteration 5.")
    logger.success("Inference complete.")
    # -----------------------------------------


if __name__ == "__main__":
    app()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/modeling/train.py ADDED
@@ -0,0 +1,30 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
from pathlib import Path

from loguru import logger
from tqdm import tqdm
import typer

from projet_05.config import MODELS_DIR, PROCESSED_DATA_DIR

app = typer.Typer()


@app.command()
def main(
    # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
    features_path: Path = PROCESSED_DATA_DIR / "features.csv",
    labels_path: Path = PROCESSED_DATA_DIR / "labels.csv",
    model_path: Path = MODELS_DIR / "model.pkl",
    # -----------------------------------------
):
    """Template CLI step: fit a model on features/labels and persist it.

    Placeholder implementation — replace the demo loop with real training
    that reads *features_path* / *labels_path* and saves *model_path*.
    """
    # ---- REPLACE THIS WITH YOUR OWN CODE ----
    logger.info("Training some model...")
    for epoch in tqdm(range(10), total=10):
        if epoch == 5:
            logger.info("Something happened for iteration 5.")
    logger.success("Modeling training complete.")
    # -----------------------------------------


if __name__ == "__main__":
    app()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/plots.py ADDED
@@ -0,0 +1,29 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from pathlib import Path
2
+
3
+ from loguru import logger
4
+ from tqdm import tqdm
5
+ import typer
6
+
7
+ from projet_05.config import FIGURES_DIR, PROCESSED_DATA_DIR
8
+
9
+ app = typer.Typer()
10
+
11
+
12
@app.command()
def main(
    # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
    input_path: Path = PROCESSED_DATA_DIR / "dataset.csv",
    output_path: Path = FIGURES_DIR / "plot.png",
    # -----------------------------------------
):
    """Template plotting entry point — swap the demo loop for real figure code."""
    # ---- REPLACE THIS WITH YOUR OWN CODE ----
    logger.info("Generating plot from data...")
    for step in tqdm(range(10), total=10):
        if step == 5:
            logger.info("Something happened for iteration 5.")
    logger.success("Plot generation complete.")
    # -----------------------------------------
26
+
27
+
28
+ if __name__ == "__main__":
29
+ app()
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/pyproject.toml ADDED
@@ -0,0 +1,53 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [build-system]
2
+ requires = ["flit_core >=3.2,<4"]
3
+ build-backend = "flit_core.buildapi"
4
+
5
+ [project]
6
+ name = "projet_05"
7
+ version = "0.0.1"
8
+ description = "D\u00e9ployez un mod\u00e8le de Machine Learning"
9
+ authors = [
10
+ { name = "St\u00e9phane Manet" },
11
+ ]
12
+ license = { file = "LICENSE" }
13
+ readme = "README.md"
14
+ classifiers = [
15
+ "Programming Language :: Python :: 3",
16
+ "License :: OSI Approved :: MIT License"
17
+ ]
18
+ dependencies = [
19
+ "loguru",
20
+ "mkdocs",
21
+ "pip",
22
+ "pytest",
23
+ "python-dotenv",
24
+ "ruff",
25
+ "tqdm",
26
+ "typer",
27
+ "imbalanced-learn (>=0.14.0,<0.15.0)",
28
+ "scikit-learn (>=1.4.2,<2.0.0)",
29
+ "matplotlib (>=3.10.7,<4.0.0)",
30
+ "numpy (>=2.3.4,<3.0.0)",
31
+ "pandas (>=2.3.3,<3.0.0)",
32
+ "pyyaml (>=6.0.3,<7.0.0)",
33
+ "scipy (>=1.16.3,<2.0.0)",
34
+ "seaborn (>=0.13.2,<0.14.0)",
35
+ "shap (>=0.49.1,<0.50.0)",
36
+ "gradio (>=5.49.1,<6.0.0)",
37
+ "joblib (>=1.4.2,<2.0.0)"
38
+ ]
39
+
40
+ requires-python = ">=3.11,<3.13"
41
+
42
+
43
+ [tool.ruff]
44
+ line-length = 99
45
+ src = ["projet_05"]
46
+ include = ["pyproject.toml", "projet_05/**/*.py"]
47
+
48
+ [tool.ruff.lint]
49
+ extend-select = ["I"] # Add import sorting
50
+
51
+ [tool.ruff.lint.isort]
52
+ known-first-party = ["projet_05"]
53
+ force-sort-within-sections = true
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/references/.gitkeep ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/reports/.gitkeep ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/reports/figures/.gitkeep ADDED
File without changes
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/tests/test_data.py ADDED
@@ -0,0 +1,5 @@
 
 
 
 
 
 
1
+ import pytest
2
+
3
+
4
def test_code_is_tested():
    """Deliberate placeholder: always fails until real tests replace it (keeps CI honest)."""
    assert False
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/notebooks/Manet_stephane_notebook_112025.ipynb CHANGED
The diff for this file is too large to render. See raw diff
 
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/__init__.py CHANGED
@@ -1 +1,4 @@
1
  from projet_05 import config # noqa: F401
 
 
 
 
1
  from projet_05 import config # noqa: F401
2
+ from projet_05.settings import Settings, load_settings # noqa: F401
3
+
4
+ __all__ = ["config", "Settings", "load_settings"]
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/branding.py ADDED
@@ -0,0 +1,52 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+
3
+ from functools import lru_cache
4
+ from pathlib import Path
5
+ from typing import Union
6
+
7
+ from scripts_projet04.brand.brand import ( # type: ignore[import-not-found]
8
+ Theme,
9
+ ThemeConfig,
10
+ configure_brand,
11
+ load_brand,
12
+ make_diverging_cmap,
13
+ )
14
+
15
+ ROOT_DIR = Path(__file__).resolve().parents[1]
16
+ DEFAULT_BRAND_PATH = ROOT_DIR / "scripts_projet04" / "brand" / "brand.yml"
17
+
18
+
19
+ def _resolve_path(path: Union[str, Path, None]) -> Path:
20
+ if path is None:
21
+ return DEFAULT_BRAND_PATH
22
+ return Path(path).expanduser().resolve()
23
+
24
+
25
@lru_cache(maxsize=1)
def load_brand_config(path: Union[str, Path, None] = None) -> ThemeConfig:
    """Load the brand YAML once and return the parsed ThemeConfig.

    NOTE(review): maxsize=1 caches only the most recent *path* — alternating
    between two different paths re-reads the YAML on every call.
    """
    cfg_path = _resolve_path(path)
    return load_brand(cfg_path)
30
+
31
+
32
@lru_cache(maxsize=1)
def apply_brand_theme(path: Union[str, Path, None] = None) -> ThemeConfig:
    """
    Apply the OpenClassrooms/TechNova brand theme globally.

    Returns the ThemeConfig so callers can inspect colors if needed.
    """
    # lru_cache short-circuits repeat calls, so the global theme state is only
    # mutated on the first call for a given path.
    cfg_path = _resolve_path(path)
    cfg = configure_brand(cfg_path)
    # Theme.apply() presumably pushes the palette into the plotting backend's
    # global config — TODO confirm against scripts_projet04.brand.brand.
    Theme.apply()
    return cfg
43
+
44
+
45
+ __all__ = [
46
+ "Theme",
47
+ "ThemeConfig",
48
+ "apply_brand_theme",
49
+ "load_brand_config",
50
+ "make_diverging_cmap",
51
+ "DEFAULT_BRAND_PATH",
52
+ ]
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/dataset.py CHANGED
@@ -1,28 +1,202 @@
 
 
 
1
  from pathlib import Path
2
 
 
 
3
  from loguru import logger
4
- from tqdm import tqdm
5
  import typer
6
 
7
- from projet_05.config import PROCESSED_DATA_DIR, RAW_DATA_DIR
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
8
 
9
- app = typer.Typer()
 
 
 
 
10
 
 
 
11
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
  @app.command()
13
  def main(
14
- # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
15
- input_path: Path = RAW_DATA_DIR / "dataset.csv",
16
- output_path: Path = PROCESSED_DATA_DIR / "dataset.csv",
17
- # ----------------------------------------------
 
 
 
 
 
 
 
 
18
  ):
19
- # ---- REPLACE THIS WITH YOUR OWN CODE ----
20
- logger.info("Processing dataset...")
21
- for i in tqdm(range(10), total=10):
22
- if i == 5:
23
- logger.info("Something happened for iteration 5.")
24
- logger.success("Processing dataset complete.")
25
- # -----------------------------------------
26
 
27
 
28
  if __name__ == "__main__":
 
1
+ from __future__ import annotations
2
+
3
+ import sqlite3
4
  from pathlib import Path
5
 
6
+ import numpy as np
7
+ import pandas as pd
8
  from loguru import logger
 
9
  import typer
10
 
11
+ from projet_05.config import INTERIM_DATA_DIR
12
+ from projet_05.settings import Settings, load_settings
13
+
14
+ app = typer.Typer(help="Préparation et fusion des données sources.")
15
+
16
+
17
+ # ---------------------------------------------------------------------------
18
+ # Utilitaires
19
+ # ---------------------------------------------------------------------------
20
def safe_read_csv(path: Path, *, dtype=None) -> pd.DataFrame:
    """Read *path* as CSV; on any failure, log the problem and return an empty frame."""
    logger.info("Lecture du fichier {}", path)
    try:
        return pd.read_csv(path, dtype=dtype)
    except FileNotFoundError:
        logger.warning("Fichier absent: {}", path)
    except Exception as exc:  # pragma: no cover - log + empty dataframe
        logger.error("Impossible de lire {} ({})", path, exc)
    return pd.DataFrame()
31
+
32
+
33
def clean_text_values(df: pd.DataFrame) -> pd.DataFrame:
    """Normalize textual values that often materialize missing values.

    Replaces a catalogue of "pseudo-missing" tokens (empty/whitespace strings,
    literal nan/None spellings, don't-know answers) by real NaN, then strips
    surrounding whitespace from every object column.
    """
    # NOTE(review): the whitespace-only tokens below may originally differ in
    # width or contain non-breaking spaces — verify against the raw file.
    replace_tokens = [
        "",
        " ",
        " ",
        " ",
        "nan",
        "NaN",
        "NAN",
        "None",
        "JE ne sais pas",
        "je ne sais pas",
        "Je ne sais pas",
        "Unknow",
        "Unknown",
        "non pertinent",
        "Non pertinent",
        "NON PERTINENT",
    ]
    normalized = df.copy()
    # First pass: whole-frame replacement also catches non-object columns.
    normalized = normalized.replace(replace_tokens, np.nan)

    # Second pass per text column: replace again, cast to the pandas string
    # dtype, then strip surrounding whitespace.
    for column in normalized.select_dtypes(include="object"):
        normalized[column] = (
            normalized[column].replace(replace_tokens, np.nan).astype("string").str.strip()
        )
    return normalized
61
+
62
+
63
+ def _harmonize_id_column(df: pd.DataFrame, column: str, *, digits_only: bool = True) -> pd.DataFrame:
64
+ data = df.copy()
65
+ if column not in data.columns:
66
+ return data
67
+
68
+ if digits_only:
69
+ extracted = data[column].astype(str).str.extract(r"(\\d+)")
70
+ data[column] = pd.to_numeric(extracted[0], errors="coerce")
71
+ data[column] = pd.to_numeric(data[column], errors="coerce").astype("Int64")
72
+ return data
73
+
74
+
75
+ def _rename_column(df: pd.DataFrame, source: str, target: str) -> pd.DataFrame:
76
+ if source not in df.columns:
77
+ return df
78
+ return df.rename(columns={source: target})
79
+
80
+
81
def _log_id_diagnostics(df: pd.DataFrame, *, name: str, col_id: str) -> None:
    """Log row count, unique-identifier count and duplicate count for one source."""
    if col_id not in df.columns:
        logger.warning("La colonne {} est absente du fichier {}.", col_id, name)
        return

    n_rows = len(df)
    n_unique = df[col_id].nunique(dropna=True)
    logger.info(
        "{name}: {total} lignes | {uniques} identifiants uniques | {duplicates} doublons",
        name=name,
        total=n_rows,
        uniques=n_unique,
        duplicates=n_rows - n_unique,
    )
95
+
96
+
97
+ def _persist_sql_trace(df_dict: dict[str, pd.DataFrame], settings: Settings) -> pd.DataFrame:
98
+ """
99
+ Reproduire la fusion SQL décrite dans le notebook.
100
 
101
+ Chaque DataFrame est stocké dans une base SQLite éphémère pour
102
+ conserver une traçabilité de la requête exécutée.
103
+ """
104
+ db_path = settings.db_file
105
+ sql_path = settings.sql_file
106
 
107
+ db_path.parent.mkdir(parents=True, exist_ok=True)
108
+ sql_path.parent.mkdir(parents=True, exist_ok=True)
109
 
110
+ if db_path.exists():
111
+ db_path.unlink()
112
+
113
+ query = f"""
114
+ SELECT *
115
+ FROM sirh
116
+ INNER JOIN evaluation USING ({settings.col_id})
117
+ INNER JOIN sond USING ({settings.col_id});
118
+ """.strip()
119
+
120
+ with db_path.open("wb") as _:
121
+ pass # just ensure the file exists for sqlite on some platforms
122
+
123
+ with sqlite3.connect(db_path) as conn:
124
+ for name, frame in df_dict.items():
125
+ frame.to_sql(name, conn, index=False, if_exists="replace")
126
+ merged = pd.read_sql_query(query, conn)
127
+
128
+ sql_path.write_text(query, encoding="utf-8")
129
+ return merged
130
+
131
+
132
def build_dataset(settings: Settings) -> pd.DataFrame:
    """Load, clean, harmonize and merge the three raw sources."""

    def _prepare(path, rename_from=None):
        # Shared pipeline: load -> optional id rename -> id harmonization -> text cleanup.
        frame = safe_read_csv(path)
        if rename_from is not None:
            frame = _rename_column(frame, rename_from, settings.col_id)
        frame = _harmonize_id_column(frame, settings.col_id, digits_only=True)
        return clean_text_values(frame)

    frames = {
        "sirh": _prepare(settings.path_sirh),
        "evaluation": _prepare(settings.path_eval, rename_from="eval_number"),
        "sond": _prepare(settings.path_sondage, rename_from="code_sondage"),
    }

    for source_name, frame in frames.items():
        _log_id_diagnostics(frame, name=source_name, col_id=settings.col_id)

    merged = _persist_sql_trace(frames, settings)

    # The join key must survive the merge; anything else means broken inputs.
    if settings.col_id not in merged.columns:
        raise KeyError(
            f"La colonne {settings.col_id} est absente de la fusion finale. "
            "Vérifiez vos fichiers sources."
        )

    logger.success("Fusion réalisée: {} lignes / {} colonnes", *merged.shape)
    return merged
169
+
170
+
171
def save_dataset(df: pd.DataFrame, output_path: Path) -> None:
    """Write the merged frame to *output_path* as CSV, creating parent dirs."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(output_path, index=False)
    logger.success("Fichier fusionné sauvegardé dans {}", output_path)
175
+
176
+
177
+ # ---------------------------------------------------------------------------
178
+ # CLI
179
+ # ---------------------------------------------------------------------------
180
@app.command()
def main(
    settings_path: Path = typer.Option(
        None,
        "--settings",
        "-s",
        help="Chemin vers un fichier settings.yml personnalisé.",
    ),
    output_path: Path = typer.Option(
        INTERIM_DATA_DIR / "merged.csv",
        "--output",
        "-o",
        help="Chemin de sortie du dataset fusionné.",
    ),
):
    """Entrypoint Typer pour reproduire la fusion des données brutes."""
    # Fall back to the default settings file when --settings is not given.
    if settings_path:
        cfg = load_settings(settings_path)
    else:
        cfg = load_settings()
    save_dataset(build_dataset(cfg), output_path)
 
 
200
 
201
 
202
  if __name__ == "__main__":
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/explainability.py ADDED
@@ -0,0 +1,102 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from __future__ import annotations
2
+
3
+ from pathlib import Path
4
+ from typing import Tuple
5
+
6
+ import numpy as np
7
+ import pandas as pd
8
+ from loguru import logger
9
+
10
+ from projet_05.branding import Theme, apply_brand_theme, make_diverging_cmap
11
+ from scripts_projet04.manet_projet04.shap_generator import ( # type: ignore[import-not-found]
12
+ shap_global,
13
+ shap_local,
14
+ )
15
+
16
+ apply_brand_theme()
17
+
18
+
19
+ def _shape_array(values) -> np.ndarray:
20
+ if hasattr(values, "values"):
21
+ arr = np.array(values.values)
22
+ else:
23
+ arr = np.array(values)
24
+ return np.nan_to_num(arr, copy=False)
25
+
26
+
27
def compute_shap_summary(
    pipeline,
    X: pd.DataFrame,
    y: pd.Series,
    *,
    max_samples: int = 500,
) -> Tuple[pd.DataFrame | None, object | None]:
    """
    Reuse the historical `shap_global` helper to build the plots and a tabular summary.

    Returns
    -------
    summary_df : pd.DataFrame | None
        Moyenne absolue des valeurs SHAP (ordre décroissant).
    shap_values : shap.Explanation | None
        Objet renvoyé par shap_global pour des analyses locales ultérieures.
    """
    # Brand-colored diverging colormap forwarded to the shap_global plots.
    cmap = make_diverging_cmap(Theme.PRIMARY, Theme.SECONDARY)
    shap_values, _, feature_names = shap_global(
        pipeline,
        X,
        y,
        sample_size=max_samples,
        cmap=cmap,
    )
    # shap_global signals failure by returning None values; propagate as (None, None).
    if shap_values is None or feature_names is None:
        logger.warning("Impossible de générer les résumés SHAP.")
        return None, None

    shap_array = _shape_array(shap_values)
    # A 1-D result means a single feature; reshape so the per-feature mean works.
    if shap_array.ndim == 1:
        shap_array = shap_array.reshape(-1, 1)
    # Mean absolute SHAP value per feature = global importance ranking.
    mean_abs = np.abs(shap_array).mean(axis=0)
    summary = (
        pd.DataFrame({"feature": list(feature_names), "mean_abs_shap": mean_abs})
        .sort_values("mean_abs_shap", ascending=False)
        .reset_index(drop=True)
    )
    return summary, shap_values
66
+
67
+
68
def save_shap_summary(summary: pd.DataFrame, output_path: Path) -> None:
    """Persist the global SHAP importance table as CSV, creating parent dirs."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    summary.to_csv(output_path, index=False)
    logger.info("Résumé SHAP sauvegardé dans {}", output_path)
72
+
73
+
74
def export_local_explanations(
    pipeline,
    shap_values,
    X: pd.DataFrame,
    custom_index: int | None = None,
) -> None:
    """
    Génère trois cas d'usage par défaut (impact max, risque max, risque min)
    et un indice custom optionnel pour la trace historique.
    """
    # Nothing to explain when the global SHAP computation failed upstream.
    if shap_values is None:
        return

    shap_array = _shape_array(shap_values)
    # Row with the largest total absolute SHAP contribution (most "explained" case).
    idx_impact = int(np.argmax(np.sum(np.abs(shap_array), axis=1)))
    shap_local(idx_impact, shap_values)

    # Highest predicted probability of the positive class (highest risk).
    y_proba_all = pipeline.predict_proba(X)[:, 1]
    idx_highrisk = int(np.argmax(y_proba_all))
    shap_local(idx_highrisk, shap_values)

    # Lowest-risk row; smaller text_scale — presumably for plot readability (TODO confirm).
    idx_lowrisk = int(np.argmin(y_proba_all))
    shap_local(idx_lowrisk, shap_values, text_scale=0.6)

    # Optional additional case requested by the caller.
    if custom_index is not None:
        shap_local(custom_index, shap_values, max_display=8)
100
+
101
+
102
+ __all__ = ["compute_shap_summary", "save_shap_summary", "export_local_explanations"]
hf_space/hf_space/hf_space/hf_space/hf_space/hf_space/projet_05/features.py CHANGED
@@ -1,28 +1,170 @@
 
 
 
 
1
  from pathlib import Path
2
 
 
 
3
  from loguru import logger
4
- from tqdm import tqdm
5
  import typer
6
 
7
- from projet_05.config import PROCESSED_DATA_DIR
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
8
 
9
- app = typer.Typer()
 
 
 
 
 
 
 
 
 
10
 
 
 
 
 
 
 
11
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
12
  @app.command()
13
  def main(
14
- # ---- REPLACE DEFAULT PATHS AS APPROPRIATE ----
15
- input_path: Path = PROCESSED_DATA_DIR / "dataset.csv",
16
- output_path: Path = PROCESSED_DATA_DIR / "features.csv",
17
- # -----------------------------------------
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
18
  ):
19
- # ---- REPLACE THIS WITH YOUR OWN CODE ----
20
- logger.info("Generating features from dataset...")
21
- for i in tqdm(range(10), total=10):
22
- if i == 5:
23
- logger.info("Something happened for iteration 5.")
24
- logger.success("Features generation complete.")
25
- # -----------------------------------------
26
 
27
 
28
  if __name__ == "__main__":
 
1
+ from __future__ import annotations
2
+
3
+ import json
4
+ from datetime import datetime
5
  from pathlib import Path
6
 
7
+ import numpy as np
8
+ import pandas as pd
9
  from loguru import logger
 
10
  import typer
11
 
12
+ from projet_05.config import INTERIM_DATA_DIR, PROCESSED_DATA_DIR
13
+ from projet_05.settings import Settings, load_settings
14
+
15
+ app = typer.Typer(help="Génération des features et nettoyage de la cible.")
16
+
17
+ TARGET_MAPPING = {
18
+ "1": 1,
19
+ "0": 0,
20
+ "oui": 1,
21
+ "non": 0,
22
+ "true": 1,
23
+ "false": 0,
24
+ "quitte": 1,
25
+ "reste": 0,
26
+ "yes": 1,
27
+ "no": 0,
28
+ }
29
+
30
+
31
+ # ---------------------------------------------------------------------------
32
+ # Utilitaires cœur de pipeline
33
+ # ---------------------------------------------------------------------------
34
def _load_merged_dataset(path: Path) -> pd.DataFrame:
    """Read the merged CSV produced by dataset.py, raising when it is missing."""
    if path.exists():
        logger.info("Chargement du dataset fusionné depuis {}", path)
        return pd.read_csv(path)
    raise FileNotFoundError(
        f"Le fichier fusionné {path} est introuvable. Lancez `python projet_05/dataset.py` d'abord."
    )
41
+
42
+
43
def _normalize_target(df: pd.DataFrame, settings: Settings) -> pd.DataFrame:
    """Map the raw target column to {0, 1} and drop rows with an invalid label."""
    if settings.target not in df.columns:
        raise KeyError(f"La variable cible '{settings.target}' est absente du fichier.")

    cleaned = df.copy()
    # Canonicalize the textual labels before looking them up in the mapping.
    as_text = cleaned[settings.target].astype(str).str.strip().str.lower()
    cleaned[settings.target] = as_text.map(TARGET_MAPPING)

    n_before = len(cleaned)
    cleaned = cleaned[cleaned[settings.target].isin([0, 1])].copy()
    n_dropped = n_before - len(cleaned)
    if n_dropped:
        logger.warning("Suppression de {} lignes avec une cible invalide.", n_dropped)

    cleaned[settings.target] = cleaned[settings.target].astype(int)
    return cleaned
63
+
64
+
65
+ def _safe_ratio(df: pd.DataFrame, numerator: str, denominator: str, output: str) -> None:
66
+ if numerator not in df.columns or denominator not in df.columns:
67
+ return
68
+ denominator_series = df[denominator].replace({0: np.nan})
69
+ df[output] = df[numerator] / denominator_series
70
+
71
+
72
def _engineer_features(df: pd.DataFrame, settings: Settings) -> pd.DataFrame:
    """Derive ratio / aggregate features from the cleaned HR dataset.

    Adds, when the source columns exist: a normalized previous-raise rate,
    four experience-normalized ratios, a mean satisfaction score, and the
    evaluation-score delta. Missing columns are silently skipped.
    """
    engineered = df.copy()

    # NOTE(review): "augementation" looks like a misspelling kept to match the
    # raw data's column name — confirm against the source CSV headers.
    col = "augementation_salaire_precedente"
    if col in engineered:
        # Strip "%" and French decimal commas, then convert "12,5" -> 0.125.
        engineered[col] = (
            engineered[col]
            .astype(str)
            .str.replace("%", "", regex=False)
            .str.replace(",", ".", regex=False)
            .str.strip()
        )
        engineered[col] = pd.to_numeric(engineered[col], errors="coerce") / 100

    # Ratios via _safe_ratio: no-op when a column is missing, NaN on division by zero.
    _safe_ratio(engineered, "augementation_salaire_precedente", "revenu_mensuel", "augmentation_par_revenu")
    _safe_ratio(engineered, "annees_dans_le_poste_actuel", "annee_experience_totale", "annee_sur_poste_par_experience")
    _safe_ratio(engineered, "nb_formations_suivies", "annee_experience_totale", "nb_formation_par_experience")
    _safe_ratio(
        engineered, "annees_depuis_la_derniere_promotion", "annee_experience_totale", "dern_promo_par_experience"
    )

    # Mean of whichever configured satisfaction columns are actually present.
    if settings.sat_cols:
        existing = [col for col in settings.sat_cols if col in engineered.columns]
        if existing:
            engineered["score_moyen_satisfaction"] = engineered[existing].mean(axis=1)

    # Delta between the two evaluation campaigns (current minus previous).
    if "note_evaluation_actuelle" in engineered.columns and "note_evaluation_precedente" in engineered.columns:
        engineered["evolution_note"] = (
            engineered["note_evaluation_actuelle"] - engineered["note_evaluation_precedente"]
        )

    return engineered
104
+
105
+
106
def build_features(settings: Settings, *, input_path: Path) -> pd.DataFrame:
    """Load the merged dataset, clean the target, and add engineered features."""
    merged = _load_merged_dataset(input_path)
    with_target = _normalize_target(merged, settings)
    return _engineer_features(with_target, settings)
111
+
112
+
113
def save_features(df: pd.DataFrame, output_path: Path) -> None:
    """Write the enriched dataset to *output_path* as CSV, creating parent dirs."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(output_path, index=False)
    logger.success("Dataset enrichi sauvegardé dans {}", output_path)
117
+
118
+
119
def save_schema(settings: Settings, output_path: Path) -> None:
    """Persist the feature schema (target, id, feature lists) as pretty JSON.

    The timestamp documents when the schema was generated so downstream
    consumers can detect stale schemas.
    """
    from datetime import timezone  # local import: keeps the module import block untouched

    schema = {
        "target": settings.target,
        "col_id": settings.col_id,
        "numerical_features": list(settings.num_cols),
        "categorical_features": list(settings.cat_cols),
        "satisfaction_features": list(settings.sat_cols),
        # FIX: datetime.utcnow() is deprecated since Python 3.12 (this project
        # targets >=3.11,<3.13); emit an explicitly UTC-aware timestamp instead.
        "generated_at": datetime.now(timezone.utc).isoformat(),
    }
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    logger.info("Schéma sauvegardé dans {}", output_path)
132
+
133
+ # ---------------------------------------------------------------------------
134
+ # CLI
135
+ # ---------------------------------------------------------------------------
136
@app.command()
def main(
    settings_path: Path = typer.Option(
        None,
        "--settings",
        "-s",
        help="Chemin optionnel vers un fichier settings.yml personnalisé.",
    ),
    input_path: Path = typer.Option(
        INTERIM_DATA_DIR / "merged.csv",
        "--input",
        "-i",
        help="Chemin du fichier issu de la fusion.",
    ),
    output_path: Path = typer.Option(
        PROCESSED_DATA_DIR / "dataset.csv",
        "--output",
        "-o",
        help="Chemin du fichier enrichi.",
    ),
    schema_path: Path = typer.Option(
        PROCESSED_DATA_DIR / "schema.json",
        "--schema",
        help="Chemin de sauvegarde du schéma de features.",
    ),
):
    """Pipeline Typer pour préparer le dataset enrichi.

    Chain: load the merged CSV, normalize the target and engineer features,
    then save both the enriched dataset and its JSON feature schema.
    """
    # Fall back to the default settings file when no --settings is given.
    settings = load_settings(settings_path) if settings_path else load_settings()
    df = build_features(settings, input_path=input_path)
    save_features(df, output_path)
    save_schema(settings, schema_path)
 
168
 
169
 
170
  if __name__ == "__main__":