init project
- Dockerfile +17 -0
- README.md +82 -0
- aws/s3_credentials.csv +2 -0
- data/2024_nox.csv +0 -0
- data/2024_o3.csv +0 -0
- data/2024_pm10.csv +0 -0
- data/2024_pm25.csv +0 -0
- data/input_data.csv +7 -0
- data/input_data.csv:Zone.Identifier +3 -0
- data/sample_output_data.csv +0 -0
- data/sample_output_data.csv:Zone.Identifier +3 -0
- etl_process.py +100 -0
- jenkins/Jenkinsfile +69 -0
- jenkins/Jenkinsfile:Zone.Identifier +3 -0
- requirements.txt +2 -0
- tests/requirements.txt +4 -0
- tests/test_etl.py +52 -0
- tests/upload_s3.py +19 -0
Dockerfile
ADDED
@@ -0,0 +1,17 @@
# Use an official Python runtime as a parent image
FROM python:3.9-slim

# Set the working directory inside the container
WORKDIR /app

# Copy the current directory contents into the container
COPY . /app

# Install any needed packages specified in requirements.txt
RUN pip install pandas

# Make a volume mount point for the input/output CSV files
VOLUME ["/app/input_data.csv", "/app/output_data.csv"]

# Run the application (by default, run the main ETL process)
CMD ["python", "etl_process.py"]
README.md
ADDED
@@ -0,0 +1,82 @@
# Jenkins Configuration and CI/CD Pipeline Guide

## 📌 Introduction

This project implements a CI/CD pipeline in Jenkins that runs an ETL process in a Docker container. The pipeline includes unit tests and uploads the results to AWS S3.

## 📂 Project Structure

```bash
├── data/
│   ├── input_data.csv           # Input file for the ETL
│   ├── sample_output_data.csv   # Expected ETL output (sample)
│
├── jenkins/
│   └── Jenkinsfile              # Jenkins CI/CD pipeline
│
├── tests/
│   ├── Dockerfile               # Dockerfile for running the tests
│   ├── requirements.txt         # Test-specific dependencies
│   ├── test_etl.py              # Unit tests for the ETL pipeline
│   └── upload_s3.py             # Script to upload test results to S3
│
├── Dockerfile                   # Containerization of the main project
├── etl_process.py               # Main ETL script
├── requirements.txt             # Project dependencies
└── README.md                    # Project documentation
```

---

## 🚀 Setup Steps

### 1️⃣ Jenkins Configuration

Make sure Jenkins is installed and running.

### 2️⃣ Adding the Environment Variables

The AWS variables used for the S3 upload must be added to Jenkins.

1. **Go to Jenkins** → **Manage Jenkins** → **Manage Credentials**
2. Select **(global)** → **Add Credentials**
3. Add the AWS keys as **Secret Text**:
   - **AWS_ACCESS_KEY_ID**
   - **AWS_SECRET_ACCESS_KEY**

Or add them directly in Jenkins:

1. **Go to Jenkins** → **Manage Jenkins** → **Configure System**
2. Under **Global Properties** → **Environment Variables**, add:
   - `AWS_ACCESS_KEY_ID = YOUR_KEY`
   - `AWS_SECRET_ACCESS_KEY = YOUR_SECRET_KEY`

### 3️⃣ Jenkins Job Configuration

1. **Create a new Jenkins job** → **Pipeline**
2. **Select Pipeline Script from SCM**
3. **Add the GitHub repository** containing the `Jenkinsfile`
4. **Save and run**

---

## 🏗️ How the Pipeline Works

### 🛠️ Pipeline Stages:

1. **Clone the repository**: fetches the code from GitHub.
2. **Run the tests**:
   - Runs the tests in a Docker container.
   - Saves the results as XML.
   - Uploads the results to S3.
3. **Build the ETL**:
   - Builds the Docker image for the ETL.
4. **Run the ETL**:
   - Mounts the CSV files and runs the processing.
   - Saves the result to `data/output_data.csv`.

---

This CI/CD pipeline provides automated integration and deployment of the ETL process using Jenkins and Docker.

🔥 Feel free to adapt the configuration to your environment!
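The AWS keys configured above reach the containers as plain environment variables (the Jenkinsfile passes them through with `--env`), and boto3 reads them automatically. A minimal sanity-check sketch with dummy values (not real keys) for verifying the variables the pipeline depends on are present:

```python
import os

# Simulate what Jenkins injects via --env (dummy values for illustration only)
os.environ["AWS_ACCESS_KEY_ID"] = "dummy-key-id"
os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy-secret"
os.environ["AWS_DEFAULT_REGION"] = "eu-north-1"

# boto3.client("s3") would pick these up automatically; here we just check
# that every variable the pipeline relies on is actually set.
required = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")
missing = [name for name in required if name not in os.environ]
print(missing)  # → []
```

Running a check like this at container start-up fails fast with a clear message instead of a confusing S3 error later in the run.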
aws/s3_credentials.csv
ADDED
@@ -0,0 +1,2 @@
User name,Password,Console sign-in URL
aws_quality_air_project,eTl0l6|k,https://020895663224.signin.aws.amazon.com/console
data/2024_nox.csv
ADDED
File without changes

data/2024_o3.csv
ADDED
File without changes

data/2024_pm10.csv
ADDED
File without changes

data/2024_pm25.csv
ADDED
File without changes
data/input_data.csv
ADDED
@@ -0,0 +1,7 @@
id,name,age,city,salary
1,John Doe,28,New York,70000
2,Jane Smith,34,Los Angeles,80000
3,Bob Johnson,45,Chicago,90000
4,Alice Williams,29,San Francisco,85000
5,Charlie Brown,NaN,Houston,65000
6,Eve Davis,38,Boston,95000
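The ETL in `etl_process.py` cleans and enriches rows shaped like these. A minimal sketch of what the transform produces for this input, assuming the script's flat 10% tax rate:

```python
import pandas as pd
from io import StringIO

# Two rows mirroring data/input_data.csv (one with a missing age)
csv = StringIO(
    "id,name,age,city,salary\n"
    "1,John Doe,28,New York,70000\n"
    "5,Charlie Brown,NaN,Houston,65000\n"
)
df = pd.read_csv(csv)

# Mirror the transform: drop rows with missing values, add tax and net salary
clean = df.dropna().copy()
clean["tax"] = clean["salary"] * 0.1
clean["net_salary"] = clean["salary"] - clean["tax"]

print(len(clean))                      # Charlie Brown's row (NaN age) is dropped
print(clean["net_salary"].iloc[0])     # 70000 - 7000
```

So of the seven input rows in the file above, the row with the `NaN` age would be dropped and the six remaining rows gain `tax` and `net_salary` columns.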
data/input_data.csv:Zone.Identifier
ADDED
@@ -0,0 +1,3 @@
[ZoneTransfer]
ZoneId=3
ReferrerUrl=C:\Users\poove\Downloads\paycare.zip
data/sample_output_data.csv
ADDED
File without changes
data/sample_output_data.csv:Zone.Identifier
ADDED
@@ -0,0 +1,3 @@
[ZoneTransfer]
ZoneId=3
ReferrerUrl=C:\Users\poove\Downloads\paycare.zip
etl_process.py
ADDED
@@ -0,0 +1,100 @@
import pandas as pd
import os


# Step 1: Extract
def extract_data(file_path):
    """Extracts data from a CSV file."""
    try:
        data = pd.read_csv(file_path)
        print("Data extraction successful.")
        return data
    except Exception as e:
        print(f"Error in data extraction: {e}")
        return None


# Step 2: Transform
def transform_data(data):
    """Transforms the data by cleaning and adding new features."""
    try:
        # Drop rows with missing values
        data_cleaned = data.dropna().copy()

        # Add a new column for tax (assuming a flat 10% tax rate on salary)
        # data_cleaned["tax"] = data_cleaned["salary"] * 0.1
        data_cleaned.loc[:, "tax"] = data_cleaned["salary"] * 0.1

        # Calculate net salary after tax
        # data_cleaned["net_salary"] = data_cleaned["salary"] - data_cleaned["tax"]
        data_cleaned.loc[:, "net_salary"] = data_cleaned["salary"] - data_cleaned["tax"]

        # data_cleaned["net_salary"] = model.predict(X)

        print("Data transformation successful.")
        return data_cleaned
    except Exception as e:
        print(f"Error in data transformation: {e}")
        return None


# # Step 3: Load
# def load_data(data, output_file_path):
#     """Loads the transformed data into a new CSV file."""
#     try:
#         data.to_csv(output_file_path, index=False)
#         print(f"Data loaded successfully to {output_file_path}.")
#     except Exception as e:
#         print(f"Error in data loading: {e}")


# # Main ETL function
# def etl_process(input_file, output_file):
#     data = extract_data(input_file)
#     if data is not None:
#         transformed_data = transform_data(data)
#         if transformed_data is not None:
#             load_data(transformed_data, output_file)


# if __name__ == "__main__":
#     input_file = "input_data.csv"
#     output_file = "output_data.csv"
#     etl_process(input_file, output_file)


# Step 3: Load
def load_data(data, output_file_path):
    """Loads the transformed data into a new CSV file."""
    try:
        # Make sure the `data/` directory exists
        output_dir = os.path.dirname(output_file_path)
        if output_dir and not os.path.exists(output_dir):
            os.makedirs(output_dir)
            print(f"📂 Created missing directory: {output_dir}")

        # Save the file
        data.to_csv(output_file_path, index=False)
        print(f"✅ Data loaded successfully to {output_file_path}.")
    except Exception as e:
        print(f"❌ Error in data loading: {e}")


# Main ETL function
def etl_process(input_file, output_file):
    print("🚀 Starting ETL Process...")

    data = extract_data(input_file)
    if data is not None:
        transformed_data = transform_data(data)
        if transformed_data is not None:
            load_data(transformed_data, output_file)

    print("✅ ETL Process Completed!")


if __name__ == "__main__":
    input_file = "data/input_data.csv"    # make sure the file is actually there
    output_file = "data/output_data.csv"  # saved under `data/`

    etl_process(input_file, output_file)
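The transform above assigns its new columns through an explicit `.copy()` and `.loc[...]`. `dropna()` already returns a new frame, but the explicit copy makes the intent unambiguous and keeps pandas from emitting SettingWithCopyWarning about writes to what it cannot prove is not a view. A short illustration of the pattern:

```python
import pandas as pd

df = pd.DataFrame({"salary": [5000.0, None, 6000.0]})

# Explicit copy: `subset` is unambiguously its own frame, so adding a column
# is safe and cannot be mistaken for a chained-indexing write into `df`.
subset = df.dropna().copy()
subset.loc[:, "tax"] = subset["salary"] * 0.1

print(list(subset.columns))      # the new column exists on the subset only
print(subset["tax"].tolist())
```

The original `df` is left untouched; only the cleaned copy carries the derived column, which is exactly the behavior `transform_data` relies on.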
jenkins/Jenkinsfile
ADDED
@@ -0,0 +1,69 @@
pipeline {
    agent any

    environment {
        TEST_IMAGE = 'paycare-tests'
        ETL_IMAGE = 'paycare-etl'
        AWS_ACCESS_KEY_ID = credentials('AWS_ACCESS_KEY_ID')
        AWS_SECRET_ACCESS_KEY = credentials('AWS_SECRET_ACCESS_KEY')
        AWS_DEFAULT_REGION = 'eu-north-1'
    }

    stages {

        stage('Clone Repository') {
            steps {
                git branch: 'main', url: 'https://github.com/semarmehdi/paycare.git'
            }
        }

        stage('Build Test Container') {
            steps {
                sh 'docker build -t ${TEST_IMAGE} -f tests/Dockerfile .'
            }
        }

        stage('Run Unit Tests') {
            steps {
                sh '''
                    docker run --rm \
                        --env AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
                        --env AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
                        --env AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION} \
                        paycare-tests
                '''
            }
        }

        stage('Build ETL Container') {
            steps {
                sh 'docker build -t ${ETL_IMAGE} .'
            }
        }

        stage('Run ETL in Docker') {
            steps {
                script {
                    sh '''
                        docker run --rm \
                            -v ${WORKSPACE}/data:/app/data \
                            ${ETL_IMAGE}
                    '''

                    sh 'ls -l ${WORKSPACE}/data'
                }
            }
        }

    }

    post {
        success {
            echo '✅ ETL Pipeline completed successfully!'
            archiveArtifacts artifacts: 'data/output_data.csv', fingerprint: true
        }
        failure {
            echo '❌ ETL Pipeline failed.'
        }
    }
}
jenkins/Jenkinsfile:Zone.Identifier
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
[ZoneTransfer]
|
| 2 |
+
ZoneId=3
|
| 3 |
+
ReferrerUrl=C:\Users\poove\Downloads\paycare.zip
|
requirements.txt
ADDED
@@ -0,0 +1,2 @@
pandas
pytest
tests/requirements.txt
ADDED
@@ -0,0 +1,4 @@
pandas==2.0.3
numpy==1.24.4
pytest
boto3
tests/test_etl.py
ADDED
@@ -0,0 +1,52 @@
import pytest
import pandas as pd
from io import StringIO
from etl_process import extract_data, transform_data, load_data


# Test for data extraction
def test_extract_data():
    csv_data = StringIO(
        "employee_id,employee_name,salary\n101,Alice,5000\n102,Bob,6000"
    )
    data = pd.read_csv(csv_data)
    assert data is not None
    assert len(data) == 2


# Test for data transformation
def test_transform_data():
    data = pd.DataFrame(
        {
            "employee_id": [101, 102],
            "employee_name": ["Alice", "Bob"],
            "salary": [5000, 6000],
        }
    )

    transformed_data = transform_data(data)
    assert "tax" in transformed_data.columns
    assert "net_salary" in transformed_data.columns
    assert transformed_data["tax"][0] == 500  # 10% of 5000
    assert transformed_data["net_salary"][0] == 4500  # 5000 - 500


# Test for data loading
def test_load_data(tmpdir):
    data = pd.DataFrame(
        {
            "employee_id": [101],
            "employee_name": ["Alice"],
            "salary": [5000],
            "tax": [500],
            "net_salary": [4500],
        }
    )

    output_file = tmpdir.join("output_data.csv")
    load_data(data, str(output_file))
    loaded_data = pd.read_csv(output_file)

    assert len(loaded_data) == 1
    assert loaded_data["employee_name"][0] == "Alice"
    assert loaded_data["net_salary"][0] == 4500
tests/upload_s3.py
ADDED
@@ -0,0 +1,19 @@
import boto3

# Initialize the S3 client
s3_client = boto3.client("s3")

# Upload parameters
# file_name = "unit-tests.xml"  # Name of the XML report generated by pytest
file_name = "results.xml"
bucket_name = "jedhamehdi"
s3_key = "test-results/unit-tests.xml"  # Key (path) inside the bucket

# Upload the file
try:
    s3_client.upload_file(file_name, bucket_name, s3_key)
    print(
        f"✅ File '{file_name}' successfully uploaded to 's3://{bucket_name}/{s3_key}'"
    )
except Exception as e:
    print(f"❌ Error uploading to S3: {e}")