push_data_pr (#2)
- push JSON file to the dataset using a PR (43a2b78f1895422e99a5687323d47b142387c92f)
- create a well-named JSON file (8bd8fa8db0eb8e86ee7889f3f0e49e9951629202)
- create a well-named JSON file (e794a4b709696e06fe07770145a211e4ff5ba05a)
- fix problem with generated local files (beee3d386da0701718608a71ef31563c6f50aed8)
- better handling of dynamic sections to keep already-filled fields in memory (0662adb28de0eddcf7afcd7ece6722e21d58ea07)
- handle problems with dynamic sections and implement BoAmps format validator (e2adf94cd8d9918a6463e2187ebb7e34bcc7248d)
- code refactoring and cleanup (488a9f6127c0ee4a5ba79aaeb5264812fabffd80)
- README.md +92 -9
- app.py +169 -23
- assets/utils/validation.py +69 -28
- src/services/dataset_upload.py +37 -0
- src/services/form_parser.py +147 -0
- src/services/huggingface.py +0 -244
- src/services/json_generator.py +212 -190
- src/services/report_builder.py +273 -0
- src/services/util.py +1 -10
- src/ui/form_components.py +128 -98
README.md
CHANGED

@@ -11,20 +11,103 @@ license: apache-2.0
 short_description: Create a report in BoAmps format
 ---
 
+# BoAmps Report Creation Tool 🌿
+
 This tool is part of the initiative [BoAmps](https://github.com/Boavizta/BoAmps).
 The purpose of the BoAmps project is to build a large, open, database of energy consumption of IT / AI tasks depending on data nature, algorithms, hardware, etc., in order to improve energy efficiency approaches based on empiric knowledge.
 
 This space was initiated by a group of students from Sud Telecom Paris, many thanks to [Hicham FILALI](https://huggingface.co/FILALIHicham) for his work.
 
-##
-Activate it: >.\.venv\Scripts\activate
-Install dependencies: >pipenv install -d
-Launch the application: pipenv run python main.py
+## 🚀 Quick Start
+
+### Prerequisites
+
+- **Python** >= 3.12
+
+### Installation Steps
+
+1. **Clone the repository**
+
+2. **Create and activate virtual environment (not mandatory)**
+   ```bash
+   # Windows
+   python -m venv .venv
+   .\.venv\Scripts\activate
+
+   # Linux/MacOS
+   python -m venv .venv
+   source .venv/bin/activate
+   ```
+
+3. **Install dependencies**
+   ```bash
+   pip install pipenv
+   pipenv install --dev
+   ```
+
+4. **Launch the application**
+   ```bash
+   python ./app.py
+   ```
+
+5. **Access the application**
+   - Open your browser and go to `http://localhost:7860`
+   - The Gradio interface will be available for creating BoAmps reports
+
+## 🏗️ Architecture Overview
+
+### Core Components
+
+1. **`app.py`** - Main application file
+   - Initializes the Gradio interface
+   - Orchestrates all UI components
+   - Handles application routing and main logic
 
+2. **Services Layer (`src/services/`)**
+   - **`json_generator.py`**: Generates BoAmps-compliant JSON reports
+   - **`report_builder.py`**: Constructs structured report data
+   - **`form_parser.py`**: Processes and validates form inputs
+   - **`dataset_upload.py`**: Manages Hugging Face dataset integration
+   - **`util.py`**: Common utility functions
 
+3. **UI Layer (`src/ui/`)**
+   - **`form_components.py`**: Gradio interface components for different report sections
+
+4. **Assets & Validation (`assets/`)**
+   - **`validation.py`**: BoAmps schema validation logic
+   - **`app.css`**: Application styling
+
+### Data Flow
+
+```
+User Input (Gradio Form)
+        ↓
+Form Parser & Validation
+        ↓
+JSON Generator
+        ↓
+Report Builder
+        ↓
+BoAmps Schema Validation
+        ↓
+JSON Report Output
+```
+
+## 🤝 Contributing
+
+Contributions are welcome! Please:
+
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Submit a pull request
+
+## 📄 License
+
+This project is licensed under the Apache 2.0 License - see the license information in the repository header.
+
+## 🙏 Acknowledgments
+
+This space was initiated by a group of students from Sud Telecom Paris, many thanks to [Hicham FILALI](https://huggingface.co/FILALIHicham) for his work.
 
+For more information about the BoAmps initiative, visit the [official repository](https://github.com/Boavizta/BoAmps).
app.py
CHANGED

@@ -1,7 +1,8 @@
 import gradio as gr
 from os import path
-from src.services.
+from src.services.dataset_upload import init_huggingface, update_dataset
 from src.services.json_generator import generate_json
+from src.services.form_parser import form_parser
 from src.ui.form_components import (
     create_header_tab,
     create_task_tab,

@@ -19,22 +20,148 @@ init_huggingface()
 
 
 def handle_submit(*inputs):
-
+    """Handle form submission with optimized parsing."""
+    try:
+        # Parse inputs using the structured parser
+        parsed_data = form_parser.parse_inputs(inputs)
+
+        # Extract data for generate_json function
+        header_params = list(parsed_data["header"].values())
+
+        # Task data
+        task_simple = parsed_data["task_simple"]
+        taskFamily, taskStage, nbRequest = task_simple[
+            "taskFamily"], task_simple["taskStage"], task_simple["nbRequest"]
+
+        # Dynamic sections - algorithm data
+        algorithms = parsed_data["algorithms"]
+        trainingType = algorithms["trainingType"]
+        algorithmType = algorithms["algorithmType"]
+        algorithmName = algorithms["algorithmName"]
+        algorithmUri = algorithms["algorithmUri"]
+        foundationModelName = algorithms["foundationModelName"]
+        foundationModelUri = algorithms["foundationModelUri"]
+        parametersNumber = algorithms["parametersNumber"]
+        framework = algorithms["framework"]
+        frameworkVersion = algorithms["frameworkVersion"]
+        classPath = algorithms["classPath"]
+        layersNumber = algorithms["layersNumber"]
+        epochsNumber = algorithms["epochsNumber"]
+        optimizer = algorithms["optimizer"]
+        quantization = algorithms["quantization"]
+
+        # Dynamic sections - dataset data
+        dataset = parsed_data["dataset"]
+        dataUsage = dataset["dataUsage"]
+        dataType = dataset["dataType"]
+        dataFormat = dataset["dataFormat"]
+        dataSize = dataset["dataSize"]
+        dataQuantity = dataset["dataQuantity"]
+        shape = dataset["shape"]
+        source = dataset["source"]
+        sourceUri = dataset["sourceUri"]
+        owner = dataset["owner"]
+
+        # Task final data
+        task_final = parsed_data["task_final"]
+        measuredAccuracy, estimatedAccuracy, taskDescription = task_final[
+            "measuredAccuracy"], task_final["estimatedAccuracy"], task_final["taskDescription"]
+
+        # Measures data
+        measures = parsed_data["measures"]
+        measurementMethod = measures["measurementMethod"]
+        manufacturer = measures["manufacturer"]
+        version = measures["version"]
+        cpuTrackingMode = measures["cpuTrackingMode"]
+        gpuTrackingMode = measures["gpuTrackingMode"]
+        averageUtilizationCpu = measures["averageUtilizationCpu"]
+        averageUtilizationGpu = measures["averageUtilizationGpu"]
+        powerCalibrationMeasurement = measures["powerCalibrationMeasurement"]
+        durationCalibrationMeasurement = measures["durationCalibrationMeasurement"]
+        powerConsumption = measures["powerConsumption"]
+        measurementDuration = measures["measurementDuration"]
+        measurementDateTime = measures["measurementDateTime"]
+
+        # System data
+        system = parsed_data["system"]
+        osystem, distribution, distributionVersion = system[
+            "osystem"], system["distribution"], system["distributionVersion"]
+
+        # Software data
+        software = parsed_data["software"]
+        language, version_software = software["language"], software["version_software"]
+
+        # Infrastructure data
+        infra_simple = parsed_data["infrastructure_simple"]
+        infraType, cloudProvider, cloudInstance, cloudService = infra_simple["infraType"], infra_simple[
+            "cloudProvider"], infra_simple["cloudInstance"], infra_simple["cloudService"]
+
+        # Infrastructure components
+        infra_components = parsed_data["infrastructure_components"]
+        componentName = infra_components["componentName"]
+        componentType = infra_components["componentType"]
+        nbComponent = infra_components["nbComponent"]
+        memorySize = infra_components["memorySize"]
+        manufacturer_infra = infra_components["manufacturer_infra"]
+        family = infra_components["family"]
+        series = infra_components["series"]
+        share = infra_components["share"]
+
+        # Environment data
+        environment = parsed_data["environment"]
+        country, latitude, longitude, location, powerSupplierType, powerSource, powerSourceCarbonIntensity = environment["country"], environment["latitude"], environment[
+            "longitude"], environment["location"], environment["powerSupplierType"], environment["powerSource"], environment["powerSourceCarbonIntensity"]
+
+        # Quality data
+        quality = parsed_data["quality"]["quality"]
+
+        # Call generate_json with structured parameters
+        message, file_path, json_output = generate_json(
+            *header_params,
+            taskFamily, taskStage, nbRequest,
+            trainingType, algorithmType, algorithmName, algorithmUri, foundationModelName, foundationModelUri, parametersNumber, framework, frameworkVersion, classPath, layersNumber, epochsNumber, optimizer, quantization,
+            dataUsage, dataType, dataFormat, dataSize, dataQuantity, shape, source, sourceUri, owner,
+            measuredAccuracy, estimatedAccuracy, taskDescription,
+            measurementMethod, manufacturer, version, cpuTrackingMode, gpuTrackingMode,
+            averageUtilizationCpu, averageUtilizationGpu, powerCalibrationMeasurement,
+            durationCalibrationMeasurement, powerConsumption,
+            measurementDuration, measurementDateTime,
+            osystem, distribution, distributionVersion,
+            language, version_software,
+            infraType, cloudProvider, cloudInstance, cloudService, componentName, componentType,
+            nbComponent, memorySize, manufacturer_infra, family,
+            series, share,
+            country, latitude, longitude, location,
+            powerSupplierType, powerSource, powerSourceCarbonIntensity,
+            quality
+        )
+
+    except Exception as e:
+        return f"Error: {e}", None, "", gr.Button("Share your data to the public repository", interactive=False, elem_classes="pubbutton")
 
     # Check if the message indicates validation failure
-    if message.startswith("The
-
+    if message.startswith("The json file does not correspond"):
+        publish_button = gr.Button(
+            "Share your data to the public repository", interactive=False, elem_classes="pubbutton")
+        return message, file_path, json_output, publish_button
 
     publish_button = gr.Button(
         "Share your data to the public repository", interactive=True, elem_classes="pubbutton")
 
-    return "Report sucessefully created",
+    return "Report sucessefully created", file_path, json_output, publish_button
+
 
-
-
-
-
+def handle_publi(file_path, json_output):
+    """Handle publication to Hugging Face dataset with improved error handling."""
+    try:
+        if not file_path or not json_output:
+            return "Error: No file or data to publish."
+
+        # If validation passed, proceed to update_dataset
+        update_output = update_dataset(file_path, json_output)
+        return update_output
+    except Exception as e:
+        return f"Error during publication: {str(e)}"
 
 
 # Create Gradio interface

@@ -57,30 +184,49 @@ with gr.Blocks(css_paths=css_path) as app:
     submit_button = gr.Button("Submit", elem_classes="subbutton")
     output = gr.Textbox(label="Output", lines=1)
     json_output = gr.Textbox(visible=False)
-
+    json_file = gr.File(label="Downloadable JSON")
     publish_button = gr.Button(
         "Share your data to the public repository", interactive=False, elem_classes="pubbutton")
 
-    # Event Handlers
+    # Event Handlers - Optimized input flattening
+    def flatten_inputs(components):
+        """
+        Recursively flatten nested lists of components with improved performance.
+        Uses iterative approach and generator expressions for better memory efficiency.
+        """
+        result = []
+        stack = list(reversed(components))  # Use stack to avoid recursion
+
+        while stack:
+            item = stack.pop()
+            if isinstance(item, list):
+                # Add items in reverse order to maintain original sequence
+                stack.extend(reversed(item))
+            else:
+                result.append(item)
+
+        return result
+
+    all_inputs = flatten_inputs(header_components + task_components + measures_components +
+                                system_components + software_components + infrastructure_components +
+                                environment_components + quality_components)
+
+    # Validate input count matches expected structure
+    expected_count = form_parser.get_total_input_count()
+    if len(all_inputs) != expected_count:
+        print(
+            f"Warning: Input count mismatch. Expected {expected_count}, got {len(all_inputs)}")
+
     submit_button.click(
         handle_submit,
-        inputs=
-            *task_components,
-            *measures_components,
-            *system_components,
-            *software_components,
-            *infrastructure_components,
-            *environment_components,
-            *quality_components,
-        ],
-        outputs=[output, file_output, json_output, publish_button]
+        inputs=all_inputs,
+        outputs=[output, json_file, json_output, publish_button]
     )
     # Event Handlers
     publish_button.click(
        handle_publi,
        inputs=[
-            json_output
+            json_file, json_output
        ],
        outputs=[output]
    )
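The `flatten_inputs` helper in the diff above trades recursion for an explicit stack so deeply nested component lists cannot hit Python's recursion limit. A minimal standalone sketch of the same pattern, with plain strings standing in for the Gradio components:

```python
def flatten_inputs(components):
    """Flatten arbitrarily nested lists into one flat list, preserving order."""
    result = []
    stack = list(reversed(components))  # explicit stack instead of recursion

    while stack:
        item = stack.pop()
        if isinstance(item, list):
            # re-push children in reverse so they pop in original order
            stack.extend(reversed(item))
        else:
            result.append(item)

    return result


print(flatten_inputs(["a", ["b", ["c", "d"]], "e"]))
```

Popping from the end and re-pushing children reversed is what keeps the output in the original left-to-right order.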
assets/utils/validation.py
CHANGED

@@ -1,33 +1,74 @@
+import json
+from referencing import Registry, Resource
+from jsonschema import Draft202012Validator
+import requests
+
+
+def fetch_json_from_url(url: str):
+    """Fetch JSON content from a GitHub raw URL"""
+    try:
+        response = requests.get(url, timeout=10)
+        response.raise_for_status()
+        return response.json()
+    except (requests.exceptions.RequestException, json.JSONDecodeError) as e:
+        print(f"Error fetching/parsing {url}: {e}")
-    for item in v:
-        if isinstance(item, dict):
-            result = find_field(item, field)
-            if result is not None:
-                return result
     return None
 
-    missing_fields = []
 
+# GitHub URLs for the schemas
+SCHEMA_URLS = {
+    "algorithm": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/algorithm_schema.json",
+    "dataset": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/dataset_schema.json",
+    "measure": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/measure_schema.json",
+    "hardware": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/hardware_schema.json",
+    "report": "https://raw.githubusercontent.com/Boavizta/BoAmps/main/model/report_schema.json"
+}
+
+
+def load_schemas():
+    """Load all schemas from GitHub URLs"""
+    schemas = {}
+    for name, url in SCHEMA_URLS.items():
+        schemas[name] = fetch_json_from_url(url)
+    return schemas
+
+
+def create_registry(schemas):
+    """Create a registry with all sub-schemas"""
+    sub_schema_names = ["algorithm", "dataset", "measure", "hardware"]
+    resources = [
+        (SCHEMA_URLS[name], Resource.from_contents(schemas[name]))
+        for name in sub_schema_names
+    ]
+    return Registry().with_resources(resources)
+
+
+# Load schemas once at module import
+_schemas = load_schemas()
+_registry = create_registry(_schemas)
+
+
+def validate_boamps_schema(instance):
+    """Validate instance against BoAmps report schema"""
+    # Create validator using pre-loaded schemas and registry
+    validator = Draft202012Validator(_schemas["report"], registry=_registry)
+
+    # Validate
+    if validator.is_valid(instance):
+        return True, "All required fields are filled & your report has the right format!"
+
+    # Build error message
+    errors = list(validator.iter_errors(instance))
+    error_lines = [
+        f"The json file does not correspond to the schema, there are {len(errors)} errors:\n",
+        "-" * 50
+    ]
 
-        return False, f"The following fields are required: {', '.join(missing_fields)}"
-    return True, "All required fields are filled."
+    for err in errors:
+        error_lines.extend([
+            f"Error on data: {err.json_path}",
+            f"  --> {err.message}",
+            "-" * 50
+        ])
 
+    return False, "\n".join(error_lines)
src/services/dataset_upload.py
ADDED

@@ -0,0 +1,37 @@
+from huggingface_hub import HfApi, login
+from src.services.util import HF_TOKEN, DATASET_NAME
+import os
+
+
+def init_huggingface():
+    """Initialize Hugging Face authentication."""
+    if HF_TOKEN is None:
+        raise ValueError(
+            "Hugging Face token not found in environment variables.")
+    login(token=HF_TOKEN)
+
+
+def update_dataset(file_path, json_data):
+    """Update the Hugging Face dataset with new data."""
+
+    if json_data is None or json_data.startswith("The following fields are required"):
+        return json_data or "No data to submit. Please fill in all required fields."
+    try:
+        # Initialize Hugging Face authentication
+        init_huggingface()
+        api = HfApi()
+
+        short_filename = os.path.basename(file_path)
+
+        api.upload_file(
+            path_or_fileobj=file_path,
+            repo_id=DATASET_NAME,
+            path_in_repo=f"data/{short_filename}",
+            repo_type="dataset",
+            commit_message=f"Add new BoAmps report data: {short_filename}",
+            create_pr=True,
+        )
+
+    except Exception as e:
+        return f"Error updating dataset: {str(e)}"
+    return "Data submitted successfully and dataset updated! Consult the data here: https://huggingface.co/datasets/boavizta/open_data_boamps"
src/services/form_parser.py
ADDED

@@ -0,0 +1,147 @@
+"""
+Form parser configuration and utilities for handling Gradio form inputs.
+This module provides a centralized way to manage form structure and parsing.
+"""
+
+from dataclasses import dataclass
+from typing import List, Any, Tuple
+
+
+@dataclass
+class FormSection:
+    """Represents a section of the form with its field count."""
+    name: str
+    field_count: int
+    fields: List[str] = None
+
+
+@dataclass
+class DynamicSection:
+    """Represents a dynamic section with multiple rows and fields."""
+    name: str
+    fields: List[str]
+    max_rows: int = 5
+
+    @property
+    def total_components(self) -> int:
+        return len(self.fields) * self.max_rows
+
+
+# Form structure configuration
+FORM_STRUCTURE = [
+    FormSection("header", 11, [
+        "licensing", "formatVersion", "formatVersionSpecificationUri", "reportId",
+        "reportDatetime", "reportStatus", "publisher_name", "publisher_division",
+        "publisher_projectName", "publisher_confidentialityLevel", "publisher_publicKey"
+    ]),
+
+    FormSection("task_simple", 3, [
+        "taskFamily", "taskStage", "nbRequest"
+    ]),
+
+    DynamicSection("algorithms", [
+        "trainingType", "algorithmType", "algorithmName", "algorithmUri",
+        "foundationModelName", "foundationModelUri", "parametersNumber", "framework",
+        "frameworkVersion", "classPath", "layersNumber", "epochsNumber", "optimizer", "quantization"
+    ]),
+
+    DynamicSection("dataset", [
+        "dataUsage", "dataType", "dataFormat", "dataSize", "dataQuantity",
+        "shape", "source", "sourceUri", "owner"
+    ]),
+
+    FormSection("task_final", 3, [
+        "measuredAccuracy", "estimatedAccuracy", "taskDescription"
+    ]),
+
+    DynamicSection("measures", [
+        "measurementMethod", "manufacturer", "version", "cpuTrackingMode", "gpuTrackingMode",
+        "averageUtilizationCpu", "averageUtilizationGpu", "powerCalibrationMeasurement",
+        "durationCalibrationMeasurement", "powerConsumption", "measurementDuration", "measurementDateTime"
+    ]),
+
+    FormSection("system", 3, [
+        "osystem", "distribution", "distributionVersion"
+    ]),
+
+    FormSection("software", 2, [
+        "language", "version_software"
+    ]),
+
+    FormSection("infrastructure_simple", 4, [
+        "infraType", "cloudProvider", "cloudInstance", "cloudService"
+    ]),
+
+    DynamicSection("infrastructure_components", [
+        "componentName", "componentType", "nbComponent", "memorySize",
+        "manufacturer_infra", "family", "series", "share"
+    ]),
+
+    FormSection("environment", 7, [
+        "country", "latitude", "longitude", "location",
+        "powerSupplierType", "powerSource", "powerSourceCarbonIntensity"
+    ]),
+
+    FormSection("quality", 1, ["quality"])
+]
+
+
+class FormParser:
+    """Utility class for parsing form inputs based on the form structure."""
+
+    def __init__(self):
+        self.structure = FORM_STRUCTURE
+
+    def parse_inputs(self, inputs: Tuple[Any, ...]) -> dict:
+        """
+        Parse form inputs into a structured dictionary.
+
+        Args:
+            inputs: Tuple of all form input values
+
+        Returns:
+            dict: Parsed form data organized by sections
+        """
+        parsed_data = {}
+        idx = 0
+
+        for section in self.structure:
+            if isinstance(section, FormSection):
+                # Simple section - extract values directly
+                section_data = inputs[idx:idx + section.field_count]
+                if section.fields:
+                    parsed_data[section.name] = dict(
+                        zip(section.fields, section_data))
+                else:
+                    parsed_data[section.name] = section_data
+                idx += section.field_count
+
+            elif isinstance(section, DynamicSection):
+                # Dynamic section - extract and reshape data
+                flat_data = inputs[idx:idx + section.total_components]
+                idx += section.total_components
+
+                # Reshape flat data into field-organized lists
+                section_data = {}
+                for field_idx, field_name in enumerate(section.fields):
+                    start_pos = field_idx * section.max_rows
+                    end_pos = start_pos + section.max_rows
+                    section_data[field_name] = flat_data[start_pos:end_pos]
+
+                parsed_data[section.name] = section_data
+
+        return parsed_data
+
+    def get_total_input_count(self) -> int:
+        """Get the total number of expected inputs."""
+        total = 0
+        for section in self.structure:
+            if isinstance(section, FormSection):
+                total += section.field_count
+            elif isinstance(section, DynamicSection):
+                total += section.total_components
+        return total
+
+
+# Global parser instance
+form_parser = FormParser()
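The slicing logic in `FormParser.parse_inputs` can be sketched independently: a simple section consumes `field_count` values from the flat input tuple, while a dynamic section consumes `len(fields) * max_rows` values grouped per field. A toy version under those assumptions (the `parse_sections` helper, section names, and values below are illustrative, not part of the PR):

```python
def parse_sections(inputs, structure):
    """structure: list of ("simple", name, fields) or ("dynamic", name, fields, max_rows)."""
    parsed, idx = {}, 0
    for kind, name, fields, *rest in structure:
        if kind == "simple":
            # one value per field, in declaration order
            parsed[name] = dict(zip(fields, inputs[idx:idx + len(fields)]))
            idx += len(fields)
        else:
            # dynamic: one contiguous slice of max_rows values per field
            max_rows = rest[0]
            parsed[name] = {
                f: inputs[idx + i * max_rows: idx + (i + 1) * max_rows]
                for i, f in enumerate(fields)
            }
            idx += len(fields) * max_rows
    return parsed


structure = [
    ("simple", "task_simple", ["taskFamily", "taskStage", "nbRequest"]),
    ("dynamic", "software", ["language", "version"], 2),
]
inputs = ("inference", "run", 10, "py", "js", "3.12", "ES2023")
print(parse_sections(inputs, structure))
```

This mirrors why `get_total_input_count` matters in `app.py`: if the flat tuple is shorter or longer than the structure expects, every slice after the mismatch lands on the wrong values.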
src/services/huggingface.py
DELETED

```python
from huggingface_hub import login
from datasets import load_dataset, Dataset, concatenate_datasets
import json
from src.services.util import HF_TOKEN, DATASET_NAME


def init_huggingface():
    """Initialize Hugging Face authentication."""
    if HF_TOKEN is None:
        raise ValueError(
            "Hugging Face token not found in environment variables.")
    login(token=HF_TOKEN)


def update_dataset(json_data):
    """Update the Hugging Face dataset with new data."""
    if json_data is None or json_data.startswith("The following fields are required"):
        return json_data or "No data to submit. Please fill in all required fields."

    try:
        data = json.loads(json_data)
    except json.JSONDecodeError:
        return "Invalid JSON data. Please ensure all required fields are filled correctly."

    try:
        dataset = load_dataset(DATASET_NAME, split="train")
        print(dataset)
    except:
        dataset = Dataset.from_dict({})

    new_data = create_flattened_data(data)
    new_dataset = Dataset.from_dict(new_data)

    if len(dataset) > 0:
        print("dataset intitial")
        print(dataset)
        print("data to add ")
        print(new_dataset)
        updated_dataset = concatenate_datasets([dataset, new_dataset])
    else:
        updated_dataset = new_dataset

    updated_dataset.push_to_hub(DATASET_NAME)
    return "Data submitted successfully and dataset updated! Consult the data [here](https://huggingface.co/datasets/boavizta/BoAmps_data)"


def create_flattened_data(data):
    """Create a flattened data structure for the report."""
    # Handle algorithms
    algorithms = data.get("task", {}).get("algorithms", [])
    fields = ["trainingType", "algorithmType", "algorithmName", "algorithmUri", "foundationModelName", "foundationModelUri",
              "parametersNumber", "framework", "frameworkVersion", "classPath", "layersNumber", "epochsNumber", "optimizer", "quantization"]
    algorithms_data = {field: "| ".join(str(algo.get(
        field)) for algo in algorithms if algo.get(field)) or "" for field in fields}
    trainingType_str = algorithms_data["trainingType"]
    algorithmType_str = algorithms_data["algorithmType"]
    algorithmName_str = algorithms_data["algorithmName"]
    algorithmUri_str = algorithms_data["algorithmUri"]
    foundationModelName_str = algorithms_data["foundationModelName"]
    foundationModelUri_str = algorithms_data["foundationModelUri"]
    parametersNumber_str = algorithms_data["parametersNumber"]
    framework_str = algorithms_data["framework"]
    frameworkVersion_str = algorithms_data["frameworkVersion"]
    classPath_str = algorithms_data["classPath"]
    layersNumber_str = algorithms_data["layersNumber"]
    epochsNumber_str = algorithms_data["epochsNumber"]
    optimizer_str = algorithms_data["optimizer"]
    quantization_str = algorithms_data["quantization"]

    # Handle dataset
    dataset = data.get("task", {}).get("dataset", [])
    fields = ["dataUsage", "dataType", "dataFormat", "dataSize",
              "dataQuantity", "shape", "source", "sourceUri", "owner"]
    dataset_data = {field: "| ".join(
        str(d.get(field)) for d in dataset if d.get(field)) or "" for field in fields}
    dataUsage_str = dataset_data["dataUsage"]
    dataType_str = dataset_data["dataType"]
    dataFormat_str = dataset_data["dataFormat"]
    dataSize_str = dataset_data["dataSize"]
    dataQuantity_str = dataset_data["dataQuantity"]
    shape_str = dataset_data["shape"]
    source_str = dataset_data["source"]
    sourceUri_str = dataset_data["sourceUri"]
    owner_str = dataset_data["owner"]

    # Handle measures
    measures = data.get("measures", [])
    fields = ["measurementMethod", "manufacturer", "version", "cpuTrackingMode", "gpuTrackingMode", "averageUtilizationCpu", "averageUtilizationGpu",
              "powerCalibrationMeasurement", "durationCalibrationMeasurement", "powerConsumption", "measurementDuration", "measurementDateTime"]
    measures_data = {field: "| ".join(str(measure.get(
        field)) for measure in measures if measure.get(field)) or "" for field in fields}
    measurementMethod_str = measures_data["measurementMethod"]
    manufacturer_str = measures_data["manufacturer"]
    version_str = measures_data["version"]
    cpuTrackingMode_str = measures_data["cpuTrackingMode"]
    gpuTrackingMode_str = measures_data["gpuTrackingMode"]
    averageUtilizationCpu_str = measures_data["averageUtilizationCpu"]
    averageUtilizationGpu_str = measures_data["averageUtilizationGpu"]
    powerCalibrationMeasurement_str = measures_data["powerCalibrationMeasurement"]
    durationCalibrationMeasurement_str = measures_data["durationCalibrationMeasurement"]
    powerConsumption_str = measures_data["powerConsumption"]
    measurementDuration_str = measures_data["measurementDuration"]
    measurementDateTime_str = measures_data["measurementDateTime"]

    # Handle components
    components = data.get("infrastructure", {}).get("components", [])
    fields = ["componentName", "componentType", "nbComponent", "memorySize",
              "manufacturer", "family", "series", "share"]

    # Generate concatenated strings for each field
    component_data = {field: "| ".join(str(comp.get(
        field)) for comp in components if comp.get(field)) or "" for field in fields}

    componentName_str = component_data["componentName"]
    componentType_str = component_data["componentType"]
    nbComponent_str = component_data["nbComponent"]
    memorySize_str = component_data["memorySize"]
    manufacturer_infra_str = component_data["manufacturer"]
    family_str = component_data["family"]
    series_str = component_data["series"]
    share_str = component_data["share"]

    return {
        # Header
        "licensing": [data.get("header", {}).get("licensing", "")],
        "formatVersion": [data.get("header", {}).get("formatVersion", "")],
        "formatVersionSpecificationUri": [data.get("header", {}).get("formatVersionSpecificationUri", "")],
        "reportId": [data.get("header", {}).get("reportId", "")],
        "reportDatetime": [data.get("header", {}).get("reportDatetime", "")],
        "reportStatus": [data.get("header", {}).get("reportStatus", "")],
        "publisher_name": [data.get("header", {}).get("publisher", {}).get("name", "")],
        "publisher_division": [data.get("header", {}).get("publisher", {}).get("division", "")],
        "publisher_projectName": [data.get("header", {}).get("publisher", {}).get("projectName", "")],
        "publisher_confidentialityLevel": [data.get("header", {}).get("publisher", {}).get("confidentialityLevel", "")],
        "publisher_publicKey": [data.get("header", {}).get("publisher", {}).get("publicKey", "")],

        # Task
        "taskStage": [data.get("task", {}).get("taskStage", "")],
        "taskFamily": [data.get("task", {}).get("taskFamily", "")],
        "nbRequest": [data.get("task", {}).get("nbRequest", "")],
        # Algorithms
        "trainingType": [trainingType_str],
        "algorithmType": [algorithmType_str],
        "algorithmName": [algorithmName_str],
        "algorithmUri": [algorithmUri_str],
        "foundationModelName": [foundationModelName_str],
        "foundationModelUri": [foundationModelUri_str],
        "parametersNumber": [parametersNumber_str],
        "framework": [framework_str],
        "frameworkVersion": [frameworkVersion_str],
        "classPath": [classPath_str],
        "layersNumber": [layersNumber_str],
        "epochsNumber": [epochsNumber_str],
        "optimizer": [optimizer_str],
        "quantization": [quantization_str],
        # Dataset
        "dataUsage": [dataUsage_str],
        "dataType": [dataType_str],
        "dataFormat": [dataFormat_str],
        "dataSize": [dataSize_str],
        "dataQuantity": [dataQuantity_str],
        "shape": [shape_str],
        "source": [source_str],
        "sourceUri": [sourceUri_str],
        "owner": [owner_str],
        "measuredAccuracy": [data.get("task", {}).get("measuredAccuracy", "")],
        "estimatedAccuracy": [data.get("task", {}).get("estimatedAccuracy", "")],
        "taskDescription": [data.get("task", {}).get("taskDescription", "")],

        # Measures
        "measurementMethod": [measurementMethod_str],
        "manufacturer": [manufacturer_str],
        "version": [version_str],
        "cpuTrackingMode": [cpuTrackingMode_str],
        "gpuTrackingMode": [gpuTrackingMode_str],
        "averageUtilizationCpu": [averageUtilizationCpu_str],
        "averageUtilizationGpu": [averageUtilizationGpu_str],
        "powerCalibrationMeasurement": [powerCalibrationMeasurement_str],
        "durationCalibrationMeasurement": [durationCalibrationMeasurement_str],
        "powerConsumption": [powerConsumption_str],
        "measurementDuration": [measurementDuration_str],
        "measurementDateTime": [measurementDateTime_str],

        # System
        "os": [data.get("system", {}).get("os", "")],
        "distribution": [data.get("system", {}).get("distribution", "")],
        "distributionVersion": [data.get("system", {}).get("distributionVersion", "")],

        # Software
        "language": [data.get("software", {}).get("language", "")],
        "version_software": [data.get("software", {}).get("version_software", "")],

        # Infrastructure
        "infraType": [data.get("infrastructure", {}).get("infra_type", "")],
        "cloudProvider": [data.get("infrastructure", {}).get("cloudProvider", "")],
        "cloudInstance": [data.get("infrastructure", {}).get("cloudInstance", "")],
        "cloudService": [data.get("infrastructure", {}).get("cloudService", "")],
        "componentName": [componentName_str],
        "componentType": [componentType_str],
        "nbComponent": [nbComponent_str],
        "memorySize": [memorySize_str],
        "manufacturer_infra": [manufacturer_infra_str],
        "family": [family_str],
        "series": [series_str],
        "share": [share_str],

        # Environment
        "country": [data.get("environment", {}).get("country", "")],
        "latitude": [data.get("environment", {}).get("latitude", "")],
        "longitude": [data.get("environment", {}).get("longitude", "")],
        "location": [data.get("environment", {}).get("location", "")],
        "powerSupplierType": [data.get("environment", {}).get("powerSupplierType", "")],
        "powerSource": [data.get("environment", {}).get("powerSource", "")],
        "powerSourceCarbonIntensity": [data.get("environment", {}).get("powerSourceCarbonIntensity", "")],

        # Quality
        "quality": [data.get("quality", "")],
    }


"""
def create_flattened_data(data):
    out = {}

    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x

    flatten(data)
    return out
"""
```
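For reference, the deleted module's flattening relied on one pattern throughout: for each field, the values of all items in a list of dicts are joined with `"| "` so that the list fits a single dataset row. A standalone sketch of that pattern (`flatten_field` is an illustrative name, not from the codebase):

```python
def flatten_field(items, field):
    """Join one field's values across a list of dicts, skipping missing values."""
    return "| ".join(str(it.get(field)) for it in items if it.get(field)) or ""


algorithms = [{"framework": "pytorch"}, {"framework": "sklearn"}, {}]
print(flatten_field(algorithms, "framework"))  # pytorch| sklearn
print(repr(flatten_field(algorithms, "optimizer")))  # '' (no value anywhere)
```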
src/services/json_generator.py
CHANGED

```diff
@@ -1,7 +1,67 @@
 import json
 import tempfile
 from datetime import datetime
-from assets.utils.validation import
+from assets.utils.validation import validate_boamps_schema
+from src.services.report_builder import ReportBuilder
+import os
+
+
+def process_component_list(fields_dict):
+    """
+    Generic helper that builds a list of components from a dictionary of fields.
+
+    Args:
+        fields_dict (dict): Dictionary whose keys are field names and whose
+            values are lists of Gradio components or gr.State objects.
+
+    Returns:
+        list: List of dictionaries, one per component.
+    """
+    component_list = []
+
+    # Extract values from different input types
+    processed_fields = {}
+    for field_name, field_values in fields_dict.items():
+        if hasattr(field_values, 'value'):  # It's a gr.State object
+            processed_fields[field_name] = field_values.value if field_values.value else []
+        elif isinstance(field_values, list) and len(field_values) > 0:
+            # It's a list of Gradio components or values
+            values = []
+            for item in field_values:
+                if hasattr(item, '__class__') and 'gradio' in str(item.__class__):
+                    # It's a Gradio component; the value was passed as input to
+                    # this function, so the caller must pass values directly.
+                    values.append(item if item is not None else "")
+                else:
+                    # It's already a value
+                    values.append(item if item is not None else "")
+            processed_fields[field_name] = values
+        else:
+            processed_fields[field_name] = field_values if field_values else []
+
+    # Find the maximum number of items across all fields
+    max_items = 0
+    for field_values in processed_fields.values():
+        if field_values:
+            max_items = max(max_items, len(field_values))
+
+    # Build the components
+    for i in range(max_items):
+        component = {}
+
+        for field_name, field_values in processed_fields.items():
+            if i < len(field_values):
+                value = field_values[i]
+                # Only add the field if it has a meaningful value (not empty, not just whitespace)
+                if value is not None and str(value).strip() != "":
+                    component[field_name] = value
+
+        # Add the component if it has at least one field
+        if component:
+            component_list.append(component)
+
+    return component_list
 
 
 def generate_json(
@@ -20,7 +80,7 @@ def generate_json(
     durationCalibrationMeasurement, powerConsumption,
     measurementDuration, measurementDateTime,
     # System
-
+    osystem, distribution, distributionVersion,
     # Software
     language, version_software,
     # Infrastructure
@@ -33,192 +93,154 @@ def generate_json(
     # Quality
     quality
 ):
-    """Generate JSON data from form inputs."""
-    # … (removed lines 37-173 are blank in this rendering)
-    if software:
-        report["software"] = software
-
-    # proceed infrastructure
-    infrastructure = {}
-    if infraType:
-        infrastructure["infraType"] = infraType
-    if cloudProvider:
-        infrastructure["cloudProvider"] = cloudProvider
-    if cloudInstance:
-        infrastructure["cloudInstance"] = cloudInstance
-    if cloudService:
-        infrastructure["cloudService"] = cloudService
-    if components_list:
-        infrastructure["components"] = components_list
-    report["infrastructure"] = infrastructure
-
-    # proceed environment
-    environment = {}
-    if country:
-        environment["country"] = country
-    if latitude:
-        environment["latitude"] = latitude
-    if longitude:
-        environment["longitude"] = longitude
-    if location:
-        environment["location"] = location
-    if powerSupplierType:
-        environment["powerSupplierType"] = powerSupplierType
-    if powerSource:
-        environment["powerSource"] = powerSource
-    if powerSourceCarbonIntensity:
-        environment["powerSourceCarbonIntensity"] = powerSourceCarbonIntensity
-    if environment:
-        report["environment"] = environment
-
-    # proceed quality
-    if quality:
-        report["quality"] = quality
-
-    # Validate obligatory fields
-    is_valid, message = validate_obligatory_fields(report)
-    if not is_valid:
-        return message, None, ""
+    """Generate JSON data from form inputs using optimized ReportBuilder."""
+
+    try:
+        # Use ReportBuilder for cleaner, more maintainable code
+        builder = ReportBuilder()
+
+        # Build header section
+        header_data = {
+            "licensing": licensing,
+            "formatVersion": formatVersion,
+            "formatVersionSpecificationUri": formatVersionSpecificationUri,
+            "reportId": reportId,
+            "reportDatetime": reportDatetime or datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
+            "reportStatus": reportStatus,
+            "publisher_name": publisher_name,
+            "publisher_division": publisher_division,
+            "publisher_projectName": publisher_projectName,
+            "publisher_confidentialityLevel": publisher_confidentialityLevel,
+            "publisher_publicKey": publisher_publicKey
+        }
+        builder.add_header(header_data)
+
+        # Build task section
+        task_data = {
+            "taskStage": taskStage,
+            "taskFamily": taskFamily,
+            "nbRequest": nbRequest,
+            "measuredAccuracy": measuredAccuracy,
+            "estimatedAccuracy": estimatedAccuracy,
+            "taskDescription": taskDescription,
+            "algorithms": {
+                "trainingType": trainingType,
+                "algorithmType": algorithmType,
+                "algorithmName": algorithmName,
+                "algorithmUri": algorithmUri,
+                "foundationModelName": foundationModelName,
+                "foundationModelUri": foundationModelUri,
+                "parametersNumber": parametersNumber,
+                "framework": framework,
+                "frameworkVersion": frameworkVersion,
+                "classPath": classPath,
+                "layersNumber": layersNumber,
+                "epochsNumber": epochsNumber,
+                "optimizer": optimizer,
+                "quantization": quantization
+            },
+            "dataset": {
+                "dataUsage": dataUsage,
+                "dataType": dataType,
+                "dataFormat": dataFormat,
+                "dataSize": dataSize,
+                "dataQuantity": dataQuantity,
+                "shape": shape,
+                "source": source,
+                "sourceUri": sourceUri,
+                "owner": owner
+            }
+        }
+        builder.add_task(task_data)
+
+        # Build measures section
+        measures_data = {
+            "measurementMethod": measurementMethod,
+            "manufacturer": manufacturer,
+            "version": version,
+            "cpuTrackingMode": cpuTrackingMode,
+            "gpuTrackingMode": gpuTrackingMode,
+            "averageUtilizationCpu": averageUtilizationCpu,
+            "averageUtilizationGpu": averageUtilizationGpu,
+            "powerCalibrationMeasurement": powerCalibrationMeasurement,
+            "durationCalibrationMeasurement": durationCalibrationMeasurement,
+            "powerConsumption": powerConsumption,
+            "measurementDuration": measurementDuration,
+            "measurementDateTime": measurementDateTime
+        }
+        builder.add_measures(measures_data)
+
+        # Build system section
+        system_data = {
+            "osystem": osystem,
+            "distribution": distribution,
+            "distributionVersion": distributionVersion
+        }
+        builder.add_system(system_data)
+
+        # Build software section
+        software_data = {
+            "language": language,
+            "version_software": version_software
+        }
+        builder.add_software(software_data)
+
+        # Build infrastructure section
+        infrastructure_data = {
+            "infraType": infraType,
+            "cloudProvider": cloudProvider,
+            "cloudInstance": cloudInstance,
+            "cloudService": cloudService,
+            "components": {
+                "componentName": componentName,
+                "componentType": componentType,
+                "nbComponent": nbComponent,
+                "memorySize": memorySize,
+                "manufacturer": manufacturer_infra,
+                "family": family,
+                "series": series,
+                "share": share
+            }
+        }
+        builder.add_infrastructure(infrastructure_data)
+
+        # Build environment section
+        environment_data = {
+            "country": country,
+            "latitude": latitude,
+            "longitude": longitude,
+            "location": location,
+            "powerSupplierType": powerSupplierType,
+            "powerSource": powerSource,
+            "powerSourceCarbonIntensity": powerSourceCarbonIntensity
+        }
+        builder.add_environment(environment_data)
+
+        # Add quality
+        builder.add_quality(quality)
+
+        # Build the final report
+        report = builder.build()
+
+        # Validate that the schema follows the BoAmps format
+        is_valid, message = validate_boamps_schema(report)
+        if not is_valid:
+            return message, None, ""
+
+        # Create and save the JSON file
+        filename = f"report_{taskStage}_{taskFamily}_{infraType}_{reportId}.json"
+        filename = filename.replace(" ", "-")
+
         # Create the JSON string
+        json_str = json.dumps(report, indent=4, ensure_ascii=False)
+
+        # Write JSON to a temporary file with the desired filename
+        temp_dir = tempfile.gettempdir()
+        temp_path = os.path.join(temp_dir, filename)
+        with open(temp_path, "w", encoding="utf-8") as tmp:
+            tmp.write(json_str)
+
+        return message, temp_path, json_str
+
+    except Exception as e:
+        return f"Error generating JSON: {str(e)}", None, ""
```
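The row-assembly logic of `process_component_list` can be condensed into a standalone sketch (`rows_to_components` is an illustrative name; it mirrors the empty-cell and empty-row filtering, without the Gradio/gr.State unwrapping):

```python
def rows_to_components(fields):
    """Turn {field: [row values]} into a list of per-row dicts.

    Empty or whitespace-only cells are dropped, and rows with no
    remaining fields are skipped entirely.
    """
    max_items = max((len(v) for v in fields.values() if v), default=0)
    components = []
    for i in range(max_items):
        comp = {k: v[i] for k, v in fields.items()
                if i < len(v) and v[i] is not None and str(v[i]).strip() != ""}
        if comp:
            components.append(comp)
    return components


fields = {"componentName": ["A100", "", ""], "nbComponent": ["4", "2", ""]}
print(rows_to_components(fields))
# [{'componentName': 'A100', 'nbComponent': '4'}, {'nbComponent': '2'}]
```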
src/services/report_builder.py
ADDED

@@ -0,0 +1,273 @@

```python
"""
JSON processing utilities for BoAmps report generation.
Provides optimized functions for data transformation and organization.
"""

from typing import Dict, List, Any, Optional


def create_section_dict(data: Dict[str, Any], required_fields: List[str] = None) -> Dict[str, Any]:
    """
    Create a section dictionary, including only non-empty values.

    Args:
        data: Dictionary of field values
        required_fields: List of fields that should always be included if provided

    Returns:
        Dictionary with non-empty values only, or empty dict if no meaningful values
    """
    section = {}
    required_fields = required_fields or []

    for key, value in data.items():
        # Include only if it's a required field with meaningful value, or if it's meaningful
        if key in required_fields and is_meaningful_value(value):
            section[key] = value
        elif key not in required_fields and is_meaningful_value(value):
            section[key] = value

    return section


def is_meaningful_value(value: Any) -> bool:
    """
    Check if a value is meaningful (not empty, not just whitespace).

    Args:
        value: Value to check

    Returns:
        True if the value is meaningful, False otherwise
    """
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, (list, dict)):
        return len(value) > 0
    return bool(value)


def process_dynamic_component_list(field_data: Dict[str, List[Any]], max_rows: int = 5) -> List[Dict[str, Any]]:
    """
    Process dynamic component data into a list of component dictionaries.
    Optimized version of the original process_component_list function.

    Args:
        field_data: Dictionary where keys are field names and values are lists of row values
        max_rows: Maximum number of rows to process

    Returns:
        List of component dictionaries
    """
    components = []

    # Find the actual number of rows with data
    actual_rows = 0
    for field_values in field_data.values():
        if field_values:
            # Count non-empty values from the end
            for i in range(len(field_values) - 1, -1, -1):
                if is_meaningful_value(field_values[i]):
                    actual_rows = max(actual_rows, i + 1)
                    break

    # Create components for rows that have data
    for row_idx in range(min(actual_rows, max_rows)):
        component = {}

        # Add fields that have meaningful values for this row
        for field_name, field_values in field_data.items():
            if row_idx < len(field_values) and is_meaningful_value(field_values[row_idx]):
                component[field_name] = field_values[row_idx]

        # Only add component if it has at least one field
        if component:
            components.append(component)

    return components


def create_publisher_section(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """
    Create publisher section with proper validation.

    Args:
        data: Dictionary containing all header data

    Returns:
        Publisher dictionary or None if no publisher data
    """
    publisher_fields = {
        "name": data.get("publisher_name"),
        "division": data.get("publisher_division"),
        "projectName": data.get("publisher_projectName"),
        "confidentialityLevel": data.get("publisher_confidentialityLevel"),
        "publicKey": data.get("publisher_publicKey")
    }

    publisher = create_section_dict(
        publisher_fields, required_fields=["confidentialityLevel"])
    return publisher if publisher else None


class ReportBuilder:
    """
    Builder class for creating BoAmps reports with optimized data processing.
    """

    def __init__(self):
        self.report = {}

    def add_header(self, header_data: Dict[str, Any]) -> 'ReportBuilder':
        """Add header section to the report."""
        header_fields = {
            "licensing": header_data.get("licensing"),
            "formatVersion": header_data.get("formatVersion"),
# … (lines 131-273 of the new file are cut off in this rendering)
```
| 129 |
+
"formatVersion": header_data.get("formatVersion"),
|
| 130 |
+
"formatVersionSpecificationUri": header_data.get("formatVersionSpecificationUri"),
|
| 131 |
+
"reportId": header_data.get("reportId"),
|
| 132 |
+
"reportDatetime": header_data.get("reportDatetime"),
|
| 133 |
+
"reportStatus": header_data.get("reportStatus")
|
| 134 |
+
}
|
| 135 |
+
|
| 136 |
+
header = create_section_dict(header_fields, required_fields=[
|
| 137 |
+
"reportId", "reportDatetime"])
|
| 138 |
+
|
| 139 |
+
# Add publisher if available
|
| 140 |
+
publisher = create_publisher_section(header_data)
|
| 141 |
+
if publisher:
|
| 142 |
+
header["publisher"] = publisher
|
| 143 |
+
|
| 144 |
+
if header:
|
| 145 |
+
self.report["header"] = header
|
| 146 |
+
|
| 147 |
+
return self
|
| 148 |
+
|
| 149 |
+
def add_task(self, task_data: Dict[str, Any]) -> 'ReportBuilder':
|
| 150 |
+
"""Add task section to the report."""
|
| 151 |
+
task = {}
|
| 152 |
+
|
| 153 |
+
# Simple task fields
|
| 154 |
+
simple_fields = {
|
| 155 |
+
"taskStage": task_data.get("taskStage"),
|
| 156 |
+
"taskFamily": task_data.get("taskFamily"),
|
| 157 |
+
"nbRequest": task_data.get("nbRequest"),
|
| 158 |
+
"measuredAccuracy": task_data.get("measuredAccuracy"),
|
| 159 |
+
"estimatedAccuracy": task_data.get("estimatedAccuracy"),
|
| 160 |
+
"taskDescription": task_data.get("taskDescription")
|
| 161 |
+
}
|
| 162 |
+
|
| 163 |
+
task.update(create_section_dict(simple_fields,
|
| 164 |
+
required_fields=["taskStage", "taskFamily"]))
|
| 165 |
+
|
| 166 |
+
# Process algorithms
|
| 167 |
+
if "algorithms" in task_data:
|
| 168 |
+
algorithms = process_dynamic_component_list(
|
| 169 |
+
task_data["algorithms"])
|
| 170 |
+
if algorithms:
|
| 171 |
+
task["algorithms"] = algorithms
|
| 172 |
+
|
| 173 |
+
# Process dataset
|
| 174 |
+
if "dataset" in task_data:
|
| 175 |
+
dataset = process_dynamic_component_list(task_data["dataset"])
|
| 176 |
+
if dataset:
|
| 177 |
+
task["dataset"] = dataset
|
| 178 |
+
|
| 179 |
+
self.report["task"] = task
|
| 180 |
+
return self
|
| 181 |
+
|
| 182 |
+
def add_measures(self, measures_data: Dict[str, List[Any]]) -> 'ReportBuilder':
|
| 183 |
+
"""Add measures section to the report."""
|
| 184 |
+
measures = process_dynamic_component_list(measures_data)
|
| 185 |
+
if measures:
|
| 186 |
+
self.report["measures"] = measures
|
| 187 |
+
return self
|
| 188 |
+
|
| 189 |
+
def add_system(self, system_data: Dict[str, Any]) -> 'ReportBuilder':
|
| 190 |
+
"""Add system section to the report."""
|
| 191 |
+
system_fields = {
|
| 192 |
+
"os": system_data.get("osystem"),
|
| 193 |
+
"distribution": system_data.get("distribution"),
|
| 194 |
+
"distributionVersion": system_data.get("distributionVersion")
|
| 195 |
+
}
|
| 196 |
+
|
| 197 |
+
system = create_section_dict(system_fields, required_fields=["os"])
|
| 198 |
+
# Only add system section if it has meaningful values
|
| 199 |
+
if system:
|
| 200 |
+
self.report["system"] = system
|
| 201 |
+
return self
|
| 202 |
+
|
| 203 |
+
def add_software(self, software_data: Dict[str, Any]) -> 'ReportBuilder':
|
| 204 |
+
"""Add software section to the report."""
|
| 205 |
+
software_fields = {
|
| 206 |
+
"language": software_data.get("language"),
|
| 207 |
+
"version": software_data.get("version_software")
|
| 208 |
+
}
|
| 209 |
+
|
| 210 |
+
software = create_section_dict(
|
| 211 |
+
software_fields, required_fields=["language"])
|
| 212 |
+
# Only add software section if it has meaningful values
|
| 213 |
+
if software:
|
| 214 |
+
self.report["software"] = software
|
| 215 |
+
return self
|
| 216 |
+
|
| 217 |
+
def add_infrastructure(self, infra_data: Dict[str, Any]) -> 'ReportBuilder':
|
| 218 |
+
"""Add infrastructure section to the report."""
|
| 219 |
+
infrastructure = {}
|
| 220 |
+
|
| 221 |
+
# Simple infrastructure fields
|
| 222 |
+
simple_fields = {
|
| 223 |
+
"infraType": infra_data.get("infraType"),
|
| 224 |
+
"cloudProvider": infra_data.get("cloudProvider"),
|
| 225 |
+
"cloudInstance": infra_data.get("cloudInstance"),
|
| 226 |
+
"cloudService": infra_data.get("cloudService")
|
| 227 |
+
}
|
| 228 |
+
|
| 229 |
+
# Add simple fields only if they have meaningful values
|
| 230 |
+
simple_infra = create_section_dict(
|
| 231 |
+
simple_fields, required_fields=["infraType"])
|
| 232 |
+
infrastructure.update(simple_infra)
|
| 233 |
+
|
| 234 |
+
# Process components
|
| 235 |
+
if "components" in infra_data:
|
| 236 |
+
components = process_dynamic_component_list(
|
| 237 |
+
infra_data["components"])
|
| 238 |
+
if components:
|
| 239 |
+
infrastructure["components"] = components
|
| 240 |
+
|
| 241 |
+
# Only add infrastructure section if it has meaningful content
|
| 242 |
+
if infrastructure:
|
| 243 |
+
self.report["infrastructure"] = infrastructure
|
| 244 |
+
return self
|
| 245 |
+
|
| 246 |
+
def add_environment(self, env_data: Dict[str, Any]) -> 'ReportBuilder':
|
| 247 |
+
"""Add environment section to the report."""
|
| 248 |
+
env_fields = {
|
| 249 |
+
"country": env_data.get("country"),
|
| 250 |
+
"latitude": env_data.get("latitude"),
|
| 251 |
+
"longitude": env_data.get("longitude"),
|
| 252 |
+
"location": env_data.get("location"),
|
| 253 |
+
"powerSupplierType": env_data.get("powerSupplierType"),
|
| 254 |
+
"powerSource": env_data.get("powerSource"),
|
| 255 |
+
"powerSourceCarbonIntensity": env_data.get("powerSourceCarbonIntensity")
|
| 256 |
+
}
|
| 257 |
+
|
| 258 |
+
environment = create_section_dict(
|
| 259 |
+
env_fields, required_fields=["country"])
|
| 260 |
+
# Only add environment section if it has meaningful values
|
| 261 |
+
if environment:
|
| 262 |
+
self.report["environment"] = environment
|
| 263 |
+
return self
|
| 264 |
+
|
| 265 |
+
def add_quality(self, quality_value: Any) -> 'ReportBuilder':
|
| 266 |
+
"""Add quality field to the report."""
|
| 267 |
+
if is_meaningful_value(quality_value):
|
| 268 |
+
self.report["quality"] = quality_value
|
| 269 |
+
return self
|
| 270 |
+
|
| 271 |
+
def build(self) -> Dict[str, Any]:
|
| 272 |
+
"""Build and return the final report."""
|
| 273 |
+
return self.report
|
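The heart of `report_builder.py` is the column-to-row transposition that `process_dynamic_component_list` performs on the form state. The sketch below condenses the two helpers from the file above into a self-contained script; the field names in the sample data (`algorithmName`, `framework`) are invented for illustration:

```python
from typing import Any, Dict, List


def is_meaningful_value(value: Any) -> bool:
    # Same rules as in report_builder.py: None, blank strings, and empty
    # containers are not meaningful; numbers (including 0) always are.
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (int, float)):
        return True
    if isinstance(value, (list, dict)):
        return len(value) > 0
    return bool(value)


def process_dynamic_component_list(field_data: Dict[str, List[Any]], max_rows: int = 5) -> List[Dict[str, Any]]:
    # Transpose {field: [row values]} into [{field: value} per row],
    # skipping empty cells and trailing empty rows.
    actual_rows = 0
    for field_values in field_data.values():
        for i in range(len(field_values) - 1, -1, -1):
            if is_meaningful_value(field_values[i]):
                actual_rows = max(actual_rows, i + 1)
                break

    components = []
    for row_idx in range(min(actual_rows, max_rows)):
        component = {f: v[row_idx] for f, v in field_data.items()
                     if row_idx < len(v) and is_meaningful_value(v[row_idx])}
        if component:
            components.append(component)
    return components


# Hypothetical form state: two algorithm rows, the second only partially filled
field_data = {
    "algorithmName": ["randomForest", "xgboost", ""],
    "framework": ["scikit-learn", "", ""],
}
print(process_dynamic_component_list(field_data))
# [{'algorithmName': 'randomForest', 'framework': 'scikit-learn'}, {'algorithmName': 'xgboost'}]
```

Partially filled rows survive as partial dictionaries and fully empty trailing rows are dropped, which is what lets the generated report omit optional fields the user never touched.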
src/services/util.py
CHANGED

```diff
@@ -2,16 +2,7 @@ import os
 
 # Hugging Face Configuration
 HF_TOKEN = os.environ.get("HF_TOKEN")
-DATASET_NAME = "boavizta/
-
-# Form Field Configurations
-# not used and verified for now
-MANDATORY_SECTIONS = ["task", "measures", "infrastructure"]
-OBLIGATORY_FIELDS = [
-    "taskStage", "taskFamily", "dataUsage", "dataType",
-    "measurementMethod", "powerConsumption", "infraType", "componentType",
-    "nbComponent"
-]
+DATASET_NAME = "boavizta/open_data_boamps"
 
 # Dropdown Options
 REPORT_STATUS_OPTIONS = ["draft", "final", "corrective", "other"]
```
src/ui/form_components.py
CHANGED

```diff
@@ -1,4 +1,6 @@
+import uuid
 import gradio as gr
+import datetime
 from src.services.util import (
     REPORT_STATUS_OPTIONS, CONFIDENTIALITY_LEVELS, DATA_USAGE_OPTIONS, DATA_FORMAT,
     DATA_TYPES, DATA_SOURCE,
@@ -9,126 +11,150 @@ from src.services.util import (
 
 def create_dynamic_section(section_name, fields_config, initial_count=1, layout="row"):
     """
-    Creates a dynamic section
+    Creates a simplified dynamic section with a fixed number of pre-created rows.
+    This approach prioritizes data preservation over true dynamic functionality.
 
     Args:
         section_name (str): The name of the section (e.g., "Algorithms", "Components").
-        fields_config (list): A list of dictionaries defining the configuration for each field
-            - "type": The Gradio component type (e.g., gr.Textbox, gr.Number).
-            - "label": The label for the field.
-            - "info": Additional information or tooltip for the field.
-            - "value" (optional): The default value for the field.
-            - "kwargs" (optional): Additional keyword arguments for the component.
-            - "elem_classes" (optional): CSS classes for styling the field.
-        initial_count (int): The initial number of rows to render in the section.
+        fields_config (list): A list of dictionaries defining the configuration for each field.
+        initial_count (int): The initial number of rows to show (up to MAX_ROWS).
         layout (str): The layout of the fields in each row ("row" or "column").
 
     Returns:
-        tuple: A tuple containing
-            - count_state: A Gradio state object tracking the number of rows.
-            - field_states: A list of Gradio state objects, one for each field, to store the values of the fields.
-            - add_btn: The "Add" button component for adding new rows.
+        tuple: A tuple containing states and the add button, compatible with existing code.
     """
-    field_states = [gr.State([]) for _ in fields_config]
-    all_components = []
-
-        Returns:
-            tuple: Updated states for all fields.
-        """
-        # Split states and current values
-        # Extract the current states for each field.
-        states = list(states_and_values[:len(fields_config)])
-        # Extract the new values for the fields.
-        current_values = states_and_values[len(fields_config):-1]
-        index = states_and_values[-1]  # The index of the row being updated.
-
-        # Update each field's state
-        for field_idx, (state, value) in enumerate(zip(states, current_values)):
-            # Ensure the state list is long enough to accommodate the current index.
-            while len(state) <= index:
-                state.append("")
-            # Update the value at the specified index.
-            state[index] = value if value is not None else ""
-
-        return tuple(states)
-
-    @gr.render(inputs=count_state)
-    def render_dynamic_section(count):
-        """
-        Renders the dynamic section with the current number of rows and their states.
-
-        Args:
-            count (int): The number of rows to render.
-
-        Returns:
-            list: A list of dynamically generated components for the section.
-        """
-        nonlocal all_components
-        all_components = []  # Reset the list of components for re-rendering.
-
-        for i in range(count):
-            # Create a row or column layout for the current row of fields.
-            with (gr.Row() if layout == "row" else gr.Column()):
-                row_components = []
-                field_refs = []  # References to the current row's components.
-
-                for field_idx, config in enumerate(fields_config):
-                    component = config["type"](
-                        label=f"{config['label']} ({section_name}{
-                        info=config.get("info", ""),
-                        value=config.get("value", ""),
-                        **config.get("kwargs", {}),
-                        elem_classes=config.get("elem_classes", "")
-                    )
-                    row_components.append(component)
-                    field_refs.append(component)
-
-                    inputs=[*field_states, *field_refs, gr.State(i)],
-                    outputs=field_states
-                )
-
-                remove_btn = gr.Button(
-                    lambda x, idx=i, fs=field_states: (
-                        max(0, x - 1),  # Decrease the count of rows.
-                        # Remove the row's values.
-                        *[fs[i].value[:idx] + fs[i].value[idx + 1:] for i in range(len(fs))]
-                    ),
-                    inputs=count_state,
-                    outputs=[count_state, *field_states]
-                )
-                row_components.append(remove_btn)
-
-                all_components.extend(row_components)
-        return all_components
-
-    add_btn = gr.Button(f"Add {section_name}")
-    add_btn.click(lambda x: x + 1, count_state, count_state)
+    # Fixed number of rows - simple but reliable approach
+    MAX_ROWS = 5
+
+    # Create field states for compatibility with existing code
+    field_states = [gr.State([]) for _ in fields_config]
+
+    # Initialize field states with empty values for all possible rows
+    for field_state in field_states:
+        field_state.value = [""] * MAX_ROWS
+
+    # Create all rows upfront (some hidden initially)
+    all_components = []
+    all_field_components = []  # Store all field components for event binding
+
+    for row_idx in range(MAX_ROWS):
+        # Use accordion instead of Group for better visibility control
+        # Show only initial_count rows at the beginning
+        is_visible = row_idx < initial_count
+
+        # Use accordion that's open for visible rows
+        with gr.Accordion(f"{section_name} {row_idx + 1}", open=is_visible, visible=is_visible) as group:
+            with (gr.Row() if layout == "row" else gr.Column()):
+                row_components = []
+
+                for field_idx, config in enumerate(fields_config):
+                    # Create component
+                    component = config["type"](
+                        label=f"{config['label']} ({section_name} {row_idx + 1})",
+                        info=config.get("info", ""),
+                        value=config.get("value", ""),
+                        **config.get("kwargs", {}),
+                        elem_classes=config.get("elem_classes", "")
+                    )
+                    row_components.append(component)
+
+                    # Store component and indices for later event binding
+                    all_field_components.append(
+                        (component, field_idx, row_idx))
+
+                # Add remove button for this row
+                remove_btn = gr.Button(
+                    "❌ Remove", variant="secondary", size="sm", visible=True)
+                row_components.append(remove_btn)
+
+        all_components.append((group, row_components))
+
+    # Visibility state
+    visible_count = gr.State(initial_count)
+
+    # Add button
+    add_btn = gr.Button(f"Add {section_name}")
+
+    def handle_add(current_count):
+        """Show one more row if available"""
+        new_count = min(current_count + 1, MAX_ROWS)
+
+        # Update visibility for all groups
+        visibility_updates = []
+        for i in range(MAX_ROWS):
+            # For accordion, we need to update both visible and open states
+            visibility_updates.append(
+                gr.update(visible=(i < new_count), open=(i < new_count)))
+
+        return new_count, *visibility_updates
+
+    def handle_remove(current_count):
+        """Hide the last visible row"""
+        new_count = max(current_count - 1,
+                        1)  # Always keep at least 1 row visible
+
+        # Update visibility for all groups
+        visibility_updates = []
+        for i in range(MAX_ROWS):
+            # For accordion, we need to update both visible and open states
+            visibility_updates.append(
+                gr.update(visible=(i < new_count), open=(i < new_count)))
+
+        return new_count, *visibility_updates
+
+    # Connect add button
+    group_outputs = [group for group, _ in all_components]
+    add_btn.click(
+        fn=handle_add,
+        inputs=[visible_count],
+        outputs=[visible_count] + group_outputs
+    )
+
+    # Connect remove buttons for each row
+    for row_idx, (group, row_components) in enumerate(all_components):
+        remove_btn = row_components[-1]  # Remove button is the last component
+        remove_btn.click(
+            fn=handle_remove,
+            inputs=[visible_count],
+            outputs=[visible_count] + group_outputs
+        )
+
+    # Force initial visibility on interface load
+    def force_initial_visibility():
+        """Force initial visibility when the interface loads"""
+        visibility_updates = []
+        for i in range(MAX_ROWS):
+            visibility_updates.append(
+                gr.update(visible=(i < initial_count), open=(i < initial_count)))
+        return visibility_updates
+
+    # Create a simple info display
+    info_display = gr.Markdown(f"**{section_name}** (Max {MAX_ROWS} items)")
+
+    # Dummy count state for compatibility
+    count_state = gr.State(initial_count)
+
+    # Apply initial visibility immediately after component creation
+    if initial_count > 0:
+        # Use app load event to ensure visibility
+        for i, (group, _) in enumerate(all_components):
+            if i < initial_count:
+                group.visible = True
+                group.open = True
+
+    # Store the actual components to return instead of gr.State
+    components_to_return = []
+    for field_idx in range(len(fields_config)):
+        field_components = []
+        for row_idx in range(MAX_ROWS):
+            # Find the component for this field and row
+            for component, f_idx, r_idx in all_field_components:
+                if f_idx == field_idx and r_idx == row_idx:
+                    field_components.append(component)
+                    break
+        components_to_return.append(field_components)
+
+    return (count_state, *components_to_return, add_btn)
@@ -141,9 +167,13 @@ def create_header_tab():
     formatVersionSpecificationUri = gr.Textbox(
         label="Format Version Specification URI", info="(the URI of the present specification of this set of schemas)")
     reportId = gr.Textbox(
-        label="Report ID", info="(the unique identifier of this report, preferably as a uuid4 string)")
+        label="Report ID", info="(the unique identifier of this report, preferably as a uuid4 string)", value=str(uuid.uuid4()))
     reportDatetime = gr.Textbox(
-        label="Report Datetime",
+        label="Report Datetime",
+        info="Required field<br>(the publishing date of this report in format YYYY-MM-DD HH:MM:SS)",
+        elem_classes="mandatory_field",
+        value=datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
+    )
     reportStatus = gr.Dropdown(value=None,
                                label="Report Status",
                                choices=REPORT_STATUS_OPTIONS,
@@ -259,7 +289,7 @@ def create_task_tab():
                 "info": "(the type of quantization used : fp32, fp16, b16, int8 ...)",
             }
         ],
-        initial_count=
+        initial_count=1,
         layout="column"
     )
@@ -323,7 +353,7 @@ def create_task_tab():
                 "info": "(the owner of the dataset if available)",
             }
         ],
-        initial_count=
+        initial_count=1,
         layout="column"
     )
@@ -421,7 +451,7 @@ def create_measures_tab():
                 "info": "(the date when the measurement began, in format YYYY-MM-DD HH:MM:SS)",
             }
         ],
-        initial_count=
+        initial_count=1,
         layout="column"
    )
@@ -520,7 +550,7 @@ def create_infrastructure_tab():
                 "info": "(the percentage of the physical equipment used by the task, this sharing property should be set to 1 by default (if no share) and otherwise to the correct percentage, e.g. 0.5 if you share half-time.)",
             }
         ],
-        initial_count=
+        initial_count=1,
         layout="column"
     )
```
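`handle_add` and `handle_remove` only clamp a counter between 1 and `MAX_ROWS` and recompute per-row visibility, so that logic can be checked without a running Gradio app. This sketch mirrors the two handlers, substituting plain booleans for the `gr.update(visible=..., open=...)` objects:

```python
MAX_ROWS = 5  # same fixed row pool as in create_dynamic_section


def handle_add(current_count: int):
    # Show one more row, never exceeding MAX_ROWS
    new_count = min(current_count + 1, MAX_ROWS)
    return new_count, [i < new_count for i in range(MAX_ROWS)]


def handle_remove(current_count: int):
    # Hide the last visible row, always keeping at least one
    new_count = max(current_count - 1, 1)
    return new_count, [i < new_count for i in range(MAX_ROWS)]


count, visible = handle_add(1)
print(count, visible)     # 2 [True, True, False, False, False]
count, visible = handle_add(5)
print(count, visible)     # 5 [True, True, True, True, True]
count, visible = handle_remove(1)
print(count, visible)     # 1 [True, False, False, False, False]
```

Because the counter is clamped at both ends, clicking "Add" past five rows or "Remove" past one row is a no-op, which is what keeps the fixed pool of accordions consistent with the stored field states.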