# Object detection STM32 model zoo

Remember that minimalistic yaml files are available [here](../config_file_examples/) to play with specific services, and that all pre-trained models in the [STM32 model zoo](https://github.com/STMicroelectronics/stm32ai-modelzoo/) are provided with the configuration .yaml file used to generate them. These are very good starting points!

## Table of contents

1. [Object detection Model Zoo introduction](#1)
2. [Object detection tutorial](#2)
   - [2.1 Choose the operation mode](#2-1)
   - [2.2 Global settings](#2-2)
   - [2.3 Model settings](#2-3)
   - [2.4 Dataset specification](#2-4)
   - [2.5 Apply image preprocessing](#2-5)
   - [2.6 Use data augmentation](#2-6)
   - [2.7 Set the training parameters](#2-7)
   - [2.8 Set the postprocessing parameters](#2-8)
   - [2.9 Model quantization](#2-9)
   - [2.10 Benchmark the model](#2-10)
   - [2.11 Deploy the model](#2-11)
   - [2.12 Hydra and MLflow settings](#2-12)
3. [Run the object detection chained service](#3)
4. [Visualize the chained services results](#4)
   - [4.1 Saved results](#4-1)
   - [4.2 Run tensorboard](#4-2)
   - [4.3 Run ClearML](#4-3)
   - [4.4 Run MLFlow](#4-4)
5. [Appendix A: YAML syntax](#A)
## 1. Object detection Model Zoo introduction <a id="1"></a>

The object detection model zoo provides a collection of independent services and pre-built chained services that can be used to perform various functions related to machine learning for object detection. The individual services perform tasks such as training or quantizing the model, while the chained services combine multiple services to perform more complex functions, such as training the model, quantizing it, and evaluating the quantized model successively.

To use the services in the object detection model zoo, run the model zoo [stm32ai_main.py](../stm32ai_main.py) script with the [user_config.yaml](../user_config.yaml) file as input. The yaml file specifies the service or chained service to run and a set of configuration parameters, such as the model (either from the model zoo or your own custom model), the dataset, the number of epochs, and the preprocessing parameters, among others.

More information about the different services and their configuration options can be found in the next section.

The object detection datasets are expected to be in YOLO Darknet TXT format. For this project, we are using the Pascal VOC 2012 dataset, which can be downloaded directly in YOLO Darknet TXT format from [here](https://public.roboflow.com/object-detection/pascal-voc-2012/1/download/darknet).

An example of this structure is shown below:

```yaml
train/:
   train_image_1.jpg
   train_image_1.txt
   train_image_2.jpg
   train_image_2.txt

val/:
   val_image_1.jpg
   val_image_1.txt
   val_image_2.jpg
   val_image_2.txt
```
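For reference, in the YOLO Darknet TXT format each image's `.txt` file contains one line per object: the class index followed by the normalized bounding-box center coordinates and box size (all four values in the [0, 1] range). A hypothetical `train_image_1.txt` annotating two objects might look like:

```
14 0.512 0.430 0.280 0.610
6 0.225 0.740 0.150 0.200
```

Here `14` and `6` are illustrative class indices into the dataset's class list; the remaining values are `x_center y_center width height` relative to the image dimensions.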
## 2. Object detection tutorial <a id="2"></a>

This tutorial demonstrates how to use the `chain_tqeb` chained service to train, quantize, evaluate, and benchmark a model. To get started, you will need to update the [user_config.yaml](../user_config.yaml) file, which specifies the parameters and configuration options for the services that you want to use. Each section of the [user_config.yaml](../user_config.yaml) file is explained in detail in the following sections.
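As a sketch of what such a configuration might contain (the attribute values below are illustrative, not an authoritative schema — refer to the bundled [user_config.yaml](../user_config.yaml) for the actual attribute names and structure), a chained-service configuration typically specifies the operation mode, the dataset paths, and the training parameters:

```yaml
# Illustrative fragment only -- see user_config.yaml for the full schema.
operation_mode: chain_tqeb

dataset:
  name: pascal_voc_2012            # hypothetical values
  training_path: ./datasets/train
  validation_path: ./datasets/val

training:
  epochs: 100
  batch_size: 64
```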
## 3. Run the object detection chained service <a id="3"></a>

After updating the [user_config.yaml](../user_config.yaml) file, run the following command:

```bash
python stm32ai_main.py
```

Note that you can provide YAML attributes as arguments in the command, as shown below:

```bash
python stm32ai_main.py operation_mode='chain_eb'
```
## 4. Visualize the chained services results <a id="4"></a>

Every time you run the Model Zoo, an experiment directory is created that contains all the directories and files created during the run. The names of experiment directories are all unique as they are based on the date and time of the run. Experiment directories are managed using the Hydra Python package. Refer to [Hydra Home](https://hydra.cc/) for more information about this package.

By default, all the experiment directories are under the /object_detection/src/experiments_outputs directory and their names follow the "%Y_%m_%d_%H_%M_%S" pattern. This is illustrated in the figure below.

```
experiments_outputs
├── mlruns                                               MLflow files
└── <%Y_%m_%d_%H_%M_%S>
    ├── stm32ai_main.log
    ├── training_metrics.csv
    ├── training_curves.png
    ├── float_model_confusion_matrix_validation_set.png
    ├── saved_models
    │   ├── best_augmented_model.keras
    │   ├── last_augmented_model.keras
    │   └── best_model.keras
    ├── quantized_models
    │   └── quantized_model.tflite
    ├── logs                                             TensorBoard files
    └── .hydra                                           Hydra files
```

The file named 'stm32ai_main.log' under each experiment directory is the log file saved during the execution of the 'stm32ai_main.py' script. The contents of the other files saved under an experiment directory are described in the table below.
| File | Directory | Contents |
|:-------------------|:-------------------------|:-----------------------|
| best_augmented_model.keras | saved_models | Best model saved during training, rescaling and data augmentation layers included (Keras) |
| last_augmented_model.keras | saved_models | Last model saved at the end of a training, rescaling and data augmentation layers included (Keras) |
| best_model.keras | saved_models | Best model obtained at the end of a training (Keras) |
| quantized_model.tflite | quantized_models | Quantized model (TFlite) |
| training_metrics.csv | metrics | Training metrics CSV including epochs, losses, accuracies and learning rate |
| training_curves.png | metrics | Training learning curves (losses and accuracies) |
| float_model_confusion_matrix_test_set.png | metrics | Float model confusion matrix |
| quantized_model_confusion_matrix_test_set.png | metrics | Quantized model confusion matrix |

All the directory names, including the naming pattern of experiment directories, can be changed using the configuration file. The names of the files cannot be changed.

The models in the 'best_augmented_model.keras' and 'last_augmented_model.keras' Keras files contain rescaling and data augmentation layers. These files can be used to resume a training that you interrupted or that crashed. This is explained in the training service [README](./README_TRAINING.md). These model files are not intended to be used outside of the Model Zoo context.
### 4.1 Saved results <a id="4-1"></a>

All of the training and evaluation artifacts are saved in the current run's output directory, which is located under **experiments_outputs/** in the date-and-time-named experiment directory. For example, you can retrieve the confusion matrices generated after evaluating the float and the quantized models on the test set by navigating to the appropriate directory within **experiments_outputs/**.
### 4.2 Run tensorboard <a id="4-2"></a>

To visualize the training curves that were logged by TensorBoard, navigate to the experiment directory under **experiments_outputs/** and run the following command:

```bash
tensorboard --logdir logs
```

This will start a server and its address will be displayed. Use this address in a web browser to connect to the server. Then, using the web browser, you will be able to explore the learning curves and other training metrics.
### 4.3 Run ClearML <a id="4-3"></a>

ClearML is an open-source tool used for logging and tracking machine learning experiments. It allows you to record metrics, parameters, and results, making it easier to monitor and compare different runs.

Follow these steps to configure ClearML for logging your results. This setup only needs to be done once. If you haven't set it up yet, complete the steps below. If you've already configured ClearML, your results should be automatically logged and available in your session.

- Sign up for free to the [ClearML Hosted Service](https://app.clear.ml), then go to your ClearML workspace and create new credentials.
- Create a `clearml.conf` file and paste the credentials into it. If you are behind a proxy or using SSL portals, add `verify_certificate = False` to the configuration to make it work. Here is an example of what your `clearml.conf` file might look like:

```ini
api {
    web_server: https://app.clear.ml
    api_server: https://api.clear.ml
    files_server: https://files.clear.ml
    # Add this line if you are behind a proxy or using SSL portals
    verify_certificate = False
    credentials {
        "access_key" = "YOUR_ACCESS_KEY"
        "secret_key" = "YOUR_SECRET_KEY"
    }
}
```

Once configured, your experiments will be logged directly and shown in the project section under the name of your project.
### 4.4 Run MLFlow <a id="4-4"></a>

MLflow is an API that allows you to log parameters, code versions, metrics, and artifacts while running machine learning code, and provides a way to visualize the results.

To view and examine the results of multiple trainings, you can navigate to the **experiments_outputs** directory and access the MLflow Webapp by running the following command:

```bash
mlflow ui
```

This will start a server and its address will be displayed. Use this address in a web browser to connect to the server. Then, using the web browser, you will be able to navigate the different experiment directories and look at the metrics that were collected. Refer to [MLflow Home](https://mlflow.org/) for more information about MLflow.
## Appendix A: YAML syntax <a id="A"></a>

**Example and terminology:**

An example of YAML code is shown below.

```yaml
preprocessing:
   rescaling:
      scale: 1/127.5
      offset: -1
   resizing:
      aspect_ratio: fit
      interpolation: nearest
```

The code consists of a number of nested "key-value" pairs. The colon character is used as a separator between the key and the value. Indentation is how YAML denotes nesting. The specification forbids tabs because tools treat them differently. A common practice is to use 2 or 3 spaces, but you can use any number of them.

We use "attribute-value" instead of "key-value" as in the YAML terminology, the term "attribute" being more relevant to our application. We may use the term "attribute" or "section" for nested attribute-value pair constructs. In the example above, we may indifferently refer to "preprocessing" as an attribute (whose value is a list of nested constructs) or as a section.

**Comments:**

Comments begin with a pound sign. They can appear after an attribute value or take up an entire line.

```yaml
preprocessing:
   rescaling:
      scale: 1/127.5   # This is a comment.
      offset: -1
   resizing:
      # This is a comment.
      aspect_ratio: fit
      interpolation: nearest
   color_mode: rgb
```

**Attributes with no value:**

The YAML language supports attributes with no value. The code below shows the alternative syntaxes you can use for such attributes.

```yaml
attribute_1:
attribute_2: ~
attribute_3: null
attribute_4: None   # Model Zoo extension
```

The value *None* is a Model Zoo extension that was made because it is intuitive to Python users.

Attributes with no value can be useful to list in the configuration file all the attributes that are available in a given section and explicitly show which ones were not used.

**Strings:**

You can enclose strings in single or double quotes. However, unless the string contains special YAML characters, you don't need to use quotes.
This syntax:

```yaml
resizing:
   aspect_ratio: fit
   interpolation: nearest
```

is equivalent to this one:

```yaml
resizing:
   aspect_ratio: "fit"
   interpolation: "nearest"
```

**Strings with special characters:**

If a string value includes YAML special characters, you need to enclose it in single or double quotes. In the example below, the string includes the ',' character, so quotes are required.

```yaml
name: "Pepper,_bell___Bacterial_spot"
```

**Strings spanning multiple lines:**

You can write long strings on multiple lines for better readability. This can be done using the '|' (pipe) continuation character as shown in the example below.

This syntax:

```yaml
LearningRateScheduler:
   schedule: |
      lambda epoch, lr:
         (0.0005*epoch + 0.00001) if epoch < 20
         else (0.01 if epoch < 50
         else (lr / (1 + 0.0005 * epoch)))
```

is equivalent to this one:

```yaml
LearningRateScheduler:
   schedule: "lambda epoch, lr: (0.0005*epoch + 0.00001) if epoch < 20 else (0.01 if epoch < 50 else (lr / (1 + 0.0005 * epoch)))"
```

Note that when using the first syntax, strings that contain YAML special characters don't need to be enclosed in quotes. In the example above, the string includes the ',' character.

**Booleans:**

The syntaxes you can use for boolean values are shown below. Supported values have been extended to *True* and *False* in the Model Zoo as they are intuitive to Python users.

```yaml
# YAML native syntax
attribute_1: true
attribute_2: false

# Model Zoo extensions
attribute_3: True
attribute_4: False
```

**Numbers and numerical expressions:**

Attribute values can be integer numbers, floating-point numbers or numerical expressions as shown in the YAML code below.

```yaml
ReduceLROnPlateau:
   patience: 10      # Integer value
   factor: 0.1       # Floating-point value
   min_lr: 1e-6      # Floating-point value, exponential notation

rescaling:
   scale: 1/127.5    # Numerical expression, evaluated to 0.00784314
   offset: -1
```

**Lists:**

You can specify lists on a single line or on multiple lines as shown below.
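To illustrate why the numerical expression `1/127.5` is reported as 0.00784314, here is a small Python check. This is only a sketch of the idea; the Model Zoo's actual expression handling may differ:

```python
# Evaluate a numerical-expression string such as one read from the YAML file.
# The environment is emptied of builtins since the string comes from a config file.
scale = eval("1/127.5", {"__builtins__": {}}, {})
print(round(scale, 8))  # 0.00784314
```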
This syntax:

```yaml
class_names: [ aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor ]
```

is equivalent to this one:

```yaml
class_names:
   - aeroplane
   - bicycle
   - bird
   - boat
   - bottle
   ...
```

**Multiple attribute-value pairs on one line:**

Multiple attribute-value pairs can be specified on one line as shown below.

This syntax:

```yaml
rescaling: { scale: 1/127.5, offset: -1 }
```

is equivalent to this one:

```yaml
rescaling:
   scale: 1/127.5
   offset: -1
```