# Object detection STM32 model zoo

Remember that minimalistic yaml files are available [here](../config_file_examples_pt/) to play with specific services, and that all pre-trained models in the [STM32 model zoo](https://github.com/STMicroelectronics/stm32ai-modelzoo/) are provided with the configuration .yaml file used to generate them. These are very good starting points for your own experiments.

## Table of contents

1. [Object detection Model Zoo introduction](#1)
   - [1.1 Datasets](#1-1)
   - [1.2 Models](#1-2)
2. [Object detection tutorial](#2)
   - [2.1 Choose the operation mode](#2-1)
   - [2.2 Global settings](#2-2)
   - [2.3 Model settings](#2-3)
   - [2.4 Dataset specification](#2-4)
   - [2.5 Use data augmentation](#2-5)
   - [2.6 Set the training parameters](#2-6)
   - [2.7 Apply image preprocessing](#2-7)
   - [2.8 Set the postprocessing parameters](#2-8)
   - [2.9 Model quantization](#2-9)
   - [2.10 Benchmark the model](#2-10)
   - [2.11 Deploy the model](#2-11)
   - [2.12 Hydra and MLflow settings](#2-12)
3. [Run the object detection chained service](#3)
4. [Visualize the chained services results](#4)
   - [4.1 Saved results](#4-1)
   - [4.2 Run tensorboard](#4-2)
   - [4.3 Run ClearML](#4-3)
   - [4.4 Run MLFlow](#4-4)
5. [Appendix A: YAML syntax](#A)

## 1. Object detection Model Zoo introduction

The object detection model zoo for Torch provides a collection of independent services and pre-built chained services for object detection. Each individual service performs a single task, such as training, evaluation, or quantization of the model, while the chained services combine several services to perform more complex functions, such as training the model, quantizing it, and evaluating the quantized model in succession.

To use the services in the object detection model zoo, run the model zoo script [stm32ai_main.py](../stm32ai_main.py) with the [user_config_pt.yaml](../user_config_pt.yaml) file as input. The yaml file specifies the service or the chained services to run, together with a set of configuration parameters such as the model (either from the model zoo or your own custom model), the dataset, the number of epochs, and the preprocessing parameters, among others.

More information about the different services and their configuration options can be found in the next section.

Datasets are primarily expected in MS-COCO format. The supported datasets are:

1. COCO (supported by ST_YOLOD models)
2. COCO and Pascal VOC (supported by SSD models)

## 2. Object detection tutorial

This tutorial demonstrates how to use the `chain_tqeb` services to train, quantize, evaluate, and benchmark the model.

To get started, you will need to update the [user_config_pt.yaml](../user_config_pt.yaml) file, which specifies the parameters and configuration options for the services that you want to use. Each section of the [user_config_pt.yaml](../user_config_pt.yaml) file is explained in detail in the following sections.

## 3. Run the object detection chained service

After updating the [user_config_pt.yaml](../user_config_pt.yaml) file, please run the following command:

```bash
cd stm32ai-modelzoo-services/object_detection
python ./stm32ai_main.py --config-path ./ --config-name user_config_pt.yaml
```

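
If you prefer to start the run from another Python script, for example to launch several configurations one after the other, the command above can be wrapped with `subprocess`. The sketch below is only an illustration: the helper names `build_command` and `run_model_zoo` are hypothetical and not part of the Model Zoo. Only the script name and the `--config-path` and `--config-name` options are taken from the command above.

```python
import subprocess
import sys


def build_command(config_path="./", config_name="user_config_pt.yaml"):
    """Build the argument list for the command shown above (hypothetical helper)."""
    return [
        sys.executable,            # the Python interpreter running this script
        "stm32ai_main.py",
        "--config-path", config_path,
        "--config-name", config_name,
    ]


def run_model_zoo(workdir, config_path="./", config_name="user_config_pt.yaml"):
    """Run the Model Zoo script from workdir and raise if it exits with an error."""
    command = build_command(config_path, config_name)
    # cwd plays the role of the `cd` step in the shell version.
    return subprocess.run(command, cwd=workdir, check=True)


# Example call (assumes the repository was cloned in the current directory):
# run_model_zoo("stm32ai-modelzoo-services/object_detection")
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` makes a failed run raise `subprocess.CalledProcessError` instead of passing silently.
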
## 4. Visualize the chained services results

Every time you run the Model Zoo, an experiment directory is created that contains all the directories and files created during the run. The names of experiment directories are all unique as they are based on the date and time of the run.

Experiment directories are managed using the Hydra Python package. Refer to [Hydra Home](https://hydra.cc/) for more information about this package.

By default, all the experiment directories are under the /object_detection/src/experiments_outputs directory and their names follow the "%Y_%m_%d_%H_%M_%S" pattern.

This is illustrated in the figure below.

```
experiments_outputs
  |
  +--- mlruns                  (MLflow files)
  |
  +--- (one directory per run, named after the date and time of the run)
         |
         +--- stm32ai_main.log
         |
         +--- .hydra/
         |      config.yaml
         |      hydra.yaml
         |      overrides.yaml
         |
         +--- (subdirectory)/
                tensorboard/
                best_ckpt.pth
                epoch_1_ckpt.pth
                epoch_2_ckpt.pth
                ...
                quantized_models/
                   _infer.onnx
                   _quant_qdq_pc.onnx
```

The file named 'stm32ai_main.log' under each experiment directory is the log file saved during the execution of the 'stm32ai_main.py' script. The contents of the other files saved under an experiment directory are described in the table below.

| File/Directory        | Location          | Contents               |
|:----------------------|:------------------|:-----------------------|
| best_ckpt.pth         | /                 | Best model checkpoint saved during training (PyTorch) |
| epoch_X_ckpt.pth      | /                 | Model checkpoint saved at epoch X (PyTorch) |
| tensorboard/          | /                 | TensorBoard event files for training visualization |
| _infer.onnx           | quantized_models/ | Quantized ONNX model for inference |
| _quant_qdq_pc.onnx    | quantized_models/ | Quantized ONNX model with QDQ (Quantize-Dequantize) operators |
| config.yaml           | .hydra/           | Main configuration file used for the experiment |
| hydra.yaml            | .hydra/           | Hydra framework configuration |
| overrides.yaml        | .hydra/           | Configuration overrides applied to this run |

All the directory names, including the naming pattern of experiment directories, can be changed using the configuration file. The names of the files cannot be changed.

The PyTorch checkpoint files 'best_ckpt.pth' and 'epoch_X_ckpt.pth' can be used to resume a training run that was interrupted or that crashed. This is explained in the training service [README](./README_TRAINING.md). The quantized ONNX files in the 'quantized_models/' directory are ready for deployment on STM32 devices.

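
Because experiment directory names follow the "%Y_%m_%d_%H_%M_%S" pattern by default, the most recent run can be located programmatically. The sketch below is a minimal illustration and not part of the Model Zoo: the function names `latest_experiment_dir` and `list_artifacts` are hypothetical, it assumes the default naming pattern has not been changed in the configuration file, and it searches recursively for the checkpoint and ONNX files so that it does not depend on their exact location.

```python
from datetime import datetime
from pathlib import Path

# Default naming pattern of experiment directories, as stated above.
EXPERIMENT_DIR_FORMAT = "%Y_%m_%d_%H_%M_%S"


def latest_experiment_dir(outputs_root):
    """Return the most recent experiment directory under outputs_root, or None."""
    candidates = []
    for entry in Path(outputs_root).iterdir():
        if not entry.is_dir():
            continue
        try:
            stamp = datetime.strptime(entry.name, EXPERIMENT_DIR_FORMAT)
        except ValueError:
            continue  # e.g. mlruns, which does not follow the pattern
        candidates.append((stamp, entry))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def list_artifacts(experiment_dir):
    """Collect checkpoint and ONNX files anywhere below the experiment directory."""
    root = Path(experiment_dir)
    return {
        "checkpoints": sorted(root.rglob("*_ckpt.pth")),
        "onnx_models": sorted(root.rglob("*.onnx")),
    }


# Example call (the path is an assumption about where you run the script from):
# run_dir = latest_experiment_dir("src/experiments_outputs")
# print(list_artifacts(run_dir))
```

Parsing the names with `datetime.strptime` rather than sorting them as plain strings also filters out directories such as `mlruns` that do not belong to a run.
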
### 4.1 Saved results

All of the training and evaluation artifacts are saved in the experiment directory of the current run, which is located under **experiments_outputs/**.

For example, you can retrieve the confusion matrix generated after evaluating the float and the quantized model on the test set by navigating to the appropriate directory within that experiment directory.

### 4.2 Run tensorboard

To visualize the training curves that were logged by TensorBoard, navigate to the experiment directory of the run under **experiments_outputs/** and run the following command:

```bash
tensorboard --logdir logs
```

This will start a server and its address will be displayed. Use this address in a web browser to connect to the server. Then, using the web browser, you will be able to explore the learning curves and other training metrics.

### 4.3 Run ClearML

ClearML is an open-source tool used for logging and tracking machine learning experiments. It allows you to record metrics, parameters, and results, making it easier to monitor and compare different runs.

Follow these steps to configure ClearML for logging your results. This setup only needs to be done once. If you haven't set it up yet, complete the steps below. If you've already configured ClearML, your results should be automatically logged and available in your session.

- Sign up for free to the [ClearML Hosted Service](https://app.clear.ml), then go to your ClearML workspace and create new credentials.
- Create a `clearml.conf` file and paste the credentials into it. If you are behind a proxy or using SSL portals, add `verify_certificate = False` to the configuration to make it work. Here is an example of what your `clearml.conf` file might look like:

```ini
api {
    web_server: https://app.clear.ml
    api_server: https://api.clear.ml
    files_server: https://files.clear.ml
    # Add this line if you are behind a proxy or using SSL portals
    verify_certificate = False
    credentials {
        "access_key" = "YOUR_ACCESS_KEY"
        "secret_key" = "YOUR_SECRET_KEY"
    }
}
```

Once configured, your experiments will be logged directly and shown in the project section under the name of your project.

### 4.4 Run MLflow

MLflow is an API that allows you to log parameters, code versions, metrics, and artifacts while running machine learning code, and provides a way to visualize the results.

To view and examine the results of multiple trainings, you can navigate to the **experiments_outputs** directory and access the MLflow Webapp by running the following command:

```bash
mlflow ui
```

This will start a server and its address will be displayed. Use this address in a web browser to connect to the server. Then, using the web browser, you will be able to navigate the different experiment directories and look at the metrics that were collected. Refer to [MLflow Home](https://mlflow.org/) for more information about MLflow.

## 5. Appendix A: YAML syntax

**Example and terminology:**

An example of YAML code is shown below.

```yaml
preprocessing:
  rescaling:
    scale: 1/127.5
    offset: -1
  resizing:
    aspect_ratio: fit
    interpolation: nearest
```

The code consists of a number of nested "key-value" pairs. The colon character is used as a separator between the key and the value.

Indentation is how YAML denotes nesting. The specification forbids tabs because tools treat them differently. A common practice is to use 2 or 3 spaces, but you can use any number of them as long as it is consistent within a given level of nesting.

We use "attribute-value" instead of "key-value" as in the YAML terminology, the term "attribute" being more relevant to our application. We may use the term "attribute" or "section" for nested attribute-value pair constructs. In the example above, we may refer to "preprocessing" either as an attribute (whose value is a list of nested constructs) or as a section.

**Comments:**

Comments begin with a pound sign. They can appear after an attribute value or take up an entire line.

```yaml
preprocessing:
  rescaling:
    scale: 1/127.5   # This is a comment.
    offset: -1
  resizing:
    # This is a comment.
    aspect_ratio: fit
    interpolation: nearest
  color_mode: rgb
```

**Attributes with no value:**

The YAML language supports attributes with no value. The code below shows the alternative syntaxes you can use for such attributes.

```yaml
attribute_1:
attribute_2: ~
attribute_3: null
attribute_4: None     # Model Zoo extension
```

The value *None* is a Model Zoo extension that was made because it is intuitive to Python users.

Attributes with no value can be useful to list in the configuration file all the attributes that are available in a given section and explicitly show which ones were not used.

**Strings:**

You can enclose strings in single or double quotes. However, unless the string contains special YAML characters, you don't need to use quotes.

This syntax:

```yaml
resizing:
  aspect_ratio: fit
  interpolation: nearest
```

is equivalent to this one:

```yaml
resizing:
  aspect_ratio: "fit"
  interpolation: "nearest"
```

**Strings with special characters:**

If a string value includes YAML special characters, you need to enclose it in single or double quotes. In the example below, the string includes the ',' character, so quotes are required.

```yaml
name: "Pepper,_bell___Bacterial_spot"
```

**Strings spanning multiple lines:**

You can write long strings on multiple lines for better readability. This can be done using the '|' (pipe) continuation character as shown in the example below.

This syntax:

```yaml
LearningRateScheduler:
  schedule: |
    lambda epoch, lr:
        (0.0005*epoch + 0.00001) if epoch < 20 else
        (0.01 if epoch < 50 else
        (lr / (1 + 0.0005 * epoch)))
```

is equivalent to this one:

```yaml
LearningRateScheduler:
  schedule: "lambda epoch, lr: (0.0005*epoch + 0.00001) if epoch < 20 else (0.01 if epoch < 50 else (lr / (1 + 0.0005 * epoch)))"
```

Note that when using the first syntax, strings that contain YAML special characters don't need to be enclosed in quotes. In the example above, the string includes the ',' character.

**Booleans:**

The syntaxes you can use for boolean values are shown below. Supported values have been extended to *True* and *False* in the Model Zoo as they are intuitive to Python users.

```yaml
# YAML native syntax
attribute_1: true
attribute_2: false

# Model Zoo extensions
attribute_3: True
attribute_4: False
```

**Numbers and numerical expressions:**

Attribute values can be integer numbers, floating-point numbers or numerical expressions as shown in the YAML code below.

```yaml
dataset:
  seed: 123
  random_size: [10, 20]   # list
  mosaic_prob: 0.5        # float
  degrees: 10             # integer
rescaling:
  scale: 1/127.5          # Numerical expression, evaluated to 0.00784314
  offset: -1
```

**Lists:**

You can specify lists on a single line or on multiple lines as shown below.

This syntax:

```yaml
class_names: [ aeroplane,bicycle,bird,boat,bottle,bus,car,cat,chair,cow,diningtable,dog,horse,motorbike,person,pottedplant,sheep,sofa,train,tvmonitor ]
```

is equivalent to this one:

```yaml
class_names:
  - aeroplane
  - bicycle
  - bird
  - boat
  - bottle
  ...
```

**Multiple attribute-value pairs on one line:**

Multiple attribute-value pairs can be specified on one line as shown below.

This syntax:

```yaml
rescaling: { scale: 1/127.5, offset: -1 }
```

is equivalent to this one:

```yaml
rescaling:
  scale: 1/127.5
  offset: -1
```

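
To make the Model Zoo extensions described in this appendix more concrete (*None*, *True*/*False*, and numerical expressions such as `1/127.5`), the sketch below shows one way such scalar values could be converted to Python values. It is an illustration written under assumptions and not the Model Zoo's actual parsing code: the function name `normalize_value` is hypothetical, and the sketch only supports the four basic arithmetic operators.

```python
import ast
import operator

# Arithmetic operators accepted in numerical expressions (assumption of this sketch).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    """Evaluate a parsed expression made only of numbers and basic arithmetic."""
    if (isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def normalize_value(raw):
    """Map a scalar string as written in the YAML file to a Python value."""
    if not isinstance(raw, str):
        return raw  # already converted by the YAML loader
    text = raw.strip()
    if text in ("None", "null", "~", ""):
        return None
    if text in ("True", "true"):
        return True
    if text in ("False", "false"):
        return False
    try:
        return _eval_node(ast.parse(text, mode="eval").body)
    except (ValueError, SyntaxError, ZeroDivisionError):
        return raw  # plain string such as "fit" or "nearest"
```

With this sketch, `normalize_value("1/127.5")` yields the float that the comment in the example above rounds to 0.00784314, while strings such as `"fit"` are returned unchanged. Walking the syntax tree instead of calling `eval` keeps arbitrary code in a configuration file from being executed.
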