diff --git "a/celestial-mini/Train_TFLite2_Object_Detction_Model.ipynb" "b/celestial-mini/Train_TFLite2_Object_Detction_Model.ipynb" new file mode 100644--- /dev/null +++ "b/celestial-mini/Train_TFLite2_Object_Detction_Model.ipynb" @@ -0,0 +1,1757 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "view-in-github", + "colab_type": "text" + }, + "source": [ + "\"Open" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "fF8ysCfYKgTP" + }, + "source": [ + "# TensorFlow Lite Object Detection API in Colab\n", + "**Author:** Evan Juras, [EJ Technology Consultants](https://ejtech.io)\n", + "\n", + "**Last updated:** 2/13/25\n", + "\n", + "**GitHub:** [TensorFlow Lite Object Detection](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi)\n", + "\n", + "# Introduction\n", + "\n", + "This notebook uses [the TensorFlow 2 Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection) to train an SSD-MobileNet model or EfficientDet model with a custom dataset and convert it to TensorFlow Lite format. By working through this Colab, you'll be able to create and download a TFLite model that you can run on your PC, an Android phone, or an edge device like the Raspberry Pi.\n", + "\n", + "> **WARNING:** Google deprecated the TensorFlow Object Detection API over two years ago. For the sake of legacy code, I've kept this training notebook on life support through various hacks and band-aid fixes, and it is prone to stop working at any point. **I will not be providing further support for this video or training notebook.**\n", + "\n", + "> I highly recommend using the newer PyTorch-based Ultralytics YOLO models for object detection. They perform better and they're easier to work with. See my video tutorial on how to train YOLO detection models here: [How to Train YOLO Object Detection Models in Google Colab](https://youtu.be/r0RspiLG260)\n", + "\n", + "

\n", + "
\n", + "Custom SSD-MobileNet-FPNLite model in action!\n", + "

\n", + "\n", + "I also made a YouTube video that walks through this guide step by step. I use a coin detection model as an example for the video. I recommend following along with the video while working through this notebook.\n", + "\n", + "

\n", + "
\n", + "Click here to go to the video!
\n", + "

\n", + "\n", + "**Important note: This notebook will be continuously updated to make sure it works with newer versions of TensorFlow. If you see any differences between the YouTube video and this notebook, always follow the notebook!**\n", + "\n", + "### Working in Colab\n", + "Colab provides a virtual machine in your browser complete with a Linux OS, filesystem, Python environment, and best of all, a free GPU. It comes with most TensorFlow backend requirements (like CUDA and cuDNN) pre-installed. Simply click the play button on sections of code in this notebook to execute them on the virtual machine.\n", + "\n", + "> *Note: Make sure you're using a GPU-equipped machine by going to \"Runtime\" -> \"Change runtime type\" in the top menu bar, and then selecting \"GPU\" from the Hardware accelerator dropdown.*\n", + "\n", + "This Colab notebook uses TensorFlow 2. If you'd like to use TensorFlow 1, please see my [TF1 Colab notebook](https://colab.research.google.com/github/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/Train_TFLite1_Object_Detection_Model.ipynb).\n", + "\n", + "### Navigation\n", + "This is a long notebook! Each step of the training process has its own section. Click the arrow next to the heading for each section to expand it. You can use the table of contents in the left sidebar to jump from section to section." + ] + }, + { + "cell_type": "markdown", + "source": [ + "### Fallback Runtime\n", + "\n", + "**IMPORTANT NOTE as of February 13, 2025:** Colab recently upgraded its base version of Python from 3.10 to 3.11, which broke compatibility with this notebook. For now, we can use the \"fallback runtime\". Please do the following:\n", + "\n", + "1. Initialize the Notebook by clicking \"Connect\" in the top right corner\n", + "2. In the top menu bar, go to Tools -> Command Palette. Search for \"fallback runtime\" and select \"Use fallback runtime version\".\n", + "3. 
Continue working through the notebook as normal.\n", + "\n", + "![image.png]()" + ], + "metadata": { + "id": "OpveOOw5P6DG" + } + }, + { + "cell_type": "markdown", + "source": [ + "# 1. Gather and Label Training Images" + ], + "metadata": { + "id": "4VAvZo8qE4u5" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ag0qD4XiBDcz" + }, + "source": [ + "Before we start training, we need to gather and label images that will be used for training the object detection model. A good starting point for a proof-of-concept model is 200 images. The training images should have random objects in the image along with the desired objects, and should have a variety of backgrounds and lighting conditions.\n", + "\n", + "Watch the YouTube video below for instructions and tips on how to gather and label images for training an object detection model.\n", + "\n", + "

\n", + "
\n", + "Watch this video to learn how to capture and label images.
\n", + "

\n", + "\n", + "When you've finished gathering and labeling images, you should have a folder full of images and a corresponding .xml annotation file for each image. An example of a labeled image and the image folder for my coin detector model are shown below.\n", + "\n", + "![](https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/doc/labeled_image_example2.png)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 2. Install TensorFlow Object Detection Dependencies" + ], + "metadata": { + "id": "sxb8_h-QFErO" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "l7EOtpvlLeS0" + }, + "source": [ + "First, we'll install the TensorFlow Object Detection API in this Google Colab instance. This requires cloning the [TensorFlow models repository](https://github.com/tensorflow/models) and running a couple of installation commands. Click the play button to run the following sections of code.\n", + "\n", + "The latest version of TensorFlow this Colab has been verified to work with is TF v2.8.0.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ypWGYdPlLRUN" + }, + "outputs": [], + "source": [ + "# Clone the tensorflow models repository from GitHub\n", + "!pip uninstall Cython -y # Temporary fix for \"No module named 'object_detection'\" error\n", + "!git clone --depth 1 https://github.com/tensorflow/models" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "6QPmVBSlLTzM" + }, + "outputs": [], + "source": [ + "# Compile the Object Detection API protos (setup.py is written into models/research by the next cell)\n", + "%%bash\n", + "cd models/research/\n", + "protoc object_detection/protos/*.proto --python_out=.\n", + "#cp object_detection/packages/tf2/setup.py ." 
+ ] + }, + { + "cell_type": "code", + "source": [ + "# Modify setup.py file to install the tf-models-official repository targeted at TF v2.8.0\n", + "import re\n", + "with open('/content/models/research/object_detection/packages/tf2/setup.py') as f:\n", + " s = f.read()\n", + "\n", + "with open('/content/models/research/setup.py', 'w') as f:\n", + " # Pin the tf-models-official requirement to v2.8.0\n", + " s = re.sub('tf-models-official>=2.5.1',\n", + " 'tf-models-official==2.8.0', s)\n", + " f.write(s)" + ], + "metadata": { + "id": "NRBnuCKjM4Bd" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "When the following code block runs, a window may appear asking to restart the session.\n", + "\n", + "**MAKE SURE THE CODE BLOCK FINISHES EXECUTING BEFORE CLICKING \"RESTART SESSION\".**" + ], + "metadata": { + "id": "YvzjdFaNR7bN" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "OLDnCkLLwLr6" + }, + "outputs": [], + "source": [ + "# Install the Object Detection API (NOTE: This block takes about 10 minutes to finish executing)\n", + "\n", + "# Need to do a temporary fix with PyYAML because Colab isn't able to install PyYAML v5.4.1\n", + "!pip install pyyaml==5.3\n", + "!pip install /content/models/research/\n", + "\n", + "# Need to downgrade to TF v2.8.0 due to Colab compatibility bug with TF v2.10 (as of 10/03/22)\n", + "!pip install tensorflow==2.8.0\n", + "!pip install tensorflow_io==0.23.1" + ] + }, + { + "cell_type": "code", + "source": [ + "# Install CUDA version 11.0 (to maintain compatibility with TF v2.8.0)\n", + "!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin\n", + "!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600\n", + "!wget http://developer.download.nvidia.com/compute/cuda/11.0.2/local_installers/cuda-repo-ubuntu1804-11-0-local_11.0.2-450.51.05-1_amd64.deb\n", + "!dpkg -i 
cuda-repo-ubuntu1804-11-0-local_11.0.2-450.51.05-1_amd64.deb\n", + "!apt-key add /var/cuda-repo-ubuntu1804-11-0-local/7fa2af80.pub\n", + "!apt-get update && sudo apt-get install cuda-toolkit-11-0\n", + "!export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH" + ], + "metadata": { + "id": "dlc6mZZlH0yg" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6V7TrfUos-9E" + }, + "source": [ + "You may get warnings or errors related to package dependencies in the previous code block, but you can ignore them for now.\n", + "\n", + "Let's test our installation by running `model_builder_tf2_test.py` to make sure everything is working as expected. Run the following code block and confirm that it finishes without errors. If you get errors, try Googling them or checking the FAQ at the end of this Colab." + ] + }, + { + "cell_type": "code", + "source": [ + "# Set protoc and protobuf to correct versions\n", + "!sudo apt-get remove -y protobuf-compiler\n", + "!pip install 'protobuf<=3.20.1' --force-reinstall\n", + "!wget https://github.com/protocolbuffers/protobuf/releases/download/v3.20.1/protoc-3.20.1-linux-x86_64.zip\n", + "!unzip protoc-3.20.1-linux-x86_64.zip -d protoc3\n", + "!sudo mv protoc3/bin/* /usr/local/bin/\n", + "!sudo mv protoc3/include/* /usr/local/include" + ], + "metadata": { + "id": "Osks__H4TWzg" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "wh_HPMOqWH9z" + }, + "outputs": [], + "source": [ + "# Run the Model Builder test file, just to verify everything's working properly\n", + "!python /content/models/research/object_detection/builders/model_builder_tf2_test.py\n" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 3. 
Upload Image Dataset and Prepare Training Data" + ], + "metadata": { + "id": "eydREUsMGUUR" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mSZVCxE4nSVI" + }, + "source": [ + "In this section, we'll upload our data and prepare it for training with TensorFlow. We'll upload our images, split them into train, validation, and test folders, and then run scripts for creating TFRecords from our data.\n", + "\n", + "First, on your local PC, zip all your training images and XML files into a single folder called \"images.zip\". The files should be directly inside the zip folder, or in a nested folder as shown below:\n", + "```\n", + "images.zip\n", + "-- images\n", + " -- img1.jpg\n", + " -- img1.xml\n", + " -- img2.jpg\n", + " -- img2.xml\n", + " ...\n", + "```" + ] + }, + { + "cell_type": "markdown", + "source": [ + "### 3.1 Upload images\n", + "There are three options for moving the image files to this Colab instance." + ], + "metadata": { + "id": "LE1MtX4HGQA4" + } + }, + { + "cell_type": "markdown", + "source": [ + "**Option 1. Upload through Google Colab**\n", + "\n", + "Upload the \"images.zip\" file to the Google Colab instance by clicking the \"Files\" icon on the left hand side of the browser, and then the \"Upload to session storage\" icon. Select the zip folder to upload it.\n", + "\n", + "![](https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/doc/colab_upload_button.png)" + ], + "metadata": { + "id": "sFSJoDEnJotN" + } + }, + { + "cell_type": "markdown", + "source": [ + "**Option 2. Copy from Google Drive**\n", + "\n", + "You can also upload your images to your personal Google Drive, mount the drive on this Colab session, and copy them over to the Colab filesystem. This option works well if you want to upload the images beforehand so you don't have to wait for them to upload each time you restart this Colab. 
If you have more than 50MB worth of images, I recommend using this option.\n", + "\n", + "First, upload the \"images.zip\" file to your Google Drive, and make note of the folder you uploaded them to. Replace `MyDrive/path/to/images.zip` with the path to your zip file. (For example, I uploaded the zip file to folder called \"change-counter1\", so I would use `MyDrive/change-counter1/images.zip` for the path). Then, run the following block of code to mount your Google Drive to this Colab session and copy the folder to this filesystem." + ], + "metadata": { + "id": "hGsPlloAGIXB" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "tLgAPsQsfTLs" + }, + "outputs": [], + "source": [ + "from google.colab import drive\n", + "drive.mount('/content/gdrive')\n", + "\n", + "!cp /content/gdrive/MyDrive/path/to/images.zip /content" + ] + }, + { + "cell_type": "markdown", + "source": [ + "**Option 3. Use coin detection dataset**\n", + "\n", + "If you don't have a dataset and just want to try training a model, you can download my coin image dataset to use as an example. I've uploaded a dataset containing 750 labeled images of pennies, nickels, dimes, and quarters. Run the following code block to download the dataset." + ], + "metadata": { + "id": "9xAJMKwpFilm" + } + }, + { + "cell_type": "code", + "source": [ + "!wget -O /content/images.zip https://www.dropbox.com/s/gk57ec3v8dfuwcp/CoinPics_11NOV22.zip?dl=0 # United States coin images" + ], + "metadata": { + "id": "suu_xPVZIEcH" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CHjOhoSGYwT7" + }, + "source": [ + "## 3.2 Split images into train, validation, and test folders\n", + "At this point, whether you used Option 1, 2, or 3, you should be able to click the folder icon on the left and see your \"images.zip\" file in the list of files. Now that the dataset is uploaded, let's unzip it and create some folders to hold the images. 
These directories are created in the /content folder in this instance's filesystem. You can browse the filesystem by clicking the \"Files\" icon on the left." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "mGvoHH-unSVO" + }, + "outputs": [], + "source": [ + "!mkdir /content/images\n", + "!unzip -q images.zip -d /content/images/all\n", + "!mkdir /content/images/train; mkdir /content/images/validation; mkdir /content/images/test" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "n-6RIcrwbQMh" + }, + "source": [ + "Next, we'll split the images into train, validation, and test sets. Here's what each set is used for:\n", + "\n", + "\n", + "\n", + "* **Train**: These are the actual images used to train the model. In each step of training, a batch of images from the \"train\" set is passed into the neural network. The network predicts classes and locations of objects in the images. The training algorithm calculates the loss (i.e. how \"wrong\" the predictions were) and adjusts the network weights through backpropagation.\n", + "\n", + "\n", + "* **Validation**: Images from the \"validation\" set can be used by the training algorithm to check the progress of training and adjust hyperparameters (like learning rate). Unlike \"train\" images, these images are only used periodically during training (i.e. once every certain number of training steps).\n", + "\n", + "\n", + "* **Test**: These images are never seen by the neural network during training. They are intended to be used by a human to perform final testing of the model to check how accurate the model is.\n", + "\n", + "I wrote a Python script to randomly move 80% of the images to the \"train\" folder, 10% to the \"validation\" folder, and 10% to the \"test\" folder. Click play on the following block to download the script and execute it." 
+ ] + }, + { + "cell_type": "code", + "source": [ + "!wget https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/util_scripts/train_val_test_split.py\n", + "!python train_val_test_split.py" + ], + "metadata": { + "id": "PfuZpmdBLjh-" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p--K1PJXEgNo" + }, + "source": [ + "## 3.3 Create Labelmap and TFRecords\n", + "Finally, we need to create a labelmap for the detector and convert the images into a data file format called TFRecords, which are used by TensorFlow for training. We'll use Python scripts to automatically convert the data into TFRecord format. Before running them, we need to define a labelmap for our classes.\n", + "\n", + "The code section below will create a \"labelmap.txt\" file that contains a list of classes. Replace the `class1`, `class2`, `class3` text with your own classes (for example, `penny`, `nickel`, `dime`, `quarter`), adding a new line for each class. Then, click play to execute the code." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "_DE_r4MKY7ln" + }, + "outputs": [], + "source": [ + "### This creates a \"labelmap.txt\" file with a list of classes the object detection model will detect.\n", + "%%bash\n", + "cat <<EOF >> /content/labelmap.txt\n", + "class1\n", + "class2\n", + "class3\n", + "EOF" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5pa2VYhTIT1l" + }, + "source": [ + "Download and run the data conversion scripts from the [GitHub repository](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi) by clicking play on the following three sections of code. They will create TFRecord files for the train and validation datasets, as well as a `labelmap.pbtxt` file which contains the labelmap in a different format." 
+ ] + }, + { + "cell_type": "code", + "source": [ + "# Download data conversion scripts\n", + "! wget https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/util_scripts/create_csv.py\n", + "! wget https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/util_scripts/create_tfrecord.py" + ], + "metadata": { + "id": "laZZE0TlEeUF" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "5tdDbTmHYwu-" + }, + "outputs": [], + "source": [ + "# Create CSV data files and TFRecord files\n", + "!python3 create_csv.py\n", + "!python3 create_tfrecord.py --csv_input=images/train_labels.csv --labelmap=labelmap.txt --image_dir=images/train --output_path=train.tfrecord\n", + "!python3 create_tfrecord.py --csv_input=images/validation_labels.csv --labelmap=labelmap.txt --image_dir=images/validation --output_path=val.tfrecord" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "RNyv_YyDXwMs" + }, + "source": [ + "We'll store the locations of the TFRecord and labelmap files as variables so we can reference them later in this Colab session." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "YUd2wtfrqedy" + }, + "outputs": [], + "source": [ + "train_record_fname = '/content/train.tfrecord'\n", + "val_record_fname = '/content/val.tfrecord'\n", + "label_map_pbtxt_fname = '/content/labelmap.pbtxt'" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 4. Set Up Training Configuration" + ], + "metadata": { + "id": "eGEUZYAMEZ6f" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "I2MAcgJ53STW" + }, + "source": [ + "In this section, we'll set up the model and training configuration. 
We'll specify which pretrained TensorFlow model we want to use from the [TensorFlow 2 Object Detection Model Zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md). Each model also comes with a configuration file that points to file locations, sets training parameters (such as learning rate and total number of training steps), and more. We'll modify the configuration file for our custom training job.\n", + "\n", + "The first section of code lists some models available in the TF2 Model Zoo and defines some filenames that will be used later to download the model and config file. This makes it easy to manage which model you're using and to add other models to the list later.\n", + "\n", + "Set the \"chosen_model\" variable to match the name of the model you'd like to train with. It's currently set to use the \"ssd-mobilenet-v2-fpnlite-320\" model. Click play on the next block once the chosen model has been set.\n", + "\n", + "Not sure which model to pick? 
[Check out my blog post comparing each model's speed and accuracy.](https://ejtech.io/learn/tflite-object-detection-model-comparison)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "gN0EUEa3e5Un" + }, + "outputs": [], + "source": [ + "# Change the chosen_model variable to deploy different models available in the TF2 object detection zoo\n", + "chosen_model = 'ssd-mobilenet-v2-fpnlite-320'\n", + "\n", + "MODELS_CONFIG = {\n", + " 'ssd-mobilenet-v2': {\n", + " 'model_name': 'ssd_mobilenet_v2_320x320_coco17_tpu-8',\n", + " 'base_pipeline_file': 'ssd_mobilenet_v2_320x320_coco17_tpu-8.config',\n", + " 'pretrained_checkpoint': 'ssd_mobilenet_v2_320x320_coco17_tpu-8.tar.gz',\n", + " },\n", + " 'efficientdet-d0': {\n", + " 'model_name': 'efficientdet_d0_coco17_tpu-32',\n", + " 'base_pipeline_file': 'ssd_efficientdet_d0_512x512_coco17_tpu-8.config',\n", + " 'pretrained_checkpoint': 'efficientdet_d0_coco17_tpu-32.tar.gz',\n", + " },\n", + " 'ssd-mobilenet-v2-fpnlite-320': {\n", + " 'model_name': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8',\n", + " 'base_pipeline_file': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.config',\n", + " 'pretrained_checkpoint': 'ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8.tar.gz',\n", + " },\n", + " # The centernet model isn't working as of 9/10/22\n", + " #'centernet-mobilenet-v2': {\n", + " # 'model_name': 'centernet_mobilenetv2fpn_512x512_coco17_od',\n", + " # 'base_pipeline_file': 'pipeline.config',\n", + " # 'pretrained_checkpoint': 'centernet_mobilenetv2fpn_512x512_coco17_od.tar.gz',\n", + " #}\n", + "}\n", + "\n", + "model_name = MODELS_CONFIG[chosen_model]['model_name']\n", + "pretrained_checkpoint = MODELS_CONFIG[chosen_model]['pretrained_checkpoint']\n", + "base_pipeline_file = MODELS_CONFIG[chosen_model]['base_pipeline_file']" + ] + }, + { + "cell_type": "markdown", + "source": [ + "Download the pretrained model file and configuration file by clicking Play on the following section." 
+ ], + "metadata": { + "id": "JMG3EEPqPggV" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "kG4TmJUVrYQ7" + }, + "outputs": [], + "source": [ + "# Create \"mymodel\" folder for holding pre-trained weights and configuration files\n", + "%mkdir /content/models/mymodel/\n", + "%cd /content/models/mymodel/\n", + "\n", + "# Download pre-trained model weights\n", + "import tarfile\n", + "download_tar = 'http://download.tensorflow.org/models/object_detection/tf2/20200711/' + pretrained_checkpoint\n", + "!wget {download_tar}\n", + "tar = tarfile.open(pretrained_checkpoint)\n", + "tar.extractall()\n", + "tar.close()\n", + "\n", + "# Download training configuration file for model\n", + "download_config = 'https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/configs/tf2/' + base_pipeline_file\n", + "!wget {download_config}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "BFAlqNrPn5y3" + }, + "source": [ + "Now that we've downloaded our model and config file, we need to modify the configuration file with some high-level training parameters. The following variables are used to control training steps:\n", + "\n", + "* **num_steps**: The total amount of steps to use for training the model. A good number to start with is 40,000 steps. You can use more steps if you notice the loss metrics are still decreasing by the time training finishes. The more steps, the longer training will take. Training can also be stopped early if loss flattens out before reaching the specified number of steps.\n", + "* **batch_size**: The number of images to use per training step. A larger batch size allows a model to be trained in fewer steps, but the size is limited by the GPU memory available for training. 
With the GPUs used in Colab instances, 16 is a good number for SSD models and 4 is good for EfficientDet models.\n", + "\n", + "Other training information, like the location of the pretrained model file, the config file, and total number of classes are also assigned in this step. To learn more about training configuration with the TensorFlow Object Detection API, read this [article from Neptune](https://neptune.ai/blog/tensorflow-object-detection-api-best-practices-to-training-evaluation-deployment)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "1lYDvJN-n69v" + }, + "outputs": [], + "source": [ + "# Set training parameters for the model\n", + "num_steps = 40000\n", + "\n", + "if chosen_model == 'efficientdet-d0':\n", + " batch_size = 4\n", + "else:\n", + " batch_size = 16" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "b_ki9jOqxn7V" + }, + "outputs": [], + "source": [ + "# Set file locations and get number of classes for config file\n", + "pipeline_fname = '/content/models/mymodel/' + base_pipeline_file\n", + "fine_tune_checkpoint = '/content/models/mymodel/' + model_name + '/checkpoint/ckpt-0'\n", + "\n", + "def get_num_classes(pbtxt_fname):\n", + " from object_detection.utils import label_map_util\n", + " label_map = label_map_util.load_labelmap(pbtxt_fname)\n", + " categories = label_map_util.convert_label_map_to_categories(\n", + " label_map, max_num_classes=90, use_display_name=True)\n", + " category_index = label_map_util.create_category_index(categories)\n", + " return len(category_index.keys())\n", + "num_classes = get_num_classes(label_map_pbtxt_fname)\n", + "print('Total classes:', num_classes)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cwPyaIAXxyKu" + }, + "source": [ + "Next, we'll rewrite the config file to use the training parameters we just specified. 
The following section of code will automatically replace the necessary parameters in the downloaded .config file and save it as our custom \"pipeline_file.config\" file." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "5eA5ht3_yukT" + }, + "outputs": [], + "source": [ + "# Create custom configuration file by writing the dataset, model checkpoint, and training parameters into the base pipeline file\n", + "import re\n", + "\n", + "%cd /content/models/mymodel\n", + "print('writing custom configuration file')\n", + "\n", + "with open(pipeline_fname) as f:\n", + " s = f.read()\n", + "with open('pipeline_file.config', 'w') as f:\n", + "\n", + " # Set fine_tune_checkpoint path\n", + " s = re.sub('fine_tune_checkpoint: \".*?\"',\n", + " 'fine_tune_checkpoint: \"{}\"'.format(fine_tune_checkpoint), s)\n", + "\n", + " # Set tfrecord files for train and test datasets\n", + " s = re.sub(\n", + " '(input_path: \".*?)(PATH_TO_BE_CONFIGURED/train)(.*?\")', 'input_path: \"{}\"'.format(train_record_fname), s)\n", + " s = re.sub(\n", + " '(input_path: \".*?)(PATH_TO_BE_CONFIGURED/val)(.*?\")', 'input_path: \"{}\"'.format(val_record_fname), s)\n", + "\n", + " # Set label_map_path\n", + " s = re.sub(\n", + " 'label_map_path: \".*?\"', 'label_map_path: \"{}\"'.format(label_map_pbtxt_fname), s)\n", + "\n", + " # Set batch_size\n", + " s = re.sub('batch_size: [0-9]+',\n", + " 'batch_size: {}'.format(batch_size), s)\n", + "\n", + " # Set training steps, num_steps\n", + " s = re.sub('num_steps: [0-9]+',\n", + " 'num_steps: {}'.format(num_steps), s)\n", + "\n", + " # Set number of classes num_classes\n", + " s = re.sub('num_classes: [0-9]+',\n", + " 'num_classes: {}'.format(num_classes), s)\n", + "\n", + " # Change fine-tune checkpoint type from \"classification\" to \"detection\"\n", + " s = re.sub(\n", + " 'fine_tune_checkpoint_type: \"classification\"', 'fine_tune_checkpoint_type: \"{}\"'.format('detection'), s)\n", + "\n", + " # If using 
ssd-mobilenet-v2, reduce learning rate (because it's too high in the default config file)\n", + " if chosen_model == 'ssd-mobilenet-v2':\n", + " s = re.sub('learning_rate_base: .8',\n", + " 'learning_rate_base: .08', s)\n", + "\n", + " s = re.sub('warmup_learning_rate: 0.13333',\n", + " 'warmup_learning_rate: .026666', s)\n", + "\n", + " # If using efficientdet-d0, use fixed_shape_resizer instead of keep_aspect_ratio_resizer (because it isn't supported by TFLite)\n", + " if chosen_model == 'efficientdet-d0':\n", + " s = re.sub('keep_aspect_ratio_resizer', 'fixed_shape_resizer', s)\n", + " s = re.sub('pad_to_max_dimension: true', '', s)\n", + " s = re.sub('min_dimension', 'height', s)\n", + " s = re.sub('max_dimension', 'width', s)\n", + "\n", + " f.write(s)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GDySP7TLzdCM" + }, + "source": [ + "(Optional) If you're curious, you can display the configuration file's contents here in the browser by running the line of code below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "HEsOLOMHzBqF" + }, + "outputs": [], + "source": [ + "# (Optional) Display the custom configuration file's contents\n", + "!cat /content/models/mymodel/pipeline_file.config" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UXpnXYC908Zl" + }, + "source": [ + "Finally, let's set the locations of the configuration file and model output directory as variables so we can reference them when we call the training command." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "GMlaN3rs3zLe" + }, + "outputs": [], + "source": [ + "# Set the path to the custom config file and the directory to store training checkpoints in\n", + "pipeline_file = '/content/models/mymodel/pipeline_file.config'\n", + "model_dir = '/content/training/'" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 5. 
Train Custom TFLite Detection Model" + ], + "metadata": { + "id": "-19zML6oEO7l" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "XxPj_QV43qD5" + }, + "source": [ + "We're ready to train our object detection model! Before we start training, let's load up a TensorBoard session to monitor training progress. Run the following section of code, and a TensorBoard session will appear in the browser. It won't show anything yet, because we haven't started training. Once training starts, come back and click the refresh button to see the model's overall loss.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "TI9iCCxoNlAL" + }, + "outputs": [], + "source": [ + "%load_ext tensorboard\n", + "%tensorboard --logdir '/content/training/train'" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "5cuQpPJL2pUq" + }, + "source": [ + "Model training is performed using the \"model_main_tf2.py\" script from the TF Object Detection API. Training will take anywhere from 2 to 6 hours, depending on the model, batch size, and number of training steps. We've already defined all the parameters and arguments used by `model_main_tf2.py` in previous sections of this Colab. Just click Play on the following block to begin training!\n", + "\n", + "\n", + "\n", + "> *Note: It takes a few minutes for the program to display any training messages, because it only displays logs once every 100 steps. 
If it seems like nothing is happening, just wait a couple minutes.*" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "tQTfZChVzzpZ" + }, + "outputs": [], + "source": [ + "# Run training!\n", + "!python /content/models/research/object_detection/model_main_tf2.py \\\n", + " --pipeline_config_path={pipeline_file} \\\n", + " --model_dir={model_dir} \\\n", + " --alsologtostderr \\\n", + " --num_train_steps={num_steps} \\\n", + " --sample_1_of_n_eval_examples=1" + ] + }, + { + "cell_type": "markdown", + "source": [ + "If you want to stop training early, just click Stop a couple times or right-click on the code block and select \"Interrupt Execution\". Otherwise, training will stop by itself once it reaches the specified number of training steps.\n" + ], + "metadata": { + "id": "WHxbX4ZpzXIv" + } + }, + { + "cell_type": "markdown", + "source": [ + "# 6. Convert Model to TensorFlow Lite" + ], + "metadata": { + "id": "kPg8oMnQDYKl" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "spQXdq8Y63pj" + }, + "source": [ + "Alright! Our model is all trained up and ready to be used for detecting objects. First, we need to export the model graph (a file that contains information about the architecture and weights) to a TensorFlow Lite-compatible format. We'll do this using the `export_tflite_graph_tf2.py` script." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "RaUU8tBlHifd" + }, + "outputs": [], + "source": [ + "# Make a directory to store the trained TFLite model\n", + "!mkdir /content/custom_model_lite\n", + "output_directory = '/content/custom_model_lite'\n", + "\n", + "# Path to training directory (the conversion script automatically chooses the highest checkpoint file)\n", + "last_model_path = '/content/training'\n", + "\n", + "!python /content/models/research/object_detection/export_tflite_graph_tf2.py \\\n", + " --trained_checkpoint_dir {last_model_path} \\\n", + " --output_directory {output_directory} \\\n", + " --pipeline_config_path {pipeline_file}\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "z_NuapO2VROu" + }, + "source": [ + "Next, we'll take the exported graph and use the `TFLiteConverter` module to convert it to `.tflite` FlatBuffer format." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "TsE_uVjlsz3u" + }, + "outputs": [], + "source": [ + "# Convert exported graph file into TFLite model file\n", + "import tensorflow as tf\n", + "\n", + "converter = tf.lite.TFLiteConverter.from_saved_model('/content/custom_model_lite/saved_model')\n", + "tflite_model = converter.convert()\n", + "\n", + "with open('/content/custom_model_lite/detect.tflite', 'wb') as f:\n", + " f.write(tflite_model)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 7. Test TensorFlow Lite Model and Calculate mAP" + ], + "metadata": { + "id": "RDQrtQhvC3oG" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "vtSmUZcxIAvt" + }, + "source": [ + "We've trained our custom model and converted it to TFLite format. But how well does it actually perform at detecting objects in images? This is where the images we set aside in the **test** folder come in. 
The model never saw any test images during training, so its performance on these images should be representative of how it will perform on new images from the field.\n", + "\n", + "### 7.1 Inference test images\n", + "The following code defines a function to run inference on test images. It loads the images, loads the model and labelmap, runs the model on each image, and displays the result. It also optionally saves detection results as text files so we can use them to calculate model mAP score.\n", + "\n", + "This code is based off the [TFLite_detection_image.py](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/TFLite_detection_image.py) script from my [TensorFlow Lite Object Detection repository on GitHub](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi); feel free to use it as a starting point for your own application." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "e4WtI8i5K96w" + }, + "outputs": [], + "source": [ + "# Script to run custom TFLite model on test images to detect objects\n", + "# Source: https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/TFLite_detection_image.py\n", + "\n", + "# Import packages\n", + "import os\n", + "import cv2\n", + "import numpy as np\n", + "import sys\n", + "import glob\n", + "import random\n", + "import importlib.util\n", + "from tensorflow.lite.python.interpreter import Interpreter\n", + "\n", + "import matplotlib\n", + "import matplotlib.pyplot as plt\n", + "\n", + "%matplotlib inline\n", + "\n", + "### Define function for inferencing with TFLite model and displaying results\n", + "\n", + "def tflite_detect_images(modelpath, imgpath, lblpath, min_conf=0.5, num_test_images=10, savepath='/content/results', txt_only=False):\n", + "\n", + " # Grab filenames of all images in test folder\n", + " images = glob.glob(imgpath + 
'/*.jpg') + glob.glob(imgpath + '/*.JPG') + glob.glob(imgpath + '/*.png') + glob.glob(imgpath + '/*.bmp')\n", + "\n", + " # Load the label map into memory\n", + " with open(lblpath, 'r') as f:\n", + " labels = [line.strip() for line in f.readlines()]\n", + "\n", + " # Load the TensorFlow Lite model into memory\n", + " interpreter = Interpreter(model_path=modelpath)\n", + " interpreter.allocate_tensors()\n", + "\n", + " # Get model details\n", + " input_details = interpreter.get_input_details()\n", + " output_details = interpreter.get_output_details()\n", + " height = input_details[0]['shape'][1]\n", + " width = input_details[0]['shape'][2]\n", + "\n", + " float_input = (input_details[0]['dtype'] == np.float32)\n", + "\n", + " input_mean = 127.5\n", + " input_std = 127.5\n", + "\n", + " # Randomly select test images (capped at the number of images available, since random.sample raises an error otherwise)\n", + " images_to_test = random.sample(images, min(num_test_images, len(images)))\n", + "\n", + " # Loop over every image and perform detection\n", + " for image_path in images_to_test:\n", + "\n", + " # Load image and resize to expected shape [1xHxWx3]\n", + " image = cv2.imread(image_path)\n", + " image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n", + " imH, imW, _ = image.shape\n", + " image_resized = cv2.resize(image_rgb, (width, height))\n", + " input_data = np.expand_dims(image_resized, axis=0)\n", + "\n", + " # Normalize pixel values if using a floating model (i.e. 
if model is non-quantized)\n", + " if float_input:\n", + " input_data = (np.float32(input_data) - input_mean) / input_std\n", + "\n", + " # Perform the actual detection by running the model with the image as input\n", + " interpreter.set_tensor(input_details[0]['index'],input_data)\n", + " interpreter.invoke()\n", + "\n", + " # Retrieve detection results\n", + " boxes = interpreter.get_tensor(output_details[1]['index'])[0] # Bounding box coordinates of detected objects\n", + " classes = interpreter.get_tensor(output_details[3]['index'])[0] # Class index of detected objects\n", + " scores = interpreter.get_tensor(output_details[0]['index'])[0] # Confidence of detected objects\n", + "\n", + " detections = []\n", + "\n", + " # Loop over all detections and draw detection box if confidence is above minimum threshold\n", + " for i in range(len(scores)):\n", + " if ((scores[i] > min_conf) and (scores[i] <= 1.0)):\n", + "\n", + " # Get bounding box coordinates and draw box\n", + " # Interpreter can return coordinates that are outside of image dimensions, need to force them to be within image using max() and min()\n", + " ymin = int(max(1,(boxes[i][0] * imH)))\n", + " xmin = int(max(1,(boxes[i][1] * imW)))\n", + " ymax = int(min(imH,(boxes[i][2] * imH)))\n", + " xmax = int(min(imW,(boxes[i][3] * imW)))\n", + "\n", + " cv2.rectangle(image, (xmin,ymin), (xmax,ymax), (10, 255, 0), 2)\n", + "\n", + " # Draw label\n", + " object_name = labels[int(classes[i])] # Look up object name from \"labels\" array using class index\n", + " label = '%s: %d%%' % (object_name, int(scores[i]*100)) # Example: 'person: 72%'\n", + " labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2) # Get font size\n", + " label_ymin = max(ymin, labelSize[1] + 10) # Make sure not to draw label too close to top of window\n", + " cv2.rectangle(image, (xmin, label_ymin-labelSize[1]-10), (xmin+labelSize[0], label_ymin+baseLine-10), (255, 255, 255), cv2.FILLED) # Draw white box to put label 
text in\n", + " cv2.putText(image, label, (xmin, label_ymin-7), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 0), 2) # Draw label text\n", + "\n", + " detections.append([object_name, scores[i], xmin, ymin, xmax, ymax])\n", + "\n", + "\n", + " # All the results have been drawn on the image, now display the image\n", + " if txt_only == False: # \"txt_only\" controls whether we want to display the image results or just save them in .txt files\n", + " image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)\n", + " plt.figure(figsize=(12,16))\n", + " plt.imshow(image)\n", + " plt.show()\n", + "\n", + " # Save detection results in .txt files (for calculating mAP)\n", + " elif txt_only == True:\n", + "\n", + " # Get filenames and paths\n", + " image_fn = os.path.basename(image_path)\n", + " base_fn, ext = os.path.splitext(image_fn)\n", + " txt_result_fn = base_fn +'.txt'\n", + " txt_savepath = os.path.join(savepath, txt_result_fn)\n", + "\n", + " # Write results to text file\n", + " # (Using format defined by https://github.com/Cartucho/mAP, which will make it easy to calculate mAP)\n", + " with open(txt_savepath,'w') as f:\n", + " for detection in detections:\n", + " f.write('%s %.4f %d %d %d %d\\n' % (detection[0], detection[1], detection[2], detection[3], detection[4], detection[5]))\n", + "\n", + " return" + ] + }, + { + "cell_type": "markdown", + "source": [ + "The next block sets the paths to the test images and models and then runs the inferencing function. If you want to use more than 10 images, change the `images_to_test` variable. Click Play to run inferencing!" 
+ ], + "metadata": { + "id": "-CJI4A0f_zqz" + } + }, + { + "cell_type": "code", + "source": [ + "# Set up variables for running user's model\n", + "PATH_TO_IMAGES='/content/images/test' # Path to test images folder\n", + "PATH_TO_MODEL='/content/custom_model_lite/detect.tflite' # Path to .tflite model file\n", + "PATH_TO_LABELS='/content/labelmap.txt' # Path to labelmap.txt file\n", + "min_conf_threshold=0.5 # Confidence threshold (try changing this to 0.01 if you don't see any detection results)\n", + "images_to_test = 10 # Number of images to run detection on\n", + "\n", + "# Run inferencing function!\n", + "tflite_detect_images(PATH_TO_MODEL, PATH_TO_IMAGES, PATH_TO_LABELS, min_conf_threshold, images_to_test)" + ], + "metadata": { + "id": "6t8CMarqBqP9" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### 7.2 Calculate mAP\n", + "Now we have a visual sense of how our model performs on test images, but how can we quantitatively measure its accuracy?\n", + "\n", + "One popular method for measuring object detection model accuracy is \"mean average precision\" (mAP). Basically, the higher the mAP score, the better your model is at detecting objects in images. To learn more about mAP, read through this [article from Roboflow](https://blog.roboflow.com/mean-average-precision/).\n", + "\n", + "We'll use the mAP calculator tool at https://github.com/Cartucho/mAP to determine our model's mAP score. First, we need to clone the repository and remove its existing example data. We'll also download a script I wrote for interfacing with the calculator." 
+ ], + "metadata": { + "id": "N_ckqeWqBF0P" + } + }, + { + "cell_type": "code", + "source": [ + "%%bash\n", + "git clone https://github.com/Cartucho/mAP /content/mAP\n", + "cd /content/mAP\n", + "rm input/detection-results/*\n", + "rm input/ground-truth/*\n", + "rm input/images-optional/*\n", + "wget https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/util_scripts/calculate_map_cartucho.py" + ], + "metadata": { + "id": "JlWarXEZDUqS" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Next, we'll copy the images and annotation data from the **test** folder to the appropriate folders inside the cloned repository. These will be used as the \"ground truth data\" that our model's detection results will be compared to.\n" + ], + "metadata": { + "id": "qn22nGGqH5T6" + } + }, + { + "cell_type": "code", + "source": [ + "!cp /content/images/test/* /content/mAP/input/images-optional # Copy images and xml files\n", + "!mv /content/mAP/input/images-optional/*.xml /content/mAP/input/ground-truth/ # Move xml files to the appropriate folder" + ], + "metadata": { + "id": "5szFfVxwI3wT" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "The calculator tool expects annotation data in a format that's different from the Pascal VOC .xml file format we're using. Fortunately, it provides an easy script, `convert_gt_xml.py`, for converting to the expected .txt format.\n", + "\n" + ], + "metadata": { + "id": "u6aro817DGzx" + } + }, + { + "cell_type": "code", + "source": [ + "!python /content/mAP/scripts/extra/convert_gt_xml.py" + ], + "metadata": { + "id": "qdjtOUDnK2AA" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Okay, we've set up the ground truth data, but now we need actual detection results from our model. 
The detection results will be compared to the ground truth data to calculate the model's accuracy in mAP.\n", + "\n", + "The inference function we defined in Step 7.1 can be used to generate detection data for all the images in the **test** folder. We'll use it the same as before, except this time we'll tell it to save detection results into the `detection-results` folder.\n", + "\n", + "Click Play to run the following code block!" + ], + "metadata": { + "id": "mnIUacAlLP0B" + } + }, + { + "cell_type": "code", + "source": [ + "# Set up variables for running inference, this time to get detection results saved as .txt files\n", + "PATH_TO_IMAGES='/content/images/test' # Path to test images folder\n", + "PATH_TO_MODEL='/content/custom_model_lite/detect.tflite' # Path to .tflite model file\n", + "PATH_TO_LABELS='/content/labelmap.txt' # Path to labelmap.txt file\n", + "PATH_TO_RESULTS='/content/mAP/input/detection-results' # Folder to save detection results in\n", + "min_conf_threshold=0.1 # Confidence threshold\n", + "\n", + "# Use all the images in the test folder\n", + "image_list = glob.glob(PATH_TO_IMAGES + '/*.jpg') + glob.glob(PATH_TO_IMAGES + '/*.JPG') + glob.glob(PATH_TO_IMAGES + '/*.png') + glob.glob(PATH_TO_IMAGES + '/*.bmp')\n", + "images_to_test = min(500, len(image_list)) # If there are more than 500 images in the folder, just use 500\n", + "\n", + "# Tell function to just save results and not display images\n", + "txt_only = True\n", + "\n", + "# Run inferencing function!\n", + "print('Starting inference on %d images...' % images_to_test)\n", + "tflite_detect_images(PATH_TO_MODEL, PATH_TO_IMAGES, PATH_TO_LABELS, min_conf_threshold, images_to_test, PATH_TO_RESULTS, txt_only)\n", + "print('Finished inferencing!')" + ], + "metadata": { + "id": "szzHFAhsMNFF" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Finally, let's calculate mAP! 
One popular style for reporting mAP is the COCO metric for mAP @ 0.50:0.95. Basically, this means that mAP is calculated at several IoU thresholds between 0.50 and 0.95, and then the results from each threshold are averaged to get a final mAP score. [Learn more here!](https://blog.roboflow.com/mean-average-precision/)\n", + "\n", + "I wrote a script to run the calculator tool at each IoU threshold, average the results, and report the final accuracy score. It reports mAP for each class and overall mAP. Click Play on the following block to calculate mAP!" + ], + "metadata": { + "id": "e_QRnTqNPX4z" + } + }, + { + "cell_type": "code", + "source": [ + "%cd /content/mAP\n", + "!python calculate_map_cartucho.py --labels=/content/labelmap.txt" + ], + "metadata": { + "id": "3DkjpIBARTQ7" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "The score reported at the end is your model's overall mAP score. Ideally, it should be above 50% (0.50). If it isn't, you can usually increase your model's accuracy by adding more images to your dataset. See my [dataset video](https://www.youtube.com/watch?v=v0ssiOY6cfg) for tips on how to capture good training images and improve accuracy." + ], + "metadata": { + "id": "R9HPoOBVKvxU" + } + }, + { + "cell_type": "markdown", + "source": [ + "# 8. Deploy TensorFlow Lite Model" + ], + "metadata": { + "id": "5i40ve0SCLaE" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "phT8vvzriqQp" + }, + "source": [ + "Now that your custom model has been trained and converted to TFLite format, it's ready to be downloaded and deployed in an application! This section shows how to download the model and provides links to instructions for deploying it on the Raspberry Pi, your PC, or other edge devices." + ] + }, + { + "cell_type": "markdown", + "source": [ + "## 8.1. 
Download TFLite model\n", + "\n", + "Run the two following cells to copy the labelmap files into the model folder, compress it into a zip folder, and then download it. The zip folder contains the `detect.tflite` model and `labelmap.txt` labelmap files that are needed to run the model in your application." + ], + "metadata": { + "id": "zq3L2IoP4VHp" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "awZMQGVqMpVL" + }, + "outputs": [], + "source": [ + "# Move labelmap and pipeline config files into TFLite model folder and zip it up\n", + "!cp /content/labelmap.txt /content/custom_model_lite\n", + "!cp /content/labelmap.pbtxt /content/custom_model_lite\n", + "!cp /content/models/mymodel/pipeline_file.config /content/custom_model_lite\n", + "\n", + "%cd /content\n", + "!zip -r custom_model_lite.zip custom_model_lite" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "FVPfAGbNPV56" + }, + "outputs": [], + "source": [ + "from google.colab import files\n", + "\n", + "files.download('/content/custom_model_lite.zip')" + ] + }, + { + "cell_type": "markdown", + "source": [ + "The `custom_model_lite.zip` file containing the model will download into your Downloads folder. It's ready to be deployed on your device!" + ], + "metadata": { + "id": "9Kb3ZBsMq95l" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GSJ2wgGCixy2" + }, + "source": [ + "## 8.2. Deploy model\n", + "TensorFlow Lite models can run on a wide variety of hardware, including PCs, embedded systems, and phones. This section provides instructions showing how to deploy your TFLite model on various devices.\n", + "\n", + "### 8.2.1. Deploy on Raspberry Pi\n", + "TFLite models are great for running on the Raspberry Pi, because they require less processing power than regular TensorFlow vision models. 
The Pi can run TFLite models in near real-time.\n", + "\n", + "To run your new model on the Raspberry Pi, you'll have to install TensorFlow Lite and prepare a Python environment for your application. I provide step-by-step instructions on how to set up TFLite on the Pi in my video, [How To Run TensorFlow Lite on Raspberry Pi for Object Detection](https://youtu.be/aimSGOAUI8Y).\n", + "\n", + "[![Link to my YouTube video!](https://raw.githubusercontent.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/master/doc/YouTube_video1.JPG)](https://www.youtube.com/watch?v=aimSGOAUI8Y)\n", + "\n", + "Once you've completed all the steps in the video, copy the `custom_model_lite.zip` file downloaded from this Colab session to the `~/tflite1` folder on your Raspberry Pi. Move into the folder and unzip it by issuing:\n", + "\n", + "```\n", + "cd ~/tflite1\n", + "unzip custom_model_lite.zip\n", + "```\n", + "\n", + "Then, run the image, video, or webcam TFLite detection program with the `--modeldir=custom_model_lite` argument. For example, to run the webcam detection program, issue:\n", + "\n", + "```\n", + "python TFLite_detection_webcam.py --modeldir=custom_model_lite\n", + "```\n", + "\n", + "A window will appear showing a live feed from your webcam with boxes drawn around detected objects in each frame.\n", + "\n", + "### 8.2.2. Deploy on Windows, Linux, or macOS\n", + "Follow the instructions linked below to quickly set up your Windows, Linux, or macOS computer to run TFLite models. It only takes a few minutes! Running a model on your PC is good for quickly testing your model with a webcam. 
However, keep in mind that the TFLite Runtime is optimized for lower-power processors, and it won't utilize the full capability of your PC's processor.\n", + "\n", + "Here are links to the deployment guides for Windows, Linux, and macOS:\n", + "* [How to Run TensorFlow Lite Models on Windows](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/deploy_guides/Windows_TFLite_Guide.md)\n", + "* *link to Linux guide to be added (but really it's the same as Raspberry Pi)*\n", + "* [How to Run TensorFlow Lite Models on macOS](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/deploy_guides/MacOS_TFLite_Guide.md)\n", + "\n", + "### 8.2.3. Deploy on other Linux-based edge devices\n", + "Instructions to be added! 🐧\n", + "\n", + "### 8.2.4. Deploy on Android\n", + "Instructions to be added! 🤖\n", + "\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# 9. (Optional) Post-Training Quantization" + ], + "metadata": { + "id": "WoptFnAhCSrR" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "I54paUm8dUCr" + }, + "source": [ + "Want to make your TFLite model run even faster? Quantize it! Quantizing a model converts its weights from 32-bit floating-point values to 8-bit integer values. This allows the quantized model to run faster and occupy less memory without too much reduction in accuracy.\n", + "\n", + "> Note: If you observe an obvious decrease in detection accuracy when quantizing your model with TF2, I recommend using TensorFlow 1 to quantize your model instead. TF1 supports quantization-aware training, which helps improve the accuracy of quantized models. The ssd-mobilenet-v2-quantized model from the TF1 Model Zoo has fast and accurate performance when trained with a custom dataset. 
Visit my [TFLite v1 Colab notebook](https://colab.research.google.com/github/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/Train_TFLite1_Object_Detection_Model.ipynb) for step-by-step instructions on how to train and quantize a model with TensorFlow 1." + ] + }, + { + "cell_type": "markdown", + "source": [ + "## 9.1. Quantize model\n", + "We'll use the \"TFLiteConverter\" module to perform [post-training quantization](https://www.tensorflow.org/lite/performance/post_training_quantization) on the model. To quantize the model, we need to provide a representative dataset, which is a set of images that represent what the model will see when deployed in the field. First, we'll create a list of images to include in the representative dataset (we'll just use the images in the `train` folder).\n" + ], + "metadata": { + "id": "VTyqlXFTJ0Uv" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "XSNZtfj_k3NP" + }, + "outputs": [], + "source": [ + "# Get list of all images in train directory\n", + "image_path = '/content/images/train'\n", + "\n", + "jpg_file_list = glob.glob(image_path + '/*.jpg')\n", + "JPG_file_list = glob.glob(image_path + '/*.JPG')\n", + "png_file_list = glob.glob(image_path + '/*.png')\n", + "bmp_file_list = glob.glob(image_path + '/*.bmp')\n", + "\n", + "quant_image_list = jpg_file_list + JPG_file_list + png_file_list + bmp_file_list" + ] + }, + { + "cell_type": "markdown", + "source": [ + "Next, we'll define a function to yield images from our representative dataset. Refer to [TensorFlow's sample quantization code](https://colab.research.google.com/github/google-coral/tutorials/blob/master/retrain_classification_ptq_tf2.ipynb#scrollTo=kRDabW_u1wnv) to get a better understanding of what this is doing!" 
+ ], + "metadata": { + "id": "cqbH1VlEgiuy" + } + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ORzx0XRErSLV" + }, + "outputs": [], + "source": [ + "# A generator that provides a representative dataset\n", + "# Code modified from https://colab.research.google.com/github/google-coral/tutorials/blob/master/retrain_classification_ptq_tf2.ipynb\n", + "\n", + "# First, get input details for model so we know how to preprocess images\n", + "interpreter = Interpreter(model_path=PATH_TO_MODEL) # PATH_TO_MODEL is defined in Step 7 above\n", + "interpreter.allocate_tensors()\n", + "input_details = interpreter.get_input_details()\n", + "output_details = interpreter.get_output_details()\n", + "height = input_details[0]['shape'][1]\n", + "width = input_details[0]['shape'][2]\n", + "\n", + "import random\n", + "\n", + "def representative_data_gen():\n", + " dataset_list = quant_image_list\n", + " quant_num = 300 # Number of randomly chosen images to yield\n", + " for i in range(quant_num):\n", + " pick_me = random.choice(dataset_list)\n", + " image = tf.io.read_file(pick_me)\n", + "\n", + " if pick_me.endswith('.jpg') or pick_me.endswith('.JPG'):\n", + " image = tf.io.decode_jpeg(image, channels=3)\n", + " elif pick_me.endswith('.png'):\n", + " image = tf.io.decode_png(image, channels=3)\n", + " elif pick_me.endswith('.bmp'):\n", + " image = tf.io.decode_bmp(image, channels=3)\n", + "\n", + " image = tf.image.resize(image, [height, width]) # tf.image.resize expects [height, width]; both are read from the model's input details above\n", + " image = tf.cast(image / 255., tf.float32)\n", + " image = tf.expand_dims(image, 0)\n", + " yield [image]" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wqtu98mzebEj" + }, + "source": [ + "Finally, we'll initialize the TFLiteConverter module, point it at the TFLite graph we generated in Step 6, and provide it with the representative dataset generator function we created in the previous code block. 
We'll configure the converter to quantize the model's weight values to INT8 format." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "Ox0bGDWds_Ce" + }, + "outputs": [], + "source": [ + "# Initialize converter module\n", + "converter = tf.lite.TFLiteConverter.from_saved_model('/content/custom_model_lite/saved_model')\n", + "\n", + "# This enables quantization\n", + "converter.optimizations = [tf.lite.Optimize.DEFAULT]\n", + "# This sets the representative dataset for quantization\n", + "converter.representative_dataset = representative_data_gen\n", + "# This allows INT8 ops, with the float builtins as a fallback for any ops that can't be quantized\n", + "# (to make the converter throw an error instead, list only TFLITE_BUILTINS_INT8)\n", + "converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS, tf.lite.OpsSet.TFLITE_BUILTINS_INT8]\n", + "# For full integer quantization, although supported_types defaults to int8 only, we explicitly declare it for clarity.\n", + "converter.target_spec.supported_types = [tf.int8]\n", + "# These set the input tensors to uint8 and output tensors to float32\n", + "converter.inference_input_type = tf.uint8\n", + "converter.inference_output_type = tf.float32\n", + "tflite_model = converter.convert()\n", + "\n", + "with open('/content/custom_model_lite/detect_quant.tflite', 'wb') as f:\n", + " f.write(tflite_model)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## 9.2. Test quantized model\n", + "The model has been quantized and exported as `detect_quant.tflite`. Let's test it out! We'll re-use the function from Section 7 for running the model on test images and displaying the results, except this time we'll point it at the quantized model.\n", + "\n", + "Click Play on the code block below to test the `detect_quant.tflite` model." 
+ ], + "metadata": { + "id": "dYVVlv5QUUZF" + } + }, + { + "cell_type": "code", + "source": [ + "# Set up parameters for inferencing function (using detect_quant.tflite instead of detect.tflite)\n", + "PATH_TO_IMAGES='/content/images/test' # Path to test images folder\n", + "PATH_TO_MODEL='/content/custom_model_lite/detect_quant.tflite' # Path to .tflite model file\n", + "PATH_TO_LABELS='/content/labelmap.txt' # Path to labelmap.txt file\n", + "min_conf_threshold=0.5 # Confidence threshold (try changing this to 0.01 if you don't see any detection results)\n", + "images_to_test = 10 # Number of images to run detection on\n", + "\n", + "# Run inferencing function!\n", + "tflite_detect_images(PATH_TO_MODEL, PATH_TO_IMAGES, PATH_TO_LABELS, min_conf_threshold, images_to_test)" + ], + "metadata": { + "id": "6OoirJuOtdOG" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "If your quantized model isn't performing very well, try using my TensorFlow Lite 1 notebook *(link to be added)* to train an SSD-MobileNet model with your dataset. In my experience, the `ssd-mobilenet-v2-quantized` model from the [TF1 Model Zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md) has the best quantized performance of any TensorFlow Lite model.\n", + "\n", + "TFLite models created with TensorFlow 1 are still compatible with the TensorFlow Lite 2 runtime, so your TFLite 1 model will still work with my [TensorFlow setup guide for the Raspberry Pi](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/Raspberry_Pi_Guide.md)." + ], + "metadata": { + "id": "cKo7ZtfOyoxG" + } + }, + { + "cell_type": "markdown", + "source": [ + "## 9.3. Calculate quantized model mAP\n", + "\n", + "Let's calculate the quantized model's mAP using the calculator tool we set up in Step 7.2. 
We just need to perform inference with our quantized model (`detect_quant.tflite`) to get a new set of detection results.\n", + "\n", + "Run the following block to run inference on the test images and save the detection results." + ], + "metadata": { + "id": "vWdVxs6LUjbR" + } + }, + { + "cell_type": "code", + "source": [ + "# Need to remove existing detection results first\n", + "!rm /content/mAP/input/detection-results/*\n", + "\n", + "# Set up variables for running inference, this time to get detection results saved as .txt files\n", + "PATH_TO_IMAGES='/content/images/test' # Path to test images folder\n", + "PATH_TO_MODEL='/content/custom_model_lite/detect_quant.tflite' # Path to quantized .tflite model file\n", + "PATH_TO_LABELS='/content/labelmap.txt' # Path to labelmap.txt file\n", + "PATH_TO_RESULTS='/content/mAP/input/detection-results' # Folder to save detection results in\n", + "min_conf_threshold=0.1 # Confidence threshold\n", + "\n", + "# Use all the images in the test folder\n", + "image_list = glob.glob(PATH_TO_IMAGES + '/*.jpg') + glob.glob(PATH_TO_IMAGES + '/*.JPG') + glob.glob(PATH_TO_IMAGES + '/*.png') + glob.glob(PATH_TO_IMAGES + '/*.bmp')\n", + "images_to_test = min(500, len(image_list)) # If there are more than 500 images in the folder, just use 500\n", + "\n", + "# Tell function to just save results and not display images\n", + "txt_only = True\n", + "\n", + "# Run inferencing function!\n", + "print('Starting inference on %d images...' % images_to_test)\n", + "tflite_detect_images(PATH_TO_MODEL, PATH_TO_IMAGES, PATH_TO_LABELS, min_conf_threshold, images_to_test, PATH_TO_RESULTS, txt_only)\n", + "print('Finished inferencing!')" + ], + "metadata": { + "id": "ZMaumV-11Et0" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "Now we can run the mAP calculation script to determine our quantized model's mAP." 
+ ], + "metadata": { + "id": "QgcmdLQf1Et1" + } + }, + { + "cell_type": "code", + "source": [ + "cd /content/mAP" + ], + "metadata": { + "id": "ZIRNp0Af1Et1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "!python calculate_map_cartucho.py --labels=/content/labelmap.txt" + ], + "metadata": { + "id": "4TDgMBw_1Et1" + }, + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "XFsuasvxFHo8" + }, + "source": [ + "## 9.4. Compile model for Edge TPU\n", + "\n", + "Now that the model has been converted to TFLite and quantized, we can compile it to run on Edge TPU devices like the [Coral USB Accelerator](https://coral.ai/products/accelerator/) or the [Coral Dev Board](https://coral.ai/products/dev-board/). This allows the model to run much faster! For information on how to set up the USB Accelerator, see my [TensorFlow Lite repository on GitHub](https://github.com/EdjeElectronics/TensorFlow-Lite-Object-Detection-on-Android-and-Raspberry-Pi/blob/master/deploy_guides/Raspberry_Pi_Guide.md#section-2---run-edge-tpu-object-detection-models-on-the-raspberry-pi-using-the-coral-usb-accelerator).\n", + "\n", + "First, install the Edge TPU Compiler package inside this Colab instance." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "mUd_SNC0JSq0" + }, + "outputs": [], + "source": [ + "! curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -\n", + "! echo \"deb https://packages.cloud.google.com/apt coral-edgetpu-stable main\" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list\n", + "! sudo apt-get update\n", + "! sudo apt-get install edgetpu-compiler" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "usfmdtSiJuuC" + }, + "source": [ + "Next, compile the quantized TFLite model. 
(If your model has a different filename than \"detect_quant.tflite\", use that instead.)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "mULCY0nb0ahH" + }, + "outputs": [], + "source": [ + "%cd /content/custom_model_lite\n", + "!edgetpu_compiler detect_quant.tflite\n", + "!mv detect_quant_edgetpu.tflite edgetpu.tflite\n", + "!rm detect_quant_edgetpu.log" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oqGy2FgzKomN" + }, + "source": [ + "The compiled model will be output in the `custom_model_lite` folder as \"detect_quant_edgetpu.tflite\". It gets renamed to \"edgetpu.tflite\" to be consistent with my code. Zip the `custom_model_lite` folder and download it by running the two code blocks below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "8nCdUouYJjQM" + }, + "outputs": [], + "source": [ + "%cd /content\n", + "!zip -r custom_model_lite.zip custom_model_lite" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "AmjqvKuuK8ZR" + }, + "outputs": [], + "source": [ + "from google.colab import files\n", + "\n", + "files.download('custom_model_lite.zip')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ptwpBBEWLfuJ" + }, + "source": [ + "Now you're all set to use the Coral model! For instructions on how to run an object detection model on the Raspberry Pi using the Coral USB Accelerator, please see my video, [\"How to Use the Coral USB Accelerator with the Raspberry Pi\"](https://www.youtube.com/watch?v=qJMwNHQNOVU)." + ] + }, + { + "cell_type": "markdown", + "source": [ + "# Appendix: Common Errors" + ], + "metadata": { + "id": "5VI_Gh5dCd7w" + } + }, + { + "cell_type": "markdown", + "metadata": { + "id": "sEbd9cO7I_o3" + }, + "source": [ + "Here are solutions to common errors that can occur while stepping through this notebook.\n", + "\n", + "**1. 
Training suddenly stops with ^C output**\n", + "\n", + "If your training randomly stops without any error messages except a `^C`, that means the virtual machine has run out of memory. To resolve the issue, try reducing the `batch_size` variable in Step 4 to a lower value like `batch_size = 4`. The value must be a power of 2 (e.g., 2, 4, 8).\n", + "\n", + "Source: https://stackoverflow.com/questions/75901898/why-my-model-training-automatically-stopped-during-training" + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "provenance": [], + "toc_visible": true, + "collapsed_sections": [ + "4VAvZo8qE4u5", + "sxb8_h-QFErO", + "eydREUsMGUUR", + "eGEUZYAMEZ6f", + "-19zML6oEO7l", + "kPg8oMnQDYKl", + "RDQrtQhvC3oG", + "5i40ve0SCLaE", + "WoptFnAhCSrR", + "5VI_Gh5dCd7w" + ], + "authorship_tag": "ABX9TyPekJ7L67HZ5UdwmDoiqGpy", + "include_colab_link": true + }, + "gpuClass": "standard", + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} \ No newline at end of file