In the previous part of this tutorial, we learned how to run the YOLO object recognition process on images, on video, or in live mode by using a webcam. Now we want to train the YOLO neural network on our own dataset.
The first thing we need to do is prepare the training dataset. If you haven’t done it yet, clone this ready-to-use code repository containing a version of the Darkflow neural network, and follow the setup instructions that were explained in the previous part of this tutorial. Once cloned, create the following directory structure inside the root folder of the repository:
- train
  - images
  - annotations
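If you prefer to create these folders from a script, the following is a minimal sketch. It assumes the script is started from the root folder of the repository; the function name create_training_dirs is only illustrative and is not part of the repository.

```python
from pathlib import Path


def create_training_dirs(repo_root):
    """Create train/images and train/annotations under repo_root."""
    train_dir = Path(repo_root) / "train"
    created = []
    for name in ("images", "annotations"):
        folder = train_dir / name
        # parents=True also creates train/; exist_ok=True makes reruns safe
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    # Assumption: the current directory is the root folder of the repository
    for folder in create_training_dirs("."):
        print(folder)
```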
Now you can put your images inside the images folder. To train an effective model, you should provide many varied images for each object class that you want to recognise.
Inside the cfg folder of the repository, create a copy of the yolo.cfg file, named yolo-own.cfg, and open it with a text editor. You have to make a few small changes to the file content:
- Inside the [region] layer, update the classes value with the number of object classes that you want to recognise, and make a note of the num value, which is needed in the next step.
- Inside the last [convolutional] layer, update the filters value with the result of the following expression: num * (classes + 5).
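The expression can be checked with a couple of lines of Python. In the sketch below, the values of num and classes are only examples: read the real num from your [region] layer and use your own class count. The 5 in the formula accounts for the four box coordinates plus one confidence score that are predicted for each box.

```python
def last_layer_filters(num, classes):
    """Filters for the last [convolutional] layer: num * (classes + 5)."""
    return num * (classes + 5)


if __name__ == "__main__":
    num = 5      # example value; use the num found in your [region] layer
    classes = 3  # hypothetical number of custom object classes
    print(last_layer_filters(num, classes))  # 5 * (3 + 5) = 40
```

For instance, with num = 5 and 80 classes, the same expression gives 425.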
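Once the file changes are saved, open the labels.txt file, contained inside the root folder of the repository, and edit it so that it lists the names of the object classes you want to recognise, one name per line. The number of names should match the classes value that you set in yolo-own.cfg.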
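As a sketch, the label file can also be written and checked from Python. This assumes that labels.txt holds one class name per line; the helper names write_labels and read_labels are illustrative, not part of the repository.

```python
from pathlib import Path


def write_labels(labels_path, class_names):
    """Write one class name per line and return how many were written."""
    names = [name.strip() for name in class_names if name.strip()]
    if len(set(names)) != len(names):
        raise ValueError("class names must be unique")
    Path(labels_path).write_text("\n".join(names) + "\n", encoding="utf-8")
    return len(names)


def read_labels(labels_path):
    """Return the non-empty lines of the label file as a list of names."""
    text = Path(labels_path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]
```

The number returned by write_labels is the value to compare with the classes entry of your configuration file.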
Before running the training phase, the last operation to perform is to generate the annotations, which specify the labels for each image contained in the training dataset. Writing the annotation files by hand is possible, but it would take far too long. For this reason, we are going to use an external (free) labelling tool named LabelImg. This software has some setup requirements. On Windows, you have to install the following Python libraries:
pip install PyQt5
pip install lxml
Inside the labelImg folder of the repository, run the following commands by using a command prompt:
pyrcc5 -o libs/resources.py resources.qrc
python labelImg.py
This software tool is very easy to use, and allows us to generate the annotations required by the training phase. For each image, we can mark the position of one or more objects of interest and assign each of them a textual classification label. To label your entire dataset, you need to perform the following actions:
- Open the LabelImg GUI.
- Click on the “Change Save Dir” button and select the train/annotations folder from the repository.
- Click on the “Open Dir” button and select the train/images folder from the repository.
- Select one image from the dataset by using the right-side list.
- Click on the “Create RectBox” button and draw a box around the object of interest on the image.
- Write the label name that identifies the object class (you have to use the same label names from the labels.txt file).
- Repeat these operations for each training image, as in the image below:

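Since every label typed in LabelImg has to match a name from labels.txt, a quick consistency check can catch typos before training. The sketch below assumes the annotations were saved as Pascal VOC style XML files, where each labelled box is an object element containing a name element; the function names are illustrative.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def labels_in_annotation(xml_path):
    """Return the label of every <object> found in one annotation file."""
    root = ET.parse(xml_path).getroot()
    return [obj.findtext("name", default="").strip()
            for obj in root.iter("object")]


def find_unknown_labels(annotations_dir, known_labels):
    """Map each annotation file name to the labels missing from known_labels."""
    known = set(known_labels)
    problems = {}
    for xml_path in sorted(Path(annotations_dir).glob("*.xml")):
        unknown = [label for label in labels_in_annotation(xml_path)
                   if label not in known]
        if unknown:
            problems[xml_path.name] = unknown
    return problems
```

An empty dictionary means that every label used in the annotations is one of the known class names.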
Here we are! Now it’s time to execute the training phase. Inside the root folder of the repository, run the following command by using a command prompt:
python flow --model cfg/yolo-own.cfg --load bin/yolo.weights --train --annotation train/annotations --dataset train/images
The training process passes over the whole set of training images once per epoch. The default number of epochs is 1000, so if you need a coffee break, now is the right time!
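To get a rough idea of how long the training will run, you can estimate the number of training steps. This is only a sketch: it assumes that the images are processed in fixed-size batches and that an incomplete final batch is skipped, and the image count and batch size below are hypothetical.

```python
def total_training_steps(num_images, batch_size, epochs):
    """Estimate the number of training steps (one step per full batch)."""
    # Assumption: an incomplete final batch is dropped in every epoch
    batches_per_epoch = num_images // batch_size
    return batches_per_epoch * epochs


if __name__ == "__main__":
    # Hypothetical dataset: 200 images, batches of 16, 1000 epochs
    print(total_training_steps(200, 16, 1000))  # 12 * 1000 = 12000
```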
Once the training phase is finished, open the run.py script contained inside the root folder of the repository by using your favourite Python IDE. Inside the main function, update the options constant with the following value:
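# Set the gpu parameter to 0.0 if you want to run on the CPU.
# Assumption: "load": -1 makes Darkflow restore the most recent checkpoint that the
# training run saved in the ckpt folder; "./bin/yolo.weights" would load the
# pretrained weights instead of the ones you have just trained.
options = {"model": "./cfg/yolo-own.cfg",
           "load": -1,
           "threshold": 0.1,
           "gpu": 1.0}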
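The threshold option is the minimum confidence a detection needs in order to be reported. The sketch below illustrates the idea on plain Python data. It assumes that each detection is a dictionary with "label" and "confidence" keys, and the sample values are made up for illustration.

```python
def filter_predictions(predictions, threshold):
    """Keep only the detections whose confidence is above the threshold."""
    return [p for p in predictions if p["confidence"] > threshold]


if __name__ == "__main__":
    # Made-up detections, for illustration only
    sample = [
        {"label": "cat", "confidence": 0.82},
        {"label": "dog", "confidence": 0.05},
    ]
    for detection in filter_predictions(sample, 0.1):
        print(detection["label"], detection["confidence"])
```

A low threshold such as 0.1 reports more objects but also more false detections; raising it keeps only the detections the network is more confident about.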
Great! Finally, we are ready to test the object recognition process on our dataset! As explained previously, the run.py Python script allows us to execute the YOLO object recognition process on images, video or in live mode by running one of the following commands:
python run.py image <YOUR_IMAGE_PATH>
python run.py video <YOUR_VIDEO_PATH>
python run.py camera
And that’s it!
Now you can run the YOLO object recognition system on your dataset.
Happy coding!