Object detection with PyTorch and YOLO
Created by Bigge, Julian, last modified by Potthoff, Sebastian on 02. Jul 2021
The goal of object detection on images is to classify and locate (multiple) occurrences of different object categories. A wide variety of neural network architectures tackle this task.
These networks are typically compared on well-known datasets such as COCO with regard to their mAP (mean average precision) score.
Some popular choices at the time of writing are:
- Faster R-CNN
- SSD-MobileNet
- YOLOv4
The YOLO architectures in particular have proven to offer a good balance between detection/classification accuracy and speed, even on lower-end devices such as mobile phones.
In this article, we will have a look at how YOLO in combination with PyTorch can be used on Palma to train a new YOLO model for object detection on your own images. After the training procedure you can download your model and, for example, run inference on your own device. We will use the modified YOLOv4-CSP architecture, which is the best performing YOLO variant at the time of writing.
Preparation
Install requirements
Before we can start training, we first have to install some requirements for the PyTorch implementation of YOLOv4-CSP. Most of them can be installed with Python's package manager pip. As we want to use a GPU to accelerate the training process, we use Palma's gputitanrtx queue. To install the requirements, it is recommended to queue an interactive job on one of the gputitanrtx hosts, so that any compiling involved is done on the correct hardware architecture.
First thing to do on the GPU node is to load the required modules to our shell session, e.g.
module load palma/2019b
module load fosscuda
module load OpenCV
module load PyTorch
module load torchvision
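Before installing anything further, it can be useful to confirm that the loaded modules actually expose the expected Python packages. The following is a minimal sketch, assuming the PyTorch, torchvision and OpenCV modules provide packages importable as torch, torchvision and cv2; the helper name missing_packages is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the PyTorch, torchvision and OpenCV modules.
    missing = missing_packages(["torch", "torchvision", "cv2"])
    print("missing packages:", ", ".join(missing) if missing else "none")
```

If a package is reported as missing, the corresponding module was probably not loaded in the current shell session.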
Now we can start to install our requirements. You can download this requirements.txt, which should include all necessary packages, upload it to your home directory on Palma, and install it via
pip3 install --user -r requirements.txt
One last required package has to be installed manually: mish-cuda, which provides a CUDA-accelerated implementation of the Mish activation function. To install it you can use
git clone https://github.com/JunnYu/mish-cuda
pip3 install --user ./mish-cuda
Finally, as a last step, we can clone the PyTorch implementation of YOLO itself with
git clone --single-branch --branch yolov4-csp https://github.com/WongKinYiu/ScaledYOLOv4
You should now have a folder called ScaledYOLOv4 in which you can find the necessary Python scripts for training and inference. You can now exit your shell session on the GPU host.
Configure YOLO
The next step is to configure YOLO for your new custom model. First, we need to tell YOLO the names of your classes. Create the file
vim ScaledYOLOv4/data/dataset.names
and put each class label on its own line. You can also have a look at ScaledYOLOv4/data/coco.names for the right format.
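If the class labels already exist as a list, the names file can also be written programmatically. This is only a sketch: the function name write_names_file and the example labels are hypothetical, and it assumes that the position of a label in the file (starting at 0) is the class id used later in the annotation files.

```python
import tempfile
from pathlib import Path


def write_names_file(labels, path):
    """Write one class label per line and return the cleaned labels."""
    cleaned = [label.strip() for label in labels]
    if any(not label for label in cleaned):
        raise ValueError("class labels must not be empty")
    if len(set(cleaned)) != len(cleaned):
        raise ValueError("class labels must be unique")
    Path(path).write_text("\n".join(cleaned) + "\n", encoding="utf-8")
    return cleaned


if __name__ == "__main__":
    # Hypothetical labels, written to a temporary directory for demonstration.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "dataset.names"
        write_names_file(["cat", "dog", "bird"], target)
        print(target.read_text(encoding="utf-8"), end="")
```

For real use, the target path would be ScaledYOLOv4/data/dataset.names.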
In a similar way, you will also have to create a new configuration file with the path ScaledYOLOv4/data/dataset.yaml and the following contents:
dataset.yaml
train: data/train.txt
val: data/test.txt
nc: <NUMBER OF CLASS LABELS>
names: [<COMMA-SEPARATED LIST OF CLASS LABELS IN THE SAME ORDER AS IN dataset.names>]
Again, you can have a look at the file ScaledYOLOv4/data/coco.yaml as an example.
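Because the class count and the label order in dataset.yaml have to match dataset.names, it may be less error-prone to derive one file from the other. The sketch below builds the YAML text from a names file; build_dataset_yaml is a hypothetical helper, and the default list paths are simply the ones from the example above.

```python
import tempfile
from pathlib import Path


def build_dataset_yaml(names_path, train_list="data/train.txt", val_list="data/test.txt"):
    """Return dataset.yaml contents with nc and names taken from a names file."""
    lines = Path(names_path).read_text(encoding="utf-8").splitlines()
    labels = [line.strip() for line in lines if line.strip()]
    # YAML single-quoted strings escape a quote character by doubling it.
    quoted = ", ".join("'" + label.replace("'", "''") + "'" for label in labels)
    return (
        f"train: {train_list}\n"
        f"val: {val_list}\n"
        f"nc: {len(labels)}\n"
        f"names: [{quoted}]\n"
    )


if __name__ == "__main__":
    # Hypothetical labels in a temporary names file, for demonstration only.
    with tempfile.TemporaryDirectory() as tmp:
        names = Path(tmp) / "dataset.names"
        names.write_text("cat\ndog\n", encoding="utf-8")
        print(build_dataset_yaml(names), end="")
```

The returned text can then be written to ScaledYOLOv4/data/dataset.yaml.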
As a last step, the exact model structure must be adapted to your dataset. Pay close attention to the following steps. They are easily mixed up which can result in significant problems when training the model. First, copy the default configuration for COCO:
cp ScaledYOLOv4/models/yolov4-csp.cfg ScaledYOLOv4/models/yolov4-csp-custom.cfg
Then open the new file in the editor of your choice (e.g. vim) and make the following changes for a good first performance:
- change the line batch to batch=64
- change the line subdivisions to subdivisions=16
- change the line max_batches to classes*2000 (but not less than the number of training images and not less than 6000), e.g. max_batches=6000 if you train for 3 classes
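The max_batches rule above can be written down as a one-line calculation. This is a small sketch of that rule only; the function name suggested_max_batches is hypothetical.

```python
def suggested_max_batches(num_classes, num_train_images):
    """classes*2000, but not less than the number of training images or 6000."""
    if num_classes < 1 or num_train_images < 0:
        raise ValueError("expected at least one class and a non-negative image count")
    return max(num_classes * 2000, num_train_images, 6000)


if __name__ == "__main__":
    # Example from the text: 3 classes (with a hypothetical 1500 training images).
    print(suggested_max_batches(3, 1500))
```

With 3 classes the lower bound of 6000 applies, which matches the example value in the list.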
Upload your training data
Now that we have set up the software environment and configured YOLO, the next step is to upload our training data to Palma to make it available for the training procedure. The format for data annotations/labels used in YOLO consists of the image file, e.g. image001.(png|jpeg|...), and a file with the name image001.txt that holds the bounding-box coordinates and labels for this particular image. For software you can use to create these labels, you can have a look here.
Each line in the .txt-file corresponds to exactly one bounding-box and consists of the following 5 values, separated by a simple space:
- id of the class label
- x-coordinate of the center of the bounding-box relative to the image width
- y-coordinate of the center of the bounding-box relative to the image height
- width of the bounding-box relative to the image width
- height of the bounding-box relative to the image height
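Many annotation tools store boxes as pixel corner coordinates, so a conversion into the five relative values listed above is often needed. The following sketch assumes corner coordinates (xmin, ymin, xmax, ymax) in pixels with the origin in the top-left corner; the function name and the six-decimal formatting are illustrative choices.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel corner box into one line of a YOLO label file."""
    if not (0 <= xmin < xmax <= img_w and 0 <= ymin < ymax <= img_h):
        raise ValueError("bounding box must lie inside the image")
    x_center = (xmin + xmax) / 2 / img_w   # relative to image width
    y_center = (ymin + ymax) / 2 / img_h   # relative to image height
    width = (xmax - xmin) / img_w          # relative to image width
    height = (ymax - ymin) / img_h         # relative to image height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # Hypothetical box in a 640x480 image.
    print(to_yolo_line(0, 100, 200, 300, 400, 640, 480))
    # prints: 0 0.312500 0.625000 0.312500 0.416667
```

All four geometric values end up between 0 and 1, independent of the image resolution.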
After you have created your dataset, split it into a training set and a test set, which you can then upload (for example to /scratch/tmp/<YOUR_USERID>/train and /scratch/tmp/<YOUR_USERID>/test). To make YOLO find your dataset, you now just have to create the files train.txt and test.txt in the ScaledYOLOv4/data/ directory. Each of them contains a single path to an image file per line. This can easily be done with some shell magic (adjust the pattern if your images are not PNG files):
find /scratch/tmp/<YOUR_USERID>/train -name "*.png" -print > ScaledYOLOv4/data/train.txt
find /scratch/tmp/<YOUR_USERID>/test -name "*.png" -print > ScaledYOLOv4/data/test.txt
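The find commands above only pick up PNG files and do not check whether an annotation exists. As an alternative, the sketch below collects images with several common suffixes and keeps only those that have a label file. It assumes that each label file sits next to its image with the same base name; the function names and the suffix list are hypothetical.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed set of image formats


def list_labelled_images(image_dir):
    """Return sorted image paths that have a same-named .txt file next to them."""
    return [
        str(path)
        for path in sorted(Path(image_dir).rglob("*"))
        if path.suffix.lower() in IMAGE_SUFFIXES and path.with_suffix(".txt").is_file()
    ]


def write_image_list(image_dir, list_path):
    """Write one image path per line to list_path and return the number of images."""
    images = list_labelled_images(image_dir)
    Path(list_path).write_text("".join(p + "\n" for p in images), encoding="utf-8")
    return len(images)


if __name__ == "__main__":
    # Demonstration on a temporary directory with one labelled and one unlabelled image.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("image001.png", "image001.txt", "image002.jpg"):
            (Path(tmp) / name).touch()
        count = write_image_list(tmp, Path(tmp) / "train.txt")
        print(count, "labelled image(s) listed")
```

For the setup described here, the calls would use the upload directories as image_dir and ScaledYOLOv4/data/train.txt or ScaledYOLOv4/data/test.txt as list_path.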
Congratulations, you can now start the training!
Start the training
Now that you have prepared all your data, you can start YOLO's training procedure like any other job on Palma by using YOLO's train script. You can provide various additional arguments to the script, some of which are required:
python train.py --help
--weights        initial weights path
--cfg            model.yaml path
--data           data.yaml path
--hyp            hyperparameters path
--epochs
--batch-size     total batch size for all GPUs
--img-size       [train, test] image sizes
--rect           rectangular training
--resume         resume most recent training
--nosave         only save final checkpoint
--notest         only test final epoch
--noautoanchor   disable autoanchor check
--evolve         evolve hyperparameters
--bucket         gsutil bucket
--cache-images   cache images for faster training
--image-weights  use weighted image selection for training
--device         cuda device, i.e. 0 or 0,1,2,3 or cpu
--multi-scale    vary img-size +/- 50%
--single-cls     train as single-class dataset
--adam           use torch.optim.Adam() optimizer
--sync-bn        use SyncBatchNorm, only available in DDP mode
--local_rank     DDP parameter, do not modify
--log-imgs       number of images for W&B logging, max 100
--workers        maximum number of dataloader workers
--project        save to project/name
--name           save to project/name
--exist-ok       existing project/name ok, do not increment
train_submit.sh
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --tasks-per-node=12
#SBATCH --partition=gputitanrtx
#SBATCH --mem=12G
#SBATCH --gres=gpu:1
#SBATCH --time=48:00:00
#SBATCH --export=NONE
#SBATCH --job-name=<YOUR_DATASET_TRAINING>
#SBATCH --output=output.dat
#SBATCH --mail-type=BEGIN,FAIL,END
#SBATCH --mail-user=<YOUR_MAIL_ADDRESS>
# load modules with available GPU support
module purge
module load palma/2019b
module load fosscuda
module load OpenCV
module load PyTorch
module load torchvision
cd <YOUR_HOME_DIR>/ScaledYOLOv4
python train.py --epochs 300 --device 0 --img 640 640 --batch-size 16 --data data/dataset.yaml --cfg models/yolov4-csp-custom.cfg --weights '' --name train_0
https://pjreddie.com/darknet/yolo/
Publications
- YOLO
- YOLO9000
- YOLOv3
- YOLOv4
- YOLOv4-CSP
Articles
- Medium: YOLOv4 - the most accurate real-time neural network on MS COCO dataset
- Medium: Scaled YOLO v4 is the best neural network for object detection on MS COCO dataset
YOLOv4 Implementations (Python)