Real-time Object Detection on CPU

Praneet Bomma
Towards Data Science
5 min read · Jun 13, 2021


Real-time object detection on a CPU using OpenVINO. Sounds magical, right? Well, let's actually do it and see for ourselves.

The first time someone asked me if it was possible to do real-time object detection on a CPU, I shrugged off the question, thinking, "You must be kidding."

Well, it is possible! I had to try it to believe it. In this post, we will implement object detection on a CPU at a whopping 40 FPS.

I would suggest reading a bit about the key terms used in this tutorial (OpenVINO, ONNX, the IR format, YOLO and darknet) before continuing. You can still follow along without understanding them by blindly copying the commands, but what's the fun in that?

This blog assumes you have a rough idea of what those terms mean, along with a general idea of deep learning and object detection. We will be using yolov3-tiny trained with darknet for object detection. Let's dive in!

We will need the pre-trained yolov3-tiny weights, which can be downloaded from here, and its configuration file, which can be copied from here.

The next step is to convert the pre-trained weights to ONNX, and then convert the ONNX model to OpenVINO's IR format: the xml, bin and mapping files that OpenVINO needs for inference. We will do this in 4 sequential steps.

  1. Converting YOLO weights to ONNX
  2. Finding names of the output layers using Netron
  3. Converting ONNX to IR Format
  4. Run inference using OpenVINO

Step 1: Converting YOLO weights to ONNX

I will be using jkjung-avt’s tensorrt_demos repository to convert yolo weights to onnx. This is my personal preference as I have used it for TensorRT conversions before. There are many other repositories available to do the same. We will be following the steps mentioned in the Readme to convert the yolov3-tiny weights to onnx. Let’s do this!

$ git clone https://github.com/jkjung-avt/tensorrt_demos.git
$ pip3 install onnx==1.4.1
$ cd tensorrt_demos/yolo/

Copy the weights file and cfg file to this directory, i.e. the yolo directory inside the tensorrt_demos repository, wherever you cloned it, and then:

  1. Rename the weights file to yolov3-tiny-416.weights
  2. Rename the cfg file to yolov3-tiny-416.cfg
  3. Create a new file named labels.txt with the class names given here.

These steps are necessary because the conversion script in the repository uses the file names to pick up the specific model and input size.

Note: We are using 416x416 as the input dimensions. You can modify this if your model uses custom dimensions.
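For reference, labels.txt simply contains one class name per line. Assuming you are using the standard COCO-trained yolov3-tiny weights, it has 80 lines and starts like this:

person
bicycle
car
motorbike
aeroplane

and so on for the rest of the 80 COCO classes.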

Once the files are moved and renamed, let's run the script to do the actual conversion.

$ python3 yolo_to_onnx.py -m yolov3-tiny-416

This will create an output file named yolov3-tiny-416.onnx. Next, we need to find the names of the model's output layers, which are required for the conversion to IR format. To find them, we will use Netron, which helps visualize the model graph/architecture.
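Before moving on, if you want a quick sanity check that the exported file is well-formed, a minimal snippet like the one below (using the onnx package installed above) will validate it:

import onnx

# Load the exported model and run ONNX's built-in structural checks
model = onnx.load("yolov3-tiny-416.onnx")
onnx.checker.check_model(model)
print("ONNX model is well-formed")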

Step 2: Finding names of the output layers using Netron

$ pip3 install netron
$ netron
Serving at http://localhost:8080

Open the link and upload your onnx file. It will show the model architecture as shown below.

Scroll down the architecture to the output node and click on it. A detail box will appear on the right side with the output node names, as shown below.

Copy the output node names. We need them to convert the onnx file to IR format. The node names are 016_convolutional and 023_convolutional.
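If you prefer not to use a GUI, you can also read the same output names programmatically from the ONNX graph; here is a small sketch, again using the onnx package:

import onnx

model = onnx.load("yolov3-tiny-416.onnx")
# The graph's declared outputs correspond to the node names shown by Netron
for output in model.graph.output:
    print(output.name)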

Step 3: Converting ONNX to IR Format

We need to set up OpenVINO and its dependencies before we can use it. Install OpenVINO 2021.1.110 using this link.

Note: Install the development version and NOT the runtime version, as the runtime version does not include the model conversion script.

Note: I recommend installing these packages in a virtual environment, as it makes the dependencies easier to manage.

You might already have these packages installed, but if you haven't, install them using pip as shown below.

$ cd /opt/intel/openvino_2021/python
$ source /opt/intel/openvino_2021/bin/setupvars.sh
$ pip3 install networkx defusedxml test-generator==0.1.1
$ cd /opt/intel/openvino_2021.1.110/deployment_tools/model_optimizer/install_prerequisites/
$ ./install_prerequisites_onnx.sh

Finally, once the setup and installation are done, let's run the script to convert the ONNX file to IR format. Replace the model and output paths with your local paths.

$ python3 /opt/intel/openvino_2021.1.110/deployment_tools/model_optimizer/mo.py --input_model /path_to_model_dir/yolov3-tiny-416.onnx -s 255 --reverse_input_channels --output 016_convolutional,023_convolutional --output_dir /path_to_output_dir/

This will output three files: a .bin file, an .xml file and a .mapping file. Together they form the intermediate representation (IR) that OpenVINO uses for inference. Here, -s 255 scales the input pixel values by 1/255, --reverse_input_channels swaps the input channel order between RGB and BGR, and --output specifies the two output nodes we found in Netron.
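As a quick check that the IR is usable, here is a minimal sketch using OpenVINO 2021's Python API (IECore) that loads the network on the CPU and prints its input and output shapes; adjust the file paths to wherever your IR files are:

from openvino.inference_engine import IECore

ie = IECore()
# Read the IR produced by the Model Optimizer and load it on the CPU
net = ie.read_network(model="yolov3-tiny-416.xml", weights="yolov3-tiny-416.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

# Print input and output tensor shapes
for name, info in net.input_info.items():
    print("input:", name, info.input_data.shape)
for name, data in net.outputs.items():
    print("output:", name, data.shape)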

Step 4: Run inference using OpenVINO

With all the conversions finally completed, let's run a demo on our webcam or a video to check out the inference speed.

$ git clone https://github.com/Chen-MingChang/pytorch_YOLO_OpenVINO_demo.git
# activate the environment you used to install the packages above before running the commands below
$ cd /opt/intel/openvino_2021/python
$ source /opt/intel/openvino_2021/bin/setupvars.sh
$ cd /path_to_cloned_repository/pytorch_YOLO_OpenVINO_demo
$ python3 yolo_openvino_demo.py -m /path_to_model_directory/yolov3-tiny-416.xml -i 'cam' -at yolov3 --labels /path_to_labels/labels.txt

You will see the output video with faster-than-real-time inference.

If you don’t have a webcam, try it on a video with the command below.

$ python3 yolo_openvino_demo.py -m /path_to_model_directory/yolov3-tiny-416.xml -i /path_to_video_dir/video.mp4 -at yolov3 --labels /path_to_labels/labels.txt

We have successfully converted yolov3-tiny to run real-time inference on a CPU. In fact, at 40 FPS it is almost twice as fast as real time. Please comment below if I need to elaborate on any points or if any corrections are needed.
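If you want to measure the raw inference speed yourself, independent of the demo script, here is a rough sketch that times repeated inferences on a dummy 416x416 input using the same IECore API (the exact FPS will of course vary with your CPU, and this ignores pre- and post-processing):

import time
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="yolov3-tiny-416.xml", weights="yolov3-tiny-416.bin")
exec_net = ie.load_network(network=net, device_name="CPU")

input_name = next(iter(net.input_info))
dummy = np.zeros((1, 3, 416, 416), dtype=np.float32)  # NCHW dummy frame

exec_net.infer({input_name: dummy})  # warm-up run
start = time.time()
for _ in range(100):
    exec_net.infer({input_name: dummy})
print("approx. FPS:", 100 / (time.time() - start))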
