The objective of this project is to improve safety in work environments, e.g., building sites, laboratories, and other places that require protective equipment. Its output can be used to verify that every worker or visitor on a building site, for example, is following safety regulations.
The original LogicTronix project can be found on Hackster.io here.
A machine learning application on the Xilinx Kria KV260! This application targets construction sites where a set of PPE (helmet, vest, shoes, gloves) has to be worn while working.
The application uses an optimized neural network running on the Xilinx DPU IP core for inference, and achieves 20-30 FPS on the Kria KV260.
Hardware components:
AMD Kria KV260 Vision AI Starter Kit
Webcam, Logitech® HD Pro
Software apps and online services:
AMD Vitis Unified Software Platform
TensorFlow
OpenCV
1. Pretrained custom YOLOv3 model
Due to limited time and resources, we chose a publicly available model with pre-trained weights for live PPE detection: a custom YOLOv3 developed by Nipun D. Nath et al. for real-time PPE detection. In their article, they present three different detection approaches, each with its advantages and disadvantages.
In this project, we used the second approach, which consists of training the YOLO model to detect each person in a frame and classify them into one of the following four classes: “Worker”, “Worker + Helmet”, “Worker + Vest” and “Worker + Helmet + Vest”. Note that the number of classes can be increased as more equipment is added; a small index-to-label mapping like the one below is enough to turn the network's class predictions into readable labels.
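The mapping sketched here is for illustration only; the index order is an assumption and must match the class list the model was trained with.

# Class index -> label for the approach-2 model (4 classes).
# The index order is an assumption; it must match the training class list.
PPE_CLASSES = {
    0: 'Worker',
    1: 'Worker + Helmet',
    2: 'Worker + Vest',
    3: 'Worker + Helmet + Vest',
}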
Pretrained weights can be found on Google Drive. To use this model on the Kria KV260, we first had to quantize and compile it with the Vitis-AI toolchain.
We used Vitis-AI 2.0, which was the latest version at the time of writing. The Docker image can be pulled with:
docker pull xilinx/vitis-ai-cpu:latest
2. Model Quantization
After installing Docker, we copied the model file into the container and ran the following script for quantization. The quantization process requires not only the model but also a calibration dataset; for consistency, we used images from the original training dataset, also available here.
import os
import numpy as np
import cv2
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize


def letterbox_image(image, size):
    '''
    Resize an image with unchanged aspect ratio, padding with grey (128)
    to fit the YOLOv3 input specification.
    '''
    ih, iw, ic = image.shape
    h, w = size
    scale = min(w / iw, h / ih)
    nw = int(iw * scale)
    nh = int(ih * scale)
    new_image = np.zeros((h, w, ic), dtype='uint8') + 128
    top, left = (h - nh) // 2, (w - nw) // 2
    new_image[top:top + nh, left:left + nw, :] = cv2.resize(image, (nw, nh))
    return new_image


if __name__ == '__main__':
    # Load the floating-point model and create the Vitis quantizer
    float_model = tf.keras.models.load_model('/workspace/pictor_ppe_a2.h5')
    float_model.get_config()
    quantizer = vitis_quantize.VitisQuantizer(float_model)

    # Get calibration images
    filenames = []
    path = './Images/'
    print('Reading files ...')
    for fname in os.listdir(path):
        f = os.path.join(path, fname)
        if os.path.isfile(f):
            filenames.append(f)

    # Reshape images with the letterbox function. Only the first 100 of the
    # 784 available images are used for calibration.
    print('Reshaping images ...')
    image_data = []
    for fname in filenames[:100]:
        act_img = cv2.imread(fname)
        image_data.append(letterbox_image(act_img, (416, 416)) / 255.)
    image_data = np.array(image_data)

    print('Starting Quantization ...')
    quantized_model = quantizer.quantize_model(calib_dataset=image_data, calib_batch_size=10)
    quantized_model.save('quantized_model.h5')  # Save the new model
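As a quick sanity check (not part of the original flow), the quantized model can be reloaded and run on a single image before compiling it. In the Vitis-AI TensorFlow 2 quantizer, vitis_quantize.quantize_scope() registers the custom quantization layers so Keras can deserialize the saved model. A minimal sketch, where 'sample.jpg' is a placeholder for any calibration image:

import numpy as np
import cv2
import tensorflow as tf
from tensorflow_model_optimization.quantization.keras import vitis_quantize

# Reload the quantized model inside quantize_scope() so the custom
# quantization layers can be deserialized.
with vitis_quantize.quantize_scope():
    model = tf.keras.models.load_model('quantized_model.h5')

# Run one image through the model (plain resize here; the letterbox_image
# helper from the script above could be used for proper preprocessing).
img = cv2.resize(cv2.imread('sample.jpg'), (416, 416)) / 255.
preds = model.predict(np.expand_dims(img, axis=0))
print([p.shape for p in preds])  # shapes of the three YOLOv3 output tensors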
3. Model Compilation
The quantized model is saved in the workspace, and the next step was to compile it for the DPU. The compilation was done by running the following command, where arch.json describes the DPU configuration of the target board:
vai_c_tensorflow2 -m quantized_model.h5 -a arch.json -n yolov3_ppe
When the compilation finished, the output was written to “yolov3_ppe.xmodel”.
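Before deploying, it can be useful to check how the compiler partitioned the network (this step is not part of the original write-up). The xir Python package available in the Vitis-AI environment can list the subgraphs of the compiled xmodel; a single DPU subgraph means the whole network runs on the accelerator, while additional CPU subgraphs would fall back to the ARM cores:

import xir

# Print each subgraph of the compiled model with the device it was assigned to.
graph = xir.Graph.deserialize('yolov3_ppe.xmodel')
for sg in graph.get_root_subgraph().toposort_child_subgraph():
    device = sg.get_attr('device') if sg.has_attr('device') else 'unknown'
    print(sg.get_name(), device)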
4. Kria Configuration
The compiled model is then copied to the Kria KV260 over SSH. It is stored under “/usr/share/vitis_ai_library/models/yolov3_ppe/” along with a new file “yolov3_custom.prototxt” (shown below) that describes the model and the values of its anchors.
model {
  name: "yolov3_416x416_ppe"
  kernel {
    name: "yolov3_416x416_ppe"
    mean: 0.0
    mean: 0.0
    mean: 0.0
    scale: 0.00390625
    scale: 0.00390625
    scale: 0.00390625
  }
  model_type : YOLOv3
  yolo_v3_param {
    num_classes: 4
    anchorCnt: 3
    layer_name: "58"
    layer_name: "66"
    layer_name: "74"
    conf_threshold: 0.5
    nms_threshold: 0.45
    biases: 6
    biases: 11
    biases: 11
    biases: 23
    biases: 19
    biases: 36
    biases: 32
    biases: 50
    biases: 40
    biases: 104
    biases: 76
    biases: 73
    biases: 73
    biases: 158
    biases: 128
    biases: 209
    biases: 224
    biases: 246
    test_mAP: false
  }
  is_tf : true
}
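The biases entries are the nine YOLOv3 anchor boxes, written as a flat list of (width, height) values, with three anchors per output layer (anchorCnt: 3, one group per layer_name). A small sketch, for illustration only, of how they group:

# The flat biases list from the prototxt, regrouped into 9 (width, height)
# anchor pairs; YOLOv3 uses 3 anchors per detection scale.
biases = [6, 11, 11, 23, 19, 36, 32, 50, 40, 104,
          76, 73, 73, 158, 128, 209, 224, 246]
anchors = list(zip(biases[0::2], biases[1::2]))
print(anchors)  # 9 pairs, 3 per output layer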
5. Results
To run tests, we used a modified version of the samples/yolov3 application available in the Vitis-AI Library (files available on GitHub).
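That sample is a C++ application; for illustration only, the Python sketch below shows the equivalent loop with OpenCV: grab webcam frames, run them through the detector, and overlay the returned boxes and class labels. Here run_ppe_detector is a hypothetical placeholder for the actual DPU inference call, assumed to return (class_id, confidence, x, y, w, h) tuples in pixel coordinates:

import cv2

# Hypothetical placeholder for the DPU inference call made by the sample;
# assumed to return (class_id, confidence, x, y, w, h) tuples in pixels.
def run_ppe_detector(frame):
    raise NotImplementedError('replace with the actual yolov3_ppe runner')

# Index order follows the assumption made in the class mapping above.
LABELS = ['Worker', 'Worker + Helmet', 'Worker + Vest', 'Worker + Helmet + Vest']

cap = cv2.VideoCapture(0)  # the Logitech HD Pro webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for class_id, score, x, y, w, h in run_ppe_detector(frame):
        # Green only when the worker wears both helmet and vest.
        color = (0, 255, 0) if class_id == 3 else (0, 0, 255)
        cv2.rectangle(frame, (x, y), (x + w, y + h), color, 2)
        cv2.putText(frame, f'{LABELS[class_id]} {score:.2f}', (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)
    cv2.imshow('PPE detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()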
Points of Improvement
This project can be expanded to include other protective equipment such as face masks, gloves, headphones, lab coats, and more. Many researchers have worked on improving the accuracy of real-time PPE detection models, and many interesting approaches have come out of that work; these can be used to further test the limits of the Kria KV260.
Using a pre-trained model was especially helpful, as it let us focus on the hardware and worry less about the deep learning workflow (i.e., data collection, data preparation, training, and so on). It also meant we could get started with inexpensive hardware and still obtain remarkable results.
Sundance is excited to have added IP Core products from LogicTronix to our store, with more of their products to be added soon.
LogicTronix is an FPGA Design and Machine Learning Company. They are an AMD Certified Partner and Design Service Partner for AMD Kria SoM FPGA for Artificial Intelligence and Machine Learning.
They provide 25+ IP cores for Computer Vision, Cryptographic Hashing, HFT and AI/ML. LogicTronix provides machine-learning-accelerated solutions and design services at the edge and in the cloud with FPGA/GPU.
These solutions and services target Surveillance, Security, Video Analytics, Automotive and Finance.