Machine-learning-based top-view safety monitoring of ground workforce on complex industrial sites


Gelayol Golcarenarenji, Ignacio Martinez-Alpiste, Qi Wang, Jose M. Alcaraz-Calero

Journal: Neural Computing and Applications
Impact Factor: 5.606
Number of pages: 14
Early online date: 22 Oct 2021
Original language: English


Telescopic cranes are powerful lifting facilities employed in construction, transportation, manufacturing and other industries. Because the ground workforce cannot be fully aware of their surroundings during current crane operations on busy and complex sites, accidents and even fatalities cannot be avoided. Hence, deploying an automatic and accurate top-view human detection solution would significantly improve the health and safety of the workforce on such industrial operational sites. The proposed method (CraneNet) is a new machine-learning-based solution that increases the visibility available to a crane operator in complex industrial operational environments. It addresses the challenges of top-view human detection on a resource-constrained small-form-factor PC, chosen to meet the space constraint in the operator's cabin. CraneNet consists of 4 modified ResBlock-D modules to fulfill the real-time requirements. To increase the detection accuracy for humans who appear small when viewed from high altitudes, which is crucial for this use case, a PAN (Path Aggregation Network) was designed and added to the architecture. This enhances the structure of CraneNet by adding a bottom-up path that propagates low-level information. Furthermore, three output layers were employed in CraneNet to further improve the accuracy on small objects. Spatial Pyramid Pooling (SPP) was integrated at the end of the backbone stage, which increases the receptive field of the backbone and thereby the accuracy. CraneNet achieved 92.59% accuracy at 19 FPS on a portable device. The proposed machine learning model was also trained with the Stanford Drone Dataset and VisDrone 2019 to further show the efficacy of the smart crane approach. Consequently, the proposed system is able to detect people in complex industrial operational areas at distances of up to 50 meters between the camera and the person. The system is also applicable to the detection of other objects from an overhead camera.
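The Spatial Pyramid Pooling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give CraneNet's SPP configuration, so the kernel sizes (5, 9, 13), the stride-1 "same" pooling, and the names `max_pool_same` and `spp` are assumptions borrowed from the SPP variant commonly used in one-stage detectors. Plain Python lists stand in for tensors so the example runs without a deep learning framework.

```python
from typing import List, Sequence

Grid = List[List[float]]  # one feature-map channel as rows of values


def max_pool_same(channel: Grid, k: int) -> Grid:
    """Stride-1 max pooling whose output has the same size as the input.

    Windows are clipped at the borders, which is equivalent to padding
    the channel with negative infinity.
    """
    h, w = len(channel), len(channel[0])
    r = k // 2  # window radius for an odd kernel size k
    pooled = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                channel[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


def spp(feature_map: List[Grid], kernel_sizes: Sequence[int] = (5, 9, 13)) -> List[Grid]:
    """Concatenate the input channels with one pooled copy per kernel size.

    The kernel sizes are an assumption, not values taken from the paper.
    """
    out = list(feature_map)  # identity branch keeps the original features
    for k in kernel_sizes:
        out.extend(max_pool_same(ch, k) for ch in feature_map)
    return out


# A single 4x4 channel with one strong activation in the bottom-right corner.
channel = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 9.0],
]
fused = spp([channel])
```

In this toy input the top-left cell cannot see the corner activation through the 5x5 window but can through the 9x9 window, which is the sense in which larger pooling kernels widen the receptive field. The channel count grows from 1 to 4 because each kernel size contributes one pooled copy alongside the unchanged input.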

DOI: 10.1007/s00521-021-06489-3