Comparison of Object Detection Models using Convolutional Neural Networks in Aerial Image from Unmanned Aerial Vehicles


Kittakorn Viriyasatr
Warakorn Luangluewut
Piyarose Maleecharoen
Siraphob Santironnarong
Wichai Pawgasame
Pantape Kaewmongkol
Sanya Mitaim
Phunsak Thiennviboon


This research article studies and compares models for object detection in aerial imagery captured by an unmanned aerial vehicle (UAV). Two classes of objects are detected: buildings and vehicles. Machine learning models are used for object detection, and several models are compared to identify their advantages and disadvantages: Faster R-CNN, MobileNetV1, RetinaNet50, YOLOv4, YOLOv4-tiny, YOLOv7, and EfficientDet. The experiments found that YOLOv7 achieved the highest detection accuracy of 58.5%, outperforming MobileNetV1, YOLOv4, Faster R-CNN, YOLOv4-tiny, EfficientDet, and RetinaNet50, which achieved accuracies of 49.5%, 45.1%, 21.2%, 17.6%, 14.5%, and 1.2%, respectively. The fastest model was MobileNetV1, at 196.01 frames per second. These levels of accuracy and speed are sufficient for object detection tasks in aerial images from unmanned aerial vehicles.
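The reported comparison can be tabulated and ranked with a short script. The figures below are taken directly from the abstract; the abstract does not specify whether "accuracy" means mAP, and the variable names here are illustrative, not from the article:

```python
# Detection accuracies (%) as reported in the abstract; the metric is
# labeled simply "accuracy" there, so no mAP interpretation is assumed.
accuracies = {
    "YOLOv7": 58.5,
    "MobileNetV1": 49.5,
    "YOLOv4": 45.1,
    "Faster R-CNN": 21.2,
    "YOLOv4-tiny": 17.6,
    "EfficientDet": 14.5,
    "RetinaNet50": 1.2,
}

# Rank models from most to least accurate.
ranking = sorted(accuracies.items(), key=lambda kv: kv[1], reverse=True)
best_model, best_score = ranking[0]
print(f"Most accurate: {best_model} ({best_score}%)")
```

Note that accuracy and speed trade off here: YOLOv7 leads on accuracy, while MobileNetV1 (196.01 FPS) leads on throughput, so the right choice depends on the deployment constraint.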



Article Details

How to Cite
K. Viriyasatr, “Comparison of Object Detection Models using Convolutional Neural Networks in Aerial Image from Unmanned Aerial Vehicles”, Def. Technol. Acad. J., vol. 6, no. 13, pp. 90–107, May 2024.


Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based Learning Applied to Document Recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.

D. G. Lowe, “Object Recognition from Local Scale-invariant Features,” in Proc. 7th IEEE Int. Conf. Comput. Vision (ICCV’99), Kerkyra, Greece, 1999, pp. 1150–1157.

H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, “Speeded-Up Robust Features (SURF),” Comput. Vis. Image Underst., vol. 110, no. 3, pp. 346–359, 2008.

N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” in 2005 IEEE Comput. Soc. Conf. Comput. Vision Pattern Recognit. (CVPR’05), San Diego, CA, USA, 2005, pp. 886–893.

W. Pei et al., “Mapping and Detection of Land Use Change in a Coal Mining Area Using Object-based Image Analysis,” Environ. Earth Sci., vol. 76, pp. 1–16, 2017.

Y. Liu, S. U. Din, and Y. Jiang, “Urban Growth Sustainability of Islamabad, Pakistan, Over the Last 3 Decades: A Perspective based on Object-based Backdating Change Detection,” GeoJournal, vol. 86, pp. 2035–2055, 2020.

M. Choinski, M. Rogowski, P. Tynecki, D. P. J. Kuijper, M. Churski, and J. W. Bubnicki, “A First Step Towards Automated Species Recognition from Camera Trap Images of Mammals Using AI in a European Temperate Forest,” in Int. Conf. Comput. Inf. Syst. Ind. Manage. (CISIM 2021), Ełk, Poland, 2021, pp. 299–310.

W. Dai, H. Wang, Y. Song, and Y. Xin, “Wildlife Small Object Detection based on Enhanced Network in Ecological Surveillance,” in 2021 33rd Chin. Control Decis. Conf. (CCDC), Kunming, China, 2021, pp. 1164–1169.

L. Dutrieux et al., “Tree Species Detection and Identification from UAV Imagery to Support Tropical Forest Monitoring,” in EGU General Assem. Conf. (EGU 2020), 2020, p. 17759.

W. Lim, K. Choi, W. Cho, B. Chang, and D. W. Ko, “Efficient Dead Pine Tree Detecting Method in the Forest Damaged by Pine Wood Nematode (Bursaphelenchus xylophilus) Through Utilizing Unmanned Aerial Vehicles and Deep Learning-based Object Detection Techniques,” Forest Sci. Technol., vol. 18, no. 1, pp. 36–43, 2022.

G. D. Georgiev, G. Hristov, P. Zahariev, and D. Kinaneva, “Forest Monitoring System for Early Fire Detection Based on Convolutional Neural Network and UAV Imagery,” in 2020 28th Nat. Conf. Int. Participation (TELECOM 2020), Sofia, Bulgaria, 2020, pp. 57–60.

L. Shumilo, M. Lavreniuk, N. Kussul, and B. Shevchuk, “Automatic Deforestation Detection based on the Deep Learning in Ukraine,” in 2021 11th IEEE Int. Conf. Intell. Data Acquisition Adv. Comput. Syst.: Technol. Appl. (IDAACS 2021), Cracow, Poland, 2021, pp. 337–342.

K.-C. Chang, S.-H. Lin, J.-W. Huang, and Y.-F. Wu, “Automatic Incremental Training of Object Detection by Using GAN for River Level Monitoring,” in 2021 IEEE Int. Conf. Consum. Electron. - Taiwan (ICCE-TW), Penghu, Taiwan, 2021, pp. 1–2.

R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,” in 2014 IEEE Conf. Comput. Vision Pattern Recognit. (CVPR), Columbus, OH, USA, 2014, pp. 580–587.

R. Girshick, “Fast R-CNN,” in 2015 IEEE Int. Conf. Comput. Vision (ICCV 2015), Santiago, Chile, 2015, pp. 1440–1448.

S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 39, no. 6, pp. 1137–1149, 2017.

W. Liu et al., “SSD: Single Shot MultiBox Detector,” in 14th Eur. Conf. Comput. Vision (ECCV 2016), Amsterdam, Netherlands, 2016, pp. 21–37.

T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature Pyramid Networks for Object Detection,” in 2017 IEEE Conf. Comput. Vision Pattern Recognit. (CVPR 2017), Honolulu, HI, USA, 2017, pp. 936–944.

T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollár, “Focal Loss for Dense Object Detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 2, pp. 318–327, 2020.

J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You Only Look Once: Unified, Real-Time Object Detection,” in 2016 IEEE Conf. Comput. Vision Pattern Recognit. (CVPR), Las Vegas, NV, USA, 2016, pp. 779–788.

A. Bochkovskiy, C.-Y. Wang, and H.-Y. M. Liao, “YOLOv4: Optimal Speed and Accuracy of Object Detection,” 2020, arXiv:2004.10934.

C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors,” in 2023 IEEE/CVF Conf. Comput. Vision Pattern Recognit. (CVPR), Vancouver, BC, Canada, 2023, pp. 7464–7475.

M. Tan, R. Pang, and Q. V. Le, “EfficientDet: Scalable and Efficient Object Detection,” in 2020 IEEE/CVF Conf. Comput. Vision Pattern Recognit. (CVPR), Seattle, WA, USA, 2020, pp. 10778–10787.

M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” 2019, arXiv:1905.11946.

W. Luangluewut, K. Viriyasatr, W. Pawgasame, P. Kaewmongkol, and S. Mitaim, “Detecting Objects in Aerial Photographs Using Neural Network Techniques”, Def. Technol. Acad. J., vol. 5, no. 12, pp. 4–11, Nov. 2023.

K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in 2016 IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, 2016, pp. 770–778.

A. G. Howard et al., “MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications,” 2017, arXiv:1704.04861.
