基于改进Tiny-YOLO模型的群养生猪脸部姿态检测
Detection of facial gestures of group pigs based on improved Tiny-YOLO

Authors: Yan Hongwen, Liu Zhenyu, Cui Qingliang, Hu Zhiwei, Li Yanwen

Funding: National High Technology Research and Development Program of China (863 Program) (2013AA102306); General Program of the National Natural Science Foundation of China (31772651); Key Research and Development Program (Agriculture) of Shanxi Province (201803D221028-7)
    摘要 (Abstract):

    A pig's face carries rich biometric information, and detecting its facial postures can provide a basis for individual identification and behavior analysis of pigs. In group-housing scenes, however, complex factors such as pen lighting and adhesion between pigs pose great challenges to facial posture detection. Taking group-housed pigs on a real farm as the research object and video frames as the data source, this paper proposes DAT-YOLO, a detection model that combines an attention mechanism with Tiny-YOLO. The model introduces channel attention and spatial attention information into the feature extraction process: high-level features guide low-level features in acquiring channel attention information, while low-level features in turn guide high-level features in spatial attention screening, which improves the feature extraction ability and detection accuracy of the model without a significant increase in the number of parameters. From videos of 35 group-housed pigs aged 20 to 105 d in 5 pens, 504 images containing 3 712 face boxes were extracted and annotated with 6 posture classes (horizontal face, horizontal side-face, bow face, bow side-face, rise face and rise side-face) to build the training set, and another 420 images with 2 106 face boxes were taken as the test set. Experiments show that on the test set the AP values of DAT-YOLO for the 6 posture classes reach 85.54%, 79.30%, 89.61%, 76.12%, 79.37% and 84.35%, respectively, and its overall mAP over the 6 classes is 8.39%, 4.66% and 2.95% higher than that of the Tiny-YOLO model, the CAT-YOLO model with channel attention only, and the SAT-YOLO model with spatial attention only. To further verify how well the attention modules transfer to other models, the two types of attention information were introduced into YOLOV3 under the same experimental conditions to build the corresponding attention sub-models; compared with the YOLOV3 sub-models containing the same modules, the Tiny-YOLO-based sub-models are 0.46% to 1.92% higher in overall mAP. Both the Tiny-YOLO and YOLOV3 series models show performance gains of different magnitudes after the attention information is added, indicating that the attention mechanism helps detect the different facial posture classes of group-housed pigs accurately and effectively, and can provide a reference for subsequent individual identification and behavior analysis of pigs.
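    For reference, the overall mAP quoted above is, by the standard definition, the arithmetic mean of the per-class AP values; the short Python check below (not code from the paper) reproduces the figure implied by the six reported APs.

        # mAP as the arithmetic mean of the per-class APs reported above for DAT-YOLO (values in %).
        ap = {
            "horizontal face":      85.54,
            "horizontal side-face": 79.30,
            "bow face":             89.61,
            "bow side-face":        76.12,
            "rise face":            79.37,
            "rise side-face":       84.35,
        }
        map_percent = sum(ap.values()) / len(ap)
        print(f"mAP over the 6 posture classes: {map_percent:.2f}%")  # ~82.38%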

    Abstract:

    The face of the pig contains rich biometric information, and the detection of facial postures can provide a basis for individual identification and behavior analysis of pigs. However, in group-housing scenes, factors such as pig-house lighting and adhesion between pigs bring great challenges to pig face detection. In this paper, we took group-housed pigs in a real breeding scene as the research object and video frame data as the data source, and proposed a detection model named DAT-YOLO that combines an attention mechanism with the Tiny-YOLO model, in which channel attention and spatial attention information are introduced into the feature extraction process. High-level features guide low-level features in acquiring channel attention information, and low-level features in turn guide high-level features in spatial attention screening, so that the feature extraction ability and detection accuracy of the model are improved without a significant increase in the number of parameters. From videos of 35 group-housed pigs aged 20 to 105 d in 5 pens, 504 images containing a total of 3 712 face regions were extracted for training, and another 420 images with 2 106 face regions were used as the test set. To obtain the model input data set, a two-step pre-processing operation of pixel-value padding and scaling was applied to the captured video frames. The model outputs are divided into six classes: horizontal face, horizontal side-face, bow face, bow side-face, rise face and rise side-face. The results show that on the test set the detection precision (AP) reaches 85.54%, 79.30%, 89.61%, 76.12%, 79.37% and 84.35% for the horizontal face, horizontal side-face, bow face, bow side-face, rise face and rise side-face, respectively, and the mean detection precision (mAP) is 8.39%, 4.66% and 2.95% higher than that of the original Tiny-YOLO model, the CAT-YOLO model with channel attention only, and the SAT-YOLO model with spatial attention only. To further verify the transfer performance of the attention modules on other models, the two types of attention information were introduced into YOLOV3 under the same experimental conditions to construct the corresponding attention sub-models. The experiments show that, compared with the YOLOV3 sub-models containing the same modules, the sub-models based on Tiny-YOLO are 0.46% to 1.92% higher in mAP. Both the Tiny-YOLO and YOLOV3 series models show performance improvements of different magnitudes after the attention information is added, indicating that the attention mechanism is beneficial to accurate and effective detection of the different facial postures of group-housed pigs. In addition, the data are pseudo-balanced from the perspective of the loss function to alleviate the imbalance caused by the different numbers of samples in the facial posture classes, and the reasons for the differences in detection accuracy among the facial postures are explored. The study can provide a reference for the subsequent individual identification and behavior analysis of pigs.
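    To make the cross-guided attention described above more concrete, the sketch below shows one possible Keras realization, assuming a TensorFlow 2 environment; the feature-map shapes, the reduction ratio and the 7x7 convolution are illustrative assumptions, not the authors' published DAT-YOLO configuration.

        # A minimal sketch (assumptions, not the authors' DAT-YOLO code): high-level
        # features produce channel weights for low-level features, and low-level
        # features produce a spatial mask for high-level features.
        import tensorflow as tf
        from tensorflow.keras import layers

        def channel_attention_from_high(high, low):
            # Squeeze the high-level map and predict one weight per low-level channel.
            c = low.shape[-1]
            w = layers.GlobalAveragePooling2D()(high)
            w = layers.Dense(c // 4, activation="relu")(w)      # reduction ratio 4 is an assumption
            w = layers.Dense(c, activation="sigmoid")(w)
            w = layers.Reshape((1, 1, c))(w)
            return layers.Multiply()([low, w])                   # channel-reweighted low-level features

        def spatial_attention_from_low(low, high):
            # Predict a single-channel spatial mask from the low-level map, then
            # downsample it to the high-level resolution before rescaling.
            mask = layers.Conv2D(1, 7, padding="same", activation="sigmoid")(low)  # 7x7 is an assumption
            stride = low.shape[1] // high.shape[1]
            mask = layers.AveragePooling2D(pool_size=stride)(mask)
            return layers.Multiply()([high, mask])               # spatially reweighted high-level features

        # Usage with illustrative Tiny-YOLO-like feature-map shapes (assumed):
        low_in  = layers.Input(shape=(52, 52, 128))   # earlier, higher-resolution features
        high_in = layers.Input(shape=(13, 13, 512))   # later, lower-resolution features
        low_out  = channel_attention_from_high(high_in, low_in)
        high_out = spatial_attention_from_low(low_in, high_in)
        model = tf.keras.Model([low_in, high_in], [low_out, high_out])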

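    The two-step pre-processing ("filling pixel values and scaling") mentioned in the abstract can be sketched as a standard letterbox operation; the 416x416 target size and the grey fill value below are assumptions for illustration, since the abstract does not state the exact settings.

        # A minimal pre-processing sketch (assumed parameters, not the paper's exact ones):
        # step 1 pads the frame to a square with a constant pixel value, step 2 scales it
        # to the network input size. Ground-truth boxes must be shifted and scaled the same way.
        import cv2
        import numpy as np

        def pad_and_scale(frame: np.ndarray, target: int = 416, fill: int = 128) -> np.ndarray:
            h, w = frame.shape[:2]
            side = max(h, w)
            canvas = np.full((side, side, 3), fill, dtype=np.uint8)   # step 1: constant-value padding
            top, left = (side - h) // 2, (side - w) // 2
            canvas[top:top + h, left:left + w] = frame
            return cv2.resize(canvas, (target, target))              # step 2: scale to model input

        # Usage (hypothetical file name):
        # frame = cv2.imread("pig_frame_0001.jpg")
        # net_input = pad_and_scale(frame).astype(np.float32) / 255.0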
Cite this article

燕红文,刘振宇,崔清亮,胡志伟,李艳文. 基于改进Tiny-YOLO模型的群养生猪脸部姿态检测[J]. 农业工程学报,2019,35(18):169-179. DOI:10.11975/j.issn.1002-6819.2019.18.021

Yan Hongwen, Liu Zhenyu, Cui Qingliang, Hu Zhiwei, Li Yanwen. Detection of facial gestures of group pigs based on improved Tiny-YOLO[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2019, 35(18): 169-179. DOI:10.11975/j.issn.1002-6819.2019.18.021

History
  • Received: 2019-08-16
  • Revised: 2019-08-28
  • Published online: 2019-10-12