Abstract: Pigs fight with one another to establish a dominance hierarchy within a group, and aggressive behaviors, mostly fighting, are frequently observed in intensive pig-raising facilities. Severe aggression can deprive other pigs of food and water and lead to slow growth, wounds, sickness, and, in serious cases, even death. This considerably reduces the health and welfare of pigs and further decreases the economic benefits of the pig industry. Monitoring and recognizing aggressive behaviors among group-housed pigs is the first step toward managing aggression effectively. Traditional manual recording is time-consuming and labor-intensive and cannot be carried out 24 hours a day, 7 days a week. Machine vision offers an automatic monitoring method to solve this problem. In this paper, we introduce a new deep-learning-based method for monitoring aggressive behaviors. The experiments were conducted under controlled conditions in a previously designed environment-controlled chamber, the details of which were described in a published paper by our research group. Nursery pigs were raised under three NH3 concentration levels (<3.80 mg/m³, 15.18 mg/m³, and 37.95 mg/m³) at a suitable temperature of around 27 °C and a comfortable relative humidity of 50%-70%. Each nursery group consisted of six pigs weighing around 9.6 kg. During each 28-day experiment at the three NH3 concentration levels, videos were recorded from the top of the chamber. An end-to-end network, named 3D CONVNet, was proposed for aggressive behavior recognition of group-housed pigs; it is based on the C3D network and built with 3D convolution kernels. The structure of the 3D CONVNet was improved in both width and depth: the number of main convolutional layers was increased to 19, and extra batch normalization and dropout layers were added to deepen the network.
Furthermore, a multi-scale feature fusion method was introduced to widen the network, which considerably improved the algorithm's performance. To train the 3D CONVNet, 380 aggressive videos (14 074 frames) and 360 non-aggressive videos (13 040 frames) were selected from recordings of two of the concentration levels. These videos were randomly divided into training and validation sets at a ratio of 3:1. Another 556 aggressive and 510 non-aggressive videos from the three experimental batches were chosen to build the testing set, with no overlap among the training, validation, and testing sets. Results showed that 981 of the 1066 testing videos, covering both aggressive and non-aggressive behaviors, were correctly recognized, giving the 3D CONVNet an accuracy of 92.03% on the testing set. The precision, recall, and F1-score for aggressive behaviors were 94.86%, 89.57%, and 92.14%, respectively. The accuracies for the three NH3 concentration levels were 94.29%, 89.44%, and 85.91%, respectively, demonstrating the generalization performance of the 3D CONVNet. Under similar thermal environments, the 3D CONVNet also performed well under different illumination conditions. In comparison with the C3D, C3D_1 (19 layers), and C3D_2 (BN) networks, the 3D CONVNet achieved 95.7% on the validation set, 43.27 percentage points higher than the C3D network. Recognition of a single image with the 3D CONVNet took only 0.5 s, much faster than with the other three networks. Therefore, the 3D CONVNet is effective and robust for aggressive behavior recognition among group-housed pigs. The algorithm provides a new method and technique for automatic monitoring of aggressive behavior in group-housed pigs, and can help establish automatic monitoring systems on pig farms and improve the management level of the pig industry.
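The 3D convolution kernels mentioned above differ from ordinary 2D kernels in that they also slide along the time axis, so a single filter responds to motion patterns spanning consecutive frames. The following minimal Python sketch is illustrative only; the function name, clip shapes, and kernel size are our assumptions, not the authors' implementation:

```python
# Illustrative sketch (assumed names/shapes, not the authors' code): a "valid"
# 3D convolution over a video clip stored as a nested list clip[t][y][x],
# showing how a 3D kernel spans the temporal axis as well as the two spatial
# axes -- the core idea behind C3D-style networks such as the 3D CONVNet.

def conv3d_valid(clip, kernel):
    """Slide a 3D kernel over a (T, H, W) clip; no padding, stride 1."""
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for y in range(H - kh + 1):
            row = []
            for x in range(W - kw + 1):
                s = 0.0
                # The sum runs over time (dt) as well as space (dy, dx),
                # so motion across frames contributes to each response.
                for dt in range(kt):
                    for dy in range(kh):
                        for dx in range(kw):
                            s += clip[t + dt][y + dy][x + dx] * kernel[dt][dy][dx]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

# Example: a 4x4x4 clip of ones convolved with a 2x2x2 kernel of ones.
# Each output value sums 8 voxels, and the output shrinks to 3x3x3.
clip = [[[1.0] * 4 for _ in range(4)] for _ in range(4)]
kernel = [[[1.0] * 2 for _ in range(2)] for _ in range(2)]
result = conv3d_valid(clip, kernel)
```

In a trained network such as the one described above, many learned kernels of this kind are stacked with batch normalization and dropout layers; this sketch shows only the single-kernel operation.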