Abstract: Rumination enables cows to chew grass more thoroughly for better digestion, and is therefore closely related to the health, production, reproduction, and welfare of cows. Perceiving rumination behavior has become one of the most important steps in modern dairy farm management. However, traditional monitoring of rumination behavior depends mainly on human observation, which is time-consuming and laborious. In this study, an intelligent monitoring method was proposed for automatic multi-target tracking of cow mouths and monitoring of rumination behavior in the complex environment of dairy farms, using the Kalman filter and the Hungarian algorithm. The upper and lower jaw regions of cow mouths were first detected by the YOLOv4 model. Subsequently, the upper jaw region was tracked by the Kalman filter and the Hungarian algorithm. The chewing curve of the mouth region was then obtained by matching the upper and lower jaw regions of the same cow. Finally, the related rumination information was extracted to realize mouth tracking and rumination behavior monitoring of multiple cows. In addition, unmatched tracking boxes were retained and expanded to handle identity switches caused by rapid head swinging or occlusion by shed railings. A total of 66 videos of ruminating cows were collected in an actual farm environment, of which 58 videos were split into frames to build the dataset for the YOLOv4 model, and the remaining 8 videos were used to verify tracking and rumination behavior monitoring. The video data covered sunny, cloudy, and rainy days, with 2 to 3 cows per video. The ruminating cows were in either lying or standing postures. Additionally, there were interference factors such as rapid head swinging of ruminating cows, occlusion by shed railings, and the movement of other cows. Two indexes, average precision (AP) and mean average precision (mAP), were selected to evaluate the detection performance of the YOLOv4 model. 6400 images of the dataset were used for training, and 800 images for testing. The results showed that the average precisions of the YOLOv4 model were 93.92% and 92.46% for detection of the upper and lower jaw regions, respectively. The mean average precision of YOLOv4 reached 93.19%, which was 1.04, 4.25, and 1.74 percentage points higher than that of the YOLOv5, SSD, and Faster R-CNN models, respectively. Four indexes were selected to verify the performance of tracking and rumination behavior monitoring under different environments: the identity switch rate, the identity match rate, the detection rate of chewing times, and the tracking speed. It was found that the proposed method achieved stable multi-target tracking of cow mouth regions in complex environments, while effectively alleviating identity switches caused by rapid head swinging and shed railing occlusion. The average identity match rate was 99.89% for the upper and lower jaws, and the average tracking speed was 31.85 frames/s. In the evaluation of rumination behavior, the average detection rate of chewing times was 96.93%, and the average error of rumination time was 1.48 s. These findings can provide a strong reference for the intelligent monitoring of rumination behavior in multiple cows (or of moving body parts of other animals) in actual breeding environments.
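
The abstract describes associating Kalman-predicted track boxes with new YOLOv4 detections via the Hungarian algorithm, and retaining and expanding unmatched tracking boxes to survive occlusion. The following is a minimal, hypothetical sketch of that association step only, not the authors' implementation: it assumes boxes are (x1, y1, x2, y2) lists, uses IoU as the matching cost, and relies on scipy.optimize.linear_sum_assignment as the Hungarian solver; all function names and thresholds here are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(predicted_tracks, detections, iou_threshold=0.3):
    """Match Kalman-predicted track boxes to detected boxes (Hungarian algorithm).

    Returns matched (track_idx, det_idx) pairs plus unmatched track and
    detection indices. Unmatched tracks could then be retained and their
    boxes expanded, in the spirit of the abstract, to bridge brief
    occlusions or rapid head swings (illustrative assumption).
    """
    if not predicted_tracks or not detections:
        return [], list(range(len(predicted_tracks))), list(range(len(detections)))

    # Cost matrix: low cost means high overlap between prediction and detection.
    cost = np.zeros((len(predicted_tracks), len(detections)))
    for t, trk in enumerate(predicted_tracks):
        for d, det in enumerate(detections):
            cost[t, d] = 1.0 - iou(trk, det)

    rows, cols = linear_sum_assignment(cost)

    matches, unmatched_tracks, unmatched_dets = [], [], []
    for t in range(len(predicted_tracks)):
        if t not in rows:
            unmatched_tracks.append(t)
    for d in range(len(detections)):
        if d not in cols:
            unmatched_dets.append(d)
    for t, d in zip(rows, cols):
        # Reject assignments whose overlap is below the threshold.
        if 1.0 - cost[t, d] < iou_threshold:
            unmatched_tracks.append(t)
            unmatched_dets.append(d)
        else:
            matches.append((t, d))
    return matches, unmatched_tracks, unmatched_dets
```

In a tracking loop under these assumptions, matched detections would update the corresponding Kalman filters, unmatched detections would start new tracks, and unmatched tracks would be kept alive with expanded boxes for a few frames before deletion.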