Abstract: Robot technology has gradually been applied in modern agriculture in recent years. Duck eggs are an important agricultural product for food processing, yet collecting them still requires a large amount of manual labor, which is time-consuming and labor-intensive. Smart robots have been developed to collect duck eggs automatically, in order to improve production efficiency and reduce harvesting costs. A key technical challenge for such harvesting robots is the rapid and accurate identification and positioning of duck eggs, especially in complex environments involving occlusion, crowding, and darkness. In this study, a duck egg detection method for complex environments was proposed using an improved YOLOv7 model. A convolutional block attention module (CBAM) was added to the backbone network to enhance information transmission within the network and its sensitivity to specific features, thereby reducing the interference of complex environments with duck egg recognition. At the same time, depth-wise separable convolution (DSC) was used to modify the spatial pyramid pooling (SPP) structure, in order to reduce the number of model parameters and the computational cost. In this way, duck eggs could be identified and located with high accuracy in complex environments, providing technical support for the development of harvesting robots. Materials such as feathers, straw, and sediment were used to simulate the complex environment of the duck house, and a duck egg image collection platform was constructed to evaluate the accuracy and environmental adaptability of the model. Images of duck eggs were captured with an Honor HLK-AL10 camera under both simulated conditions and actual conditions in the duck house of Wuhan Yujia Bay Duck Farm. The collected images covered multiple angles, positions, distances, and forms of occlusion.
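The parameter saving attributed to depth-wise separable convolution can be illustrated with a short calculation. The sketch below is not the authors' network: the channel counts and kernel size are hypothetical values chosen for illustration, and bias terms are ignored. It only compares the weight count of a standard convolution with that of a depth-wise convolution followed by a 1x1 point-wise convolution.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depth-wise k x k convolution plus a 1x1 point-wise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer size, not taken from the paper.
c_in, c_out, k = 512, 512, 3

std = standard_conv_params(c_in, c_out, k)
dsc = depthwise_separable_params(c_in, c_out, k)
ratio = dsc / std  # equals 1/c_out + 1/k**2

print(f"standard: {std}, depth-wise separable: {dsc}, ratio: {ratio:.3f}")
```

For a 3x3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why replacing standard convolutions in a pooling block can noticeably reduce model size.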
A total of 2 600 images in JPG format were divided into a training set (1 560 images), a validation set (520 images), and a test set (520 images) at a ratio of 6:2:2. Data augmentation was then performed on the training set to improve the robustness and generalization ability of the model, using the following procedures: 1) adding 12% Gaussian noise; 2) adding 2.5% salt-and-pepper noise; 3) setting the image gain to a=0.3 and a=0.5 to change the image brightness. After that, the improved YOLOv7 model was trained for 150 iterations. Test results show that the F1 score of the improved YOLOv7 model reached 95.5%, which was 8.3, 10.1, 8.7, and 7.6 percentage points higher than those of the common detection models SSD, YOLOv4, YOLOv5_M, and YOLOv7, respectively. The model occupied only 68.7 M of memory, the average detection time for a single image was 0.022 s, and the mean average precision (mAP) was 85.2%. There were no missed or false detections for duck eggs occluded by feathers or clustered together, with average confidence levels of 93.6% and 85.7%, respectively. The improved YOLOv7 model therefore detected duck eggs more accurately in complex environments, indicating superior performance.
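The 6:2:2 split and the three augmentation steps described above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not define how the percentages or the gain are applied, so here "12%" is read as a noise standard deviation of 12% of the 8-bit range, "2.5%" as the fraction of pixels corrupted, and the gain a as a multiplicative brightness factor. All function names are hypothetical.

```python
import numpy as np


def split_indices(n_images, ratios=(6, 2, 2), seed=0):
    """Shuffle image indices and split them into train/val/test by ratio."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_images)
    total = sum(ratios)
    n_train = n_images * ratios[0] // total
    n_val = n_images * ratios[1] // total
    return (order[:n_train],
            order[n_train:n_train + n_val],
            order[n_train + n_val:])


def add_gaussian_noise(img, level=0.12, rng=None):
    """Add zero-mean Gaussian noise; std = level * 255 is an assumption."""
    if rng is None:
        rng = np.random.default_rng()
    noisy = img.astype(np.float64) + rng.normal(0.0, level * 255.0, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def add_salt_pepper(img, fraction=0.025, rng=None):
    """Set roughly `fraction` of the pixels to black or white (assumed meaning)."""
    if rng is None:
        rng = np.random.default_rng()
    out = img.copy()
    u = rng.random(img.shape[:2])      # one random draw per pixel
    out[u < fraction / 2] = 0          # pepper
    out[u > 1 - fraction / 2] = 255    # salt
    return out


def apply_gain(img, a):
    """Scale brightness by a multiplicative gain a (assumed interpretation)."""
    scaled = img.astype(np.float64) * a
    return np.clip(scaled, 0, 255).astype(np.uint8)


train, val, test = split_indices(2600)
print(len(train), len(val), len(test))  # 1560 520 520

# Stand-in for a real photograph: a uniform grey 64 x 64 RGB image.
image = np.full((64, 64, 3), 128, dtype=np.uint8)
rng = np.random.default_rng(0)
augmented = [
    add_gaussian_noise(image, 0.12, rng),
    add_salt_pepper(image, 0.025, rng),
    apply_gain(image, 0.3),
    apply_gain(image, 0.5),
]
```

Augmentation is applied only to the training indices, so the validation and test sets still reflect unmodified images.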