Abstract: Robotic strawberry harvesting has been constrained by the low positioning accuracy of stem picking points and by the difficulty of identifying occluded strawberries. In this study, we proposed an improved YOLO v8 model combined with Pose key-point detection to enhance strawberry recognition and picking-point localization, especially for occluded strawberries in complex environments. To optimize the YOLO v8 model, we introduced the Bidirectional Feature Pyramid Network (BiFPN) and the Generalized Attention Module (GAM), which enhanced bidirectional information flow, dynamically allocated feature weights, and strengthened the features of small targets and occluded regions. As a result, the model's ability to detect and localize strawberries in complex environments was significantly improved. Experimental results showed that the improved YOLO v8-pose model outperformed the original model on several metrics: Precision (P) increased by 6.01 percentage points, Recall (R) by 1.98 percentage points, mean Average Precision (mAP) by 6.67 percentage points, and mean Average Precision for key points (mAPkp) by 7.85 percentage points. The positioning of strawberry stem picking points, based on key-point detection, achieved errors of 1.4 mm in both the X and Y directions and 2.2 mm in the Z direction. Additionally, occlusion levels were classified according to the overlap area of occluded strawberries, and the model's performance was assessed at each level. Under occlusion, the mAPkp of the improved YOLO v8-pose model increased by 9.78 percentage points compared to the original model. Field trials further validated the model's effectiveness, with the strawberry-picking robot achieving a 95% success rate and picking each strawberry within 10 seconds.
The high success rate and short picking time demonstrated the model's practicality, efficiency, and accuracy in real-world agricultural settings. The improved YOLO v8 model with key-point detection recognized strawberries accurately and robustly, leveraging multi-scale features through the BiFPN architecture and focusing attention on relevant regions through the GAM, especially for occluded strawberries. These advancements significantly improved precision, recall, and average precision, particularly under occlusion. In conclusion, these techniques were integrated into a more capable strawberry-picking robot system. The accuracy and efficiency achieved in recognizing and localizing strawberries, even under challenging occlusion, highlighted the system's potential for practical agricultural applications. The findings contribute to automated strawberry harvesting in agricultural robotics and pave the way for more efficient and cost-effective farming in sustainable production.
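
To illustrate what "dynamically allocated feature weights" can mean in a BiFPN, the following is a minimal sketch of the fast normalized fusion rule from the original BiFPN formulation. The abstract does not state which fusion variant was used, so this is an assumption; the function name `fast_normalized_fusion`, the `eps` value, and the use of flat Python lists in place of feature maps are hypothetical choices made only for illustration.

```python
def fast_normalized_fusion(features, raw_weights, eps=1e-4):
    """Fuse same-length feature vectors with learnable, non-negative weights.

    Illustrative only: real BiFPN layers fuse multi-channel feature maps and
    learn raw_weights by backpropagation; flat lists stand in for tensors here.
    """
    if len(features) != len(raw_weights):
        raise ValueError("one weight is required per input feature")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all input features must have the same length")

    # ReLU clamps each weight to be non-negative before normalization.
    weights = [max(0.0, w) for w in raw_weights]
    # eps avoids division by zero when every weight is clamped to 0.
    total = sum(weights) + eps

    # Each output element is the weighted sum of the inputs, divided by the
    # sum of the weights.
    return [
        sum(w * f[i] for w, f in zip(weights, features)) / total
        for i in range(length)
    ]


# Two toy "feature maps" from different pyramid levels (hypothetical values).
shallow = [1.0, 2.0]
deep = [3.0, 4.0]
fused = fast_normalized_fusion([shallow, deep], raw_weights=[1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the inputs. A negative raw weight is clamped to zero, so that input is effectively ignored in the fused output.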