Abstract: Ginger is widely cultivated in temperate, tropical, and subtropical zones, and China is the largest ginger producer and exporter in the world. Sowing is the second step in ginger production, after soil preparation. During sowing, the ginger must be laid flat in the trench with the shoots pointing in the same direction, so that the shoots emerge in the same direction and the light-avoidance requirement of production is met. All shoots should face south in an east-west trench and west in a north-south trench. Shoot recognition is therefore a key technology for ensuring a consistent shoot direction and, in turn, for realizing automatic and accurate sowing. In this study, a deep learning method was proposed for the rapid recognition and accurate orientation determination of ginger shoots. Firstly, a dataset of ginger images was established through image acquisition, enhancement, and labeling. Secondly, because the dataset was small, online data augmentation was applied during training to increase the diversity of the images and improve the generalization capability of the model. The Mosaic method was used to enrich the backgrounds of the ginger shoot training images without introducing non-informative pixels. Thirdly, because the regressed bounding box directly determines the located position of a shoot, the DIoU (Distance Intersection over Union) bounding box regression loss function was introduced in place of the traditional IoU loss function to improve bounding box regression. Fourthly, to improve the convergence rate of the model, K-means clustering with an IoU-based distance measure was used to derive nine anchor boxes after linear scaling, which better matched the size of the shoots.
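To make the DIoU regression loss mentioned above concrete, the following minimal sketch evaluates it for a single pair of axis-aligned boxes. It assumes boxes given in (x1, y1, x2, y2) corner format and the commonly used definition, loss = 1 - IoU + d²/c², where d is the distance between the two box centers and c is the diagonal of the smallest box enclosing both. The function names `iou` and `diou_loss` are hypothetical, and this is an illustration only, not the training code used in this study.

```python
def iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    # Overlap rectangle (width/height clamp to zero when boxes are disjoint)
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou_loss(pred, target):
    """DIoU loss for one predicted box and one ground-truth box (assumed form)."""
    # Squared distance between the two box centers
    pcx, pcy = (pred[0] + pred[2]) / 2.0, (pred[1] + pred[3]) / 2.0
    tcx, tcy = (target[0] + target[2]) / 2.0, (target[1] + target[3]) / 2.0
    d2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2

    # Squared diagonal of the smallest box enclosing both boxes
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    penalty = d2 / c2 if c2 > 0 else 0.0
    return 1.0 - iou(pred, target) + penalty
```

Unlike a plain IoU loss, which stays at 1 for every pair of non-overlapping boxes, the center-distance penalty keeps the loss informative in that case: `diou_loss((0, 0, 1, 1), (2, 2, 3, 3))` is greater than 1, while identical boxes give 0.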
In addition, the Darknet-53 model pre-trained on the ImageNet dataset was used for transfer learning to reduce the training time of the model. Finally, after the shoots were identified by the YOLOv3 network, the area of the predicted bounding box was used as the criterion for selecting the strongest shoot, and only the predicted bounding box with the largest area was retained. A Cartesian coordinate system was established with the center of the image as the origin, and the orientation of the shoot was determined by calculating the azimuth of the center of the predicted bounding box. Average precision and the F1 measure were used to evaluate the performance of the ginger shoot recognition model. In testing, the IoU threshold and the confidence threshold were analyzed to obtain the best detection performance, and the improvement strategies were verified one by one. The best detection results were obtained when the IoU threshold was 0.6 and the confidence threshold was 0.001. The average precision and F1 measure of the shoot recognition model reached 98.2% and 94.9%, respectively, and the detection speed was 112 frames/s for a single 416×416-pixel image on the GPU. Compared with the original YOLOv3, the average precision and F1 measure increased by 1.5% and 4.4%, respectively. The model recognizes ginger shoots accurately and provides a sound basis for realizing automatic and precise ginger sowing.
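The post-processing step described above (retain the predicted box with the largest area, then compute the azimuth of its center about the image center) can be sketched as follows. This is a minimal illustration under stated assumptions: boxes are in (x1, y1, x2, y2) pixel coordinates, the y axis is flipped so that "up" in the image is positive, and the azimuth is measured counterclockwise from the positive x axis in degrees. The function names `largest_box` and `azimuth_deg` are hypothetical, and how an azimuth is mapped to a sowing direction is not specified here.

```python
import math


def largest_box(boxes):
    """Return the box with the largest area; boxes are (x1, y1, x2, y2)."""
    return max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))


def azimuth_deg(box, img_w, img_h):
    """Azimuth of the box center about the image center, in [0, 360) degrees.

    Assumed convention: origin at the image center, x to the right,
    y upward (image rows grow downward, so y is flipped), angle measured
    counterclockwise from the positive x axis.
    """
    cx = (box[0] + box[2]) / 2.0 - img_w / 2.0
    cy = img_h / 2.0 - (box[1] + box[3]) / 2.0
    return math.degrees(math.atan2(cy, cx)) % 360.0


# Hypothetical detections on a 416x416 image
detections = [(300, 100, 340, 140), (100, 250, 180, 330)]
strongest = largest_box(detections)          # the 80x80 box is retained
angle = azimuth_deg(strongest, 416, 416)     # center lies lower-left of the origin
```

In this example the retained box is centered to the lower left of the image center, so the computed azimuth falls between 180° and 270°.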