Abstract: Anoectochilus roxburghii (A. roxburghii) is a rare medicinal herb that is mainly distributed in China. Because different strains differ markedly in medicinal value, identifying the strain of A. roxburghii is necessary to guide clinical medication. However, the similar leaf morphology of the strains makes them difficult to distinguish with the naked eye. In this study, a sub-interval segmentation method based on leaf identification was proposed to identify different strains of A. roxburghii. Firstly, 6 strains of A. roxburghii were selected: Taiwan, Hongxia, Xiaoyuanye, Jianye, Yizhu, and Dayuanye. A total of 317 images with a resolution of 800×800 pixels were taken, and two filtering methods were used to remove noise. The maximum inter-class variance algorithm was used for automatic threshold segmentation to obtain a binary image. In the binary image, the leaf contour was drawn and the mass center of the leaf was calculated. A 150-pixel square area centered on the mass center was selected as the sub-interval of the leaf, so that target images with the same position and size were obtained. Secondly, a combination of texture and color features was extracted from the target image: the texture features were derived from local binary patterns (LBP), the gray level co-occurrence matrix (GLCM), and Gabor filters, whereas the color features consisted of the first, second, and third color moments. In total, 114 merged features were obtained. Thirdly, stacking ensemble learning was proposed to improve on the accuracy of a traditional single classifier. The stacking framework consisted of base classifiers and a meta-classifier. Logistic regression (LR), K nearest neighbor (KNN), random forest (RF), and gradient boosting decision tree (GBDT) were used as the base classifiers, whereas GBDT was used as the meta-classifier.
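The sub-interval segmentation step described above can be illustrated with a minimal sketch. It assumes that the variance-based automatic threshold is Otsu's method, that the leaf is brighter than the background, that the mass center can be approximated by the centroid of the foreground pixels (rather than derived from a drawn contour), and that "150 pixels" is the side length of the square. The function names are hypothetical and are not taken from the study.

```python
def otsu_threshold(gray):
    """Return the gray level t (0..255) that maximizes between-class variance.

    `gray` is a list of rows of integer gray levels in 0..255.
    Class 0 holds pixels <= t, class 1 holds pixels > t.
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in class 0
    sum0 = 0    # sum of gray levels in class 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def mass_center(mask):
    """Centroid (cx, cy) of the True pixels in a binary mask."""
    xs, ys, n = 0, 0, 0
    for y, row in enumerate(mask):
        for x, on in enumerate(row):
            if on:
                xs, ys, n = xs + x, ys + y, n + 1
    if n == 0:
        raise ValueError("mask contains no foreground pixels")
    return xs / n, ys / n


def extract_subinterval(gray, side=150):
    """Crop a side x side square centered on the leaf's mass center.

    Assumption: the leaf is the brighter class after thresholding.
    The square is shifted inward if it would cross the image border.
    """
    t = otsu_threshold(gray)
    mask = [[v > t for v in row] for row in gray]
    cx, cy = mass_center(mask)
    height, width = len(gray), len(gray[0])
    x0 = max(0, min(int(cx) - side // 2, width - side))
    y0 = max(0, min(int(cy) - side // 2, height - side))
    return [row[x0:x0 + side] for row in gray[y0:y0 + side]]
```

Calling `extract_subinterval(gray)` on an 800×800 grayscale image returns a 150×150 patch, so every leaf contributes a target image of the same size taken from the same relative position.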
Finally, a cross-validation scheme different from that of conventional models was used to partition the data set. The original data were normalized and randomly split, with 60% used for training and 40% for testing. The training set was then randomly divided into 5 subsets, which served as the training and testing subsets for each base classifier. The predictions of the base classifiers were used as the input vectors of the GBDT meta-classifier, which output the final prediction. The experimental results showed that the average recognition accuracy of the stacking model reached 94.49%, whereas that of LR, KNN, RF, and GBDT was 89.13%, 83.15%, 87.56%, and 82.36%, respectively. Moreover, the Precision, Recall, and F1-Score of the stacking model for the identification of Taiwan, Hongxia, and Dayuanye were all 100%. The Recall of the stacking model was generally better than that of the single classifiers; for Xiaoyuanye, it was only slightly worse than that of the LR and KNN models. The F1-Score of the stacking model was the highest for every strain, indicating excellent overall performance. Therefore, the proposed method can significantly improve classification performance for different strains of A. roxburghii. The findings provide a promising approach for recognizing the leaves of different plants using shape features. Further research is still necessary to select a proper configuration in order to improve the learning efficiency of the stacking model.
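The stacking procedure summarized above, in which base classifiers are trained on folds of the training set and their held-out predictions feed the meta-classifier, can be sketched as follows. This is only an illustration under stated assumptions: the toy learners `NearestCentroid`, `OneNN`, and `TableMeta` are hypothetical stand-ins for LR, KNN, RF, and GBDT, and the helper names `fit_stacking` and `predict_stacking` are not from the study.

```python
import random
from collections import Counter, defaultdict


def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class NearestCentroid:
    """Toy base learner (stand-in): predict the class with the closest mean."""
    def fit(self, X, y):
        groups = defaultdict(list)
        for xi, yi in zip(X, y):
            groups[yi].append(xi)
        self.centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                          for c, rows in groups.items()}
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: dist2(x, self.centroids[c]))
                for x in X]


class OneNN:
    """Toy base learner (stand-in): 1-nearest-neighbor."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        return [self.y[min(range(len(self.X)), key=lambda i: dist2(x, self.X[i]))]
                for x in X]


class TableMeta:
    """Toy meta-learner (stand-in for GBDT): maps each tuple of base
    predictions to the true label seen most often with it."""
    def fit(self, Z, y):
        votes = defaultdict(Counter)
        for z, yi in zip(Z, y):
            votes[tuple(z)][yi] += 1
        self.table = {z: c.most_common(1)[0][0] for z, c in votes.items()}
        return self

    def predict(self, Z):
        # Unseen combinations fall back to the first base prediction.
        return [self.table.get(tuple(z), z[0]) for z in Z]


def fit_stacking(X, y, base_factories, meta, k=5, seed=0):
    """Build out-of-fold base predictions, then train the meta-learner on them."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]           # k disjoint subsets
    Z = [[None] * len(base_factories) for _ in X]   # meta-features per sample
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        for j, make in enumerate(base_factories):
            model = make().fit([X[i] for i in train], [y[i] for i in train])
            preds = model.predict([X[i] for i in held_out])
            for i, p in zip(held_out, preds):
                Z[i][j] = p
    meta.fit(Z, y)
    # Refit every base learner on the full training set for later prediction.
    bases = [make().fit(X, y) for make in base_factories]
    return bases, meta


def predict_stacking(bases, meta, X):
    """Feed the base predictions for X to the trained meta-learner."""
    Z = list(zip(*[b.predict(X) for b in bases]))
    return meta.predict(Z)
```

Because each sample's meta-features come from models that never saw that sample, the meta-classifier is trained on predictions that resemble what it will receive at test time. In practice, the toy learners would be replaced by real implementations of the four base classifiers and of GBDT.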