Abstract: Unmanned aerial vehicle (UAV) remote sensing has promising potential for the precise and efficient classification and mapping of forest tree species. Deep learning, however, requires large training datasets that typically depend on manual annotation. In this study, a framework for forest tree species classification was proposed that uses semi-supervised learning to exploit a large amount of unlabeled data together with a small amount of annotated data. The aim was to map the distribution of dominant tree species rapidly and accurately in the complex mountainous forest environment of Fujian Province, and thus to obtain the tree species composition in a rapid, effective, and cost-saving manner. Taking four experimental areas in Fuzhou, Longyan, and Sanming in Fujian Province as examples, a simplified tree species classification model, UNet (ResNet14*), was constructed with ResNet18 as the backbone. ResNet14* differs from ResNet18 in two respects: it removes the layer4 part of ResNet18, i.e., the last downsampling cascaded block, which retains slightly more spatial information; and it adds a max pooling layer at the end of the layer2 and layer3 sections to reduce the training parameters of the neural network while retaining the original features. A joint loss function of cross entropy and Dice coefficient was used to optimize the model parameters. The generalization of the Self-training and Mean teacher semi-supervised learning methods was evaluated on the classification models using UAV images. The results show that the overall accuracy of the ResNet14* network reached 91.15%, with a Kappa coefficient of 0.827, which was within 1% of the accuracy of the other ResNet models. At the same time, the network had a smaller number of parameters and the shortest prediction time, balancing the accuracy and efficiency of tree species classification.
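The joint loss of cross entropy and Dice coefficient can be illustrated with a minimal sketch in plain Python. The abstract does not give the exact form of the combination, so the weighted sum used here, `weight * CE + (1 - weight) * Dice loss`, is an assumption, as are the function names, the smoothing constant `smooth`, and the per-pixel probability-list representation; this is not the authors' implementation.

```python
import math


def cross_entropy(probs, labels, eps=1e-7):
    """Mean negative log-likelihood over pixels.

    probs: list of per-pixel class-probability lists.
    labels: list of integer class indices, one per pixel.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        # Clamp to eps so that a zero probability does not give log(0).
        total += -math.log(max(p[y], eps))
    return total / len(labels)


def dice_loss(probs, labels, num_classes, smooth=1.0):
    """One minus the soft Dice coefficient averaged over classes."""
    dice_sum = 0.0
    for c in range(num_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_sum = sum(p[c] for p in probs)
        true_sum = sum(1 for y in labels if y == c)
        dice_sum += (2.0 * intersection + smooth) / (pred_sum + true_sum + smooth)
    return 1.0 - dice_sum / num_classes


def joint_loss(probs, labels, num_classes, weight=0.5):
    """Assumed combination: weight * CE + (1 - weight) * Dice loss."""
    ce = cross_entropy(probs, labels)
    dl = dice_loss(probs, labels, num_classes)
    return weight * ce + (1.0 - weight) * dl
```

Under this assumed form, a weight of 0.5 gives both terms equal influence: the cross-entropy term scores each pixel independently, while the Dice term scores the overlap per class and is therefore less dominated by classes that cover a large area.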
The best prediction performance of ResNet14* was achieved with a joint loss function weight of 0.5, giving an overall accuracy of 91.15%. Therefore, a joint loss function weight of 0.5 was taken as the optimal value for semi-supervised learning in this case. Self-training and Mean teacher semi-supervised learning were implemented with UNet (ResNet14*) as the main network. The experiments showed that the overall accuracy of Self-training on the test set reached 91.08%, slightly lower than that of the original model. Higher category accuracy was achieved for Schima superba, Pinus massoniana, and Chinese fir, the categories with sufficient samples. Furthermore, of the two semi-supervised models, Self-training with pseudo labels improved the overall accuracy in experimental area D compared with the original model, reaching 88.50%, whereas the overall accuracy of the Mean teacher model with consistency loss decreased significantly. The total accuracy of the Mean teacher model was 88.56%, with an accuracy of 73.56% in the independent validation area. Accuracy evaluation was also conducted on an independent validation area, where classification accuracy above 80% was achieved for three tree species, namely Schima superba, Pinus massoniana, and Chinese fir. These species account for a relatively large area, so the results meet the accuracy requirements of tree species mapping in the experimental area. Therefore, semi-supervised learning with the Self-training model can be expected to rapidly obtain the composition of tree species in the experimental area.
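The two semi-supervised strategies compared above can be sketched in plain Python under stated assumptions. The abstract gives no implementation details, so the confidence threshold of 0.9, the moving-average coefficient `alpha`, the mean-squared-error form of the consistency loss, and all function names are hypothetical choices made for illustration only.

```python
def select_pseudo_labels(probs, threshold=0.9):
    """Self-training step: keep the most likely class where the model is confident.

    probs: list of per-pixel class-probability lists predicted on unlabeled data.
    Returns one class index per pixel, or None where the highest probability
    is below the threshold (such pixels would be left out of the loss).
    """
    labels = []
    for p in probs:
        best = max(range(len(p)), key=lambda c: p[c])
        labels.append(best if p[best] >= threshold else None)
    return labels


def ema_update(teacher, student, alpha=0.99):
    """Mean teacher step: teacher weights follow an exponential moving
    average of the student weights (both given as flat lists of floats)."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions."""
    total = 0.0
    count = 0
    for ps, pt in zip(student_probs, teacher_probs):
        for a, b in zip(ps, pt):
            total += (a - b) ** 2
            count += 1
    return total / count
```

In this sketch, Self-training turns confident predictions into hard labels that are then treated like annotated data, whereas Mean teacher penalizes disagreement between the student and a slowly updated teacher on the same unlabeled pixels.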