Abstract: Demand for high-quality apples has grown steadily with rising living standards in recent years. The core ratio is one of the most significant factors determining apple quality. However, manual measurement of the fruit core cannot meet current detection requirements in terms of cost and accuracy. In this study, an automatic fruit core segmentation method was proposed using a TMU-Net network model. Firstly, three common apple varieties from Xinjiang, China, were selected, and an acquisition device was used to capture 311 cross-sectional images of the fruit core. Secondly, data augmentation was applied to expand the original image set, including translation, vertical mirroring, horizontal mirroring, and the addition of Gaussian noise. Training on the augmented dataset outperformed training on the original one. Specifically, the Intersection over Union (IOU), Precision, Recall, and F1-score of the TMU-Net network increased by 27.28, 36.62, 29.81, and 32.06 percentage points, respectively. This indicates that data augmentation improved the robustness and generalization of the trained model. A Multiple Residual Dilated Convolution (MRDC) module was also constructed from dilated (atrous) convolutions with different dilation rates and shortcut connections. A shortcut connection skips one or more layers and simply performs identity mapping. In this way, information loss in the skip connections of the model was reduced, and the semantic gap between the encoder and the decoder was narrowed. The MRDC module was then evaluated in the skip connections of TMU-Net. The results showed that: 1) Introducing the MRDC module effectively improved the segmentation performance of the model: the IOU, Precision, and F1-score improved by 1.59, 6.49, and 4.65 percentage points, respectively.
2) The first 13 layers of the VGG-16 network were used as the backbone to capture low-level features. A Transformer encoder was integrated into the network to strengthen global feature extraction and offset the locality of convolution operations. The segmentation results showed that the TMU-Net network handled the sharp corners and edge details of the fruit core more precisely, indicating the feasibility of the model for fruit core segmentation. 3) The TMU-Net model was trained under several transfer learning strategies. Freezing specific network layers during training was found to effectively improve the model's metrics, and the training curves showed that transfer learning accelerated convergence. Subsequently, the TMU-Net, DeeplabV3+, U-Net, and PSPNet models were trained with the same experimental parameters and evaluated on the test set. The IOU, Precision, Recall, and F1-score of the TMU-Net model were 3.96, 7.15, 9.49, and 6.30 percentage points higher, respectively, than those of DeeplabV3+, the best-performing of the other models. Therefore, the TMU-Net model can accurately and effectively segment the fruit core. The findings can also provide a strong reference for the intelligent detection of apple quality.
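
The four evaluation metrics reported in this abstract (IOU, Precision, Recall, and F1-score) can be illustrated with a minimal sketch. It assumes the usual pixel-wise definitions for binary segmentation and masks stored as nested lists of 0/1 values (1 = fruit core, 0 = background); the abstract does not state the exact formulas or data layout used, and the function name `segmentation_metrics` is hypothetical.

```python
def segmentation_metrics(pred, target):
    """Compute IoU, precision, recall and F1 for two binary masks.

    Assumption: pred and target are equally sized 2-D lists of 0/1
    values, where 1 marks a fruit-core pixel and 0 marks background.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # core pixel correctly predicted
            elif p and not t:
                fp += 1  # background predicted as core
            elif t and not p:
                fn += 1  # core pixel that was missed

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on empty masks.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Toy masks for illustration only (not data from the study).
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
metrics = segmentation_metrics(pred, target)
```

The improvements in the abstract are given in percentage points, so under these definitions a hypothetical change in IOU from 0.80 to 0.84 would be reported as 4 percentage points.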