Abstract: Rapid and non-destructive detection of tomato root phenotypes was achieved using terahertz (THz) imaging and ensemble learning. First, three groups of tomato seedlings were grown in three sandy soil substrates watered with nitrogen nutrient solutions of different concentrations, and 12 groups of seedling roots were collected for THz imaging over 20 days of growth. Second, THz pseudo-color images of the root system were generated by optimal reconstruction from the time-domain peaks. The color value in a pseudo-color image is directly related to the intensity of THz absorption by the root system. Based on these color values, noisy data were removed from the overlapping-root and main-root regions of the THz images. Third, the Rosenfeld thinning algorithm was applied to obtain the skeleton map of the tomato root system, and the root length in pixels was calculated from the skeleton using a sliding-window traversal method. Finally, THz time-domain data and refractive index data were extracted from the effective feature region of the root system, and the tomato root length was predicted with a Stacking ensemble model. The first layer of the Stacking model integrated four base models: GBDT (gradient boosting decision tree), XGBoost (eXtreme gradient boosting), CatBoost (categorical boosting), and AdaBoost (adaptive boosting). The second layer used linear regression as the meta-learner in order to prevent overfitting. The base models were trained with 5-fold cross-validation. The root skeleton extraction showed that RGB three-channel separation effectively removed the overlapping roots and the noisy spectral data, so that the root framework was fully displayed. As a result, the error in the calculated root length was reduced significantly.
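The sliding-window length calculation on the thinned skeleton can be illustrated with a minimal sketch. It assumes a binary, one-pixel-wide skeleton stored as a nested list, and it assumes a particular weighting (1 for horizontal or vertical links, √2 for diagonal links); the function name `skeleton_length` and the `pixel_size` parameter are hypothetical and are not taken from the study.

```python
import math


def skeleton_length(skeleton, pixel_size=1.0):
    """Estimate the length of a one-pixel-wide binary skeleton.

    A 3x3 window is centred on every foreground pixel. Only the four
    "forward" neighbours are inspected so that each link between two
    skeleton pixels is counted exactly once.
    Assumed weights: 1 for orthogonal links, sqrt(2) for diagonal links.
    """
    rows, cols = len(skeleton), len(skeleton[0])
    # (row offset, column offset, link weight)
    forward = [
        (0, 1, 1.0),             # right
        (1, 0, 1.0),             # down
        (1, -1, math.sqrt(2)),   # down-left
        (1, 1, math.sqrt(2)),    # down-right
    ]
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if not skeleton[r][c]:
                continue
            for dr, dc, weight in forward:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols):
                    continue
                if not skeleton[rr][cc]:
                    continue
                # Skip a diagonal link when the two pixels are already
                # joined through an orthogonal neighbour (an L-shaped corner),
                # otherwise the corner would be counted twice.
                if dr != 0 and dc != 0 and (skeleton[r][cc] or skeleton[rr][c]):
                    continue
                total += weight
    # pixel_size converts pixel units to physical units (hypothetical scale).
    return total * pixel_size


# Illustrative skeleton: a horizontal run followed by a diagonal run.
demo = [
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
print(round(skeleton_length(demo), 3))
```

Multiplying by the scan step of the imaging system (passed here as `pixel_size`) would convert the pixel length into a physical root length.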
The average relative error between the tomato root lengths calculated from the THz false-color images and those measured with Image-J software was only 4.16%, and the coefficient of determination of the linear fit between the two was 0.967. Using the THz time-domain and refractive index data, the Stacking model predicted the root length effectively and performed better than its individual base models. The best prediction of tomato root length was obtained using the THz data after WD denoising. For the THz time-domain data, the best coefficient of determination was 0.999 and the minimum root-mean-square error was 0.743. For the THz refractive index data, the best coefficient of determination was 0.998 and the root-mean-square error was 0.976. THz spectral images therefore allowed the tomato root length to be calculated accurately and the root length phenotype to be predicted. The findings can provide a theoretical basis for the rapid and non-destructive detection of root system phenotypes. Further research is needed to detect the complete set of root phenotype parameters non-destructively, and then to detect internal root characteristics using THz spectral data.
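The three agreement measures quoted above (average relative error, coefficient of determination of a linear fit, and root-mean-square error) can be computed as in the following minimal sketch. The function names are hypothetical, the sample numbers are invented for illustration and are not data from the study, and the R² shown is the one for an ordinary least-squares line, which is an assumption about how the reported values were obtained.

```python
import math


def mean_relative_error(reference, estimate):
    """Average of |estimate - reference| / |reference| over all samples."""
    return sum(abs(e - r) / abs(r) for r, e in zip(reference, estimate)) / len(reference)


def rmse(reference, estimate):
    """Root-mean-square error between reference and estimated values."""
    return math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimate)) / len(reference))


def linear_fit_r2(x, y):
    """Coefficient of determination of the least-squares line y = a*x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Invented example values (NOT measurements from the study):
# reference lengths (e.g. from a manual tool) and image-based estimates.
reference_lengths = [10.0, 20.0, 40.0]
estimated_lengths = [11.0, 19.0, 42.0]

print("mean relative error:", mean_relative_error(reference_lengths, estimated_lengths))
print("RMSE:", rmse(reference_lengths, estimated_lengths))
print("R^2 of linear fit:", linear_fit_r2(reference_lengths, estimated_lengths))
```

A relative error is dimensionless, whereas the RMSE carries the units of the root length itself, so the two should be read on different scales.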