Abstract: Deep learning, represented by Convolutional Neural Networks (CNN), has developed rapidly in recent years owing to its powerful feature learning in computer vision and natural language processing. However, few studies have applied it to hyperspectral remote sensing of soil. This study therefore aims to estimate Soil Organic Matter (SOM) content from hyperspectral images with a small sample dataset, and to investigate the modeling effects of different network structures. A total of 248 red soil samples were collected from northern Fengxin County, Jiangxi Province, China, and their spectral data were captured with a field spectrometer. The original spectral data were resampled at 10 nm intervals after the edge bands of 350-399 nm and 2 451-2 500 nm, which have a low signal-to-noise ratio, were removed. The resulting 205 original spectral bands and their derivative transformations served as the model inputs, and the SOM content as the model output. First, the modeling effects of CNN were compared with those of Multilayer Perceptron (MLP), Random Forest (RF), and Support Vector Machine (SVM) under different spectral pretreatments. Five CNN structures were then established: the earliest LeNet-5, AlexNet-8 with large convolutional kernels, VGGNet-7 with small convolutional kernels, GoogLeNet-7 with the inception structure, and ResNet-13 with residual learning; the modeling effects of the VGGNet model were further examined at five depths. Second, dropout and early stopping were used in all models to prevent overfitting, and the models were evaluated by three indicators: the coefficient of determination (R²), Root Mean Square Error (RMSE), and Ratio of Performance to Deviation (RPD). Finally, the black box of the CNN model was interpreted.
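The spectral preparation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the raw spectrum is sampled every 1 nm from 350 to 2 500 nm, that "resampling" means averaging non-overlapping 10 nm windows (which happens to yield 205 bands), and that the derivative transformation is a first derivative. The function names `preprocess_spectrum` and `first_derivative` are hypothetical.

```python
import numpy as np

def preprocess_spectrum(reflectance, first_nm=350, lo_nm=400, hi_nm=2450, step=10):
    """Drop the noisy edge bands, then average into `step`-nm bands.

    Assumption: `reflectance` has one value per nanometre starting at
    `first_nm`; the last axis is the wavelength axis.
    """
    reflectance = np.asarray(reflectance, dtype=float)
    wavelengths = first_nm + np.arange(reflectance.shape[-1])

    # Keep 400-2450 nm, i.e. remove 350-399 nm and 2451-2500 nm.
    keep = (wavelengths >= lo_nm) & (wavelengths <= hi_nm)
    cropped = reflectance[..., keep]
    kept_wavelengths = wavelengths[keep]

    # Average non-overlapping windows; an incomplete last window is dropped.
    n_bands = cropped.shape[-1] // step
    trimmed = cropped[..., : n_bands * step]
    resampled = trimmed.reshape(*trimmed.shape[:-1], n_bands, step).mean(axis=-1)
    centers = kept_wavelengths[: n_bands * step].reshape(n_bands, step).mean(axis=-1)
    return centers, resampled

def first_derivative(resampled, step=10):
    """Finite-difference first derivative along the band axis (assumed transform)."""
    return np.gradient(resampled, step, axis=-1)

# Usage with synthetic spectra standing in for the 248 soil samples.
raw = np.random.default_rng(0).random((248, 2151))
centers, bands = preprocess_spectrum(raw)
derivative = first_derivative(bands)
```

Under these assumptions each sample becomes a vector of 205 band averages plus a derivative vector of the same length, which is the input shape a one-dimensional CNN would consume.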
The results showed that: 1) Owing to the strong feature learning capability of CNN, the RPD of every CNN model on the validation set exceeded 2.5 with the original spectral data as input, indicating excellent prediction capability and making CNN a better way to predict SOM content from hyperspectral images. 2) Among the network structures, the best models were obtained with LeNet-5 and VGGNet-7, whose hyperparameters specify small convolutional kernels, strides, and pooling sizes, even though the later GoogLeNet-7 and ResNet-13 both incorporate special structures. The setting of some hyperparameters in a CNN model can therefore be more critical than the network structure. Across the different depths, the model became unstable and prone to overfitting as the network depth increased, and the shallow CNN structures performed better than the deep ones. 3) The optimal model was obtained with the VGGNet-7 network structure, which showed excellent estimation power: R² was 0.895 and RMSE was 4.145 g/kg on the training set, while R² was 0.901, RMSE was 4.647 g/kg, and RPD was 3.291 on the validation set. 4) The wavelengths of 680, 1 360, 1 390, 1 920, and 2 310 nm and their vicinities, extracted during the establishment of the VGGNet-7 model, were important for SOM. Because it requires only simple spectral preprocessing and is feasible with small samples, CNN has broad application prospects in soil hyperspectral remote sensing. VGGNet-7 can therefore be applied in the red soil area for rapid and accurate estimation of SOM content using hyperspectral data.
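For readers unfamiliar with the three evaluation indicators, the sketch below shows one common way to compute them. It is illustrative only: the abstract does not give the exact formulas, so the RPD here is assumed to be the sample standard deviation of the observed values divided by the RMSE, and the function name `regression_metrics` is hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, and RPD for observed and predicted SOM values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred

    # Root mean square error, in the same unit as the target (g/kg here).
    rmse = np.sqrt(np.mean(residuals ** 2))

    # Coefficient of determination: 1 - residual sum of squares / total sum of squares.
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # RPD: spread of the observations relative to the prediction error
    # (assumed convention: sample standard deviation, ddof=1).
    rpd = np.std(y_true, ddof=1) / rmse
    return {"R2": r2, "RMSE": rmse, "RPD": rpd}

# Usage with made-up SOM values in g/kg (not data from the study).
metrics = regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])
```

Under this definition, an RPD above 2.5 means the prediction error is less than 40% of the natural spread of the observed SOM values, which is the threshold the results use for excellent prediction capability.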