Abstract: Soil organic matter (SOM) improves the physical, chemical and biological properties of soil through a variety of functions. It plays an important role in soil function and quality, and it helps prevent greenhouse gas emissions in the global carbon cycle. The spectral characteristics of SOM depend mainly on soil type and on the soil's physical and chemical properties. Previous models built on hyperspectral reflectance or spectral absorption characteristics alone often predicted SOM with low accuracy, mainly because they relied on a single type of input variable. Hyperspectral data are voluminous, redundant and overlapping, so selecting specific characteristic variables can reduce the high collinearity between spectral bands and improve both the accuracy and the speed of the prediction model. Spectral indices are used to minimize the influence of individual wavelengths on the iterative calculation. Furthermore, topography strongly influences the surface microclimate, the movement of water on the surface and in the soil, and the redistribution of material. In this study, taking Hailun City, Heilongjiang Province, as the research area, a random forest (RF) model for SOM prediction was established for different soil types in order to improve the accuracy of hyperspectral SOM modeling. The characteristic bands were selected by Competitive Adaptive Reweighted Sampling (CARS), and Digital Elevation Model (DEM) data and spectral indices were also used as data sources. The results showed that: 1) In CARS screening, the characteristic bands of each soil type were compressed to less than 16% of the total number of wavelengths, which greatly reduced the dimensionality of the soil hyperspectral variables and the computational complexity, thereby improving the prediction ability of the calibration model.
CARS was suitable for extracting the key characteristic wavelength variables and for further optimizing the model structure. 2) The three types of input variables extracted in the grouping experiment were then used to predict SOM for the different soil types. After grouping, the SOM prediction accuracy depended mainly on soil type. Specifically, the highest prediction accuracy, 0.768, was achieved for boggy soil, with a ratio of performance to interquartile distance (RPIQ) of 3.568. Black soil was the second most accurate. Meadow soil had the lowest prediction accuracy, only 0.674, with an RPIQ of 1.848. The RPIQ of all three soil types was above 1.8, indicating good prediction ability of the model. 3) Local regression was conducted to improve the prediction accuracy of SOM, and it gave the best prediction accuracy. The adjusted coefficient of determination, RMSE and RPIQ of the validation set were 0.777, 0.581%, and 2.689, respectively, indicating that the model was highly stable. The proposed predictors enable rapid RF-based prediction of SOM and simplify the traditional, more complex procedure. The findings can provide a promising basis for selecting input variables for predicting SOM of different soil types in different regions.
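
The three validation statistics reported above can be illustrated with a minimal sketch. It assumes the conventional definitions: RMSE as the root mean squared residual, adjusted R² computed from the sample size and the number of predictors, and RPIQ as the interquartile range of the observed values divided by RMSE. The function name `validation_metrics`, its arguments, and the sample values are hypothetical and are not taken from the study.

```python
import numpy as np


def validation_metrics(y_true, y_pred, n_predictors):
    """Return RMSE, adjusted R^2 and RPIQ for a validation set.

    Assumed definitions (not confirmed by the source):
      RMSE   = sqrt(mean((y_true - y_pred)^2))
      adj R2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)
      RPIQ   = (Q3 - Q1) of observed values / RMSE
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size

    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)                 # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares

    rmse = np.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

    # Interquartile range of the observed SOM values
    q1, q3 = np.percentile(y_true, [25, 75])
    rpiq = (q3 - q1) / rmse

    return {"rmse": rmse, "adj_r2": adj_r2, "rpiq": rpiq}


# Hypothetical observed and predicted SOM contents (%), one predictor
metrics = validation_metrics(
    y_true=[1.0, 2.0, 3.0, 4.0, 5.0],
    y_pred=[1.1, 1.9, 3.2, 3.8, 5.0],
    n_predictors=1,
)
print(metrics)
```

Because RPIQ scales the error by the interquartile range rather than the standard deviation, it is less sensitive to skewed SOM distributions, which is one common reason for reporting it alongside RMSE.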