Abstract: Zinc (Zn) is one of the common heavy metal pollutants in open-pit coal mines. Heavy metals in tailings accumulate around mining areas and pose a serious threat to farmland and urban ecosystems. This environmental issue is especially urgent for China, the world's largest producer and consumer of coal. In particular, coal is one of the major fossil fuels, accounting for more than 70% of the total energy. Worse, heavy metals can accumulate through the food chain and harm public health. Specifically, long-term exposure to Zn pollution can lead to gastrointestinal discomfort, nausea, vomiting, abdominal pain, and respiratory symptoms. In addition, exposure to high doses of Zn may also affect cholesterol balance and fertility in agricultural production. Determining the spatial distribution of soil heavy metals is a promising and practical basis for soil pollution evaluation and control, land reclamation, and soil remediation. Most previous work has focused on the potential risks of heavy metal pollution around open-pit coal mines. However, rapid methods for detecting the distribution of soil elements, which are needed to treat heavy metal pollution properly, are still lacking. Traditional examination is also time-consuming, laborious, and prone to causing secondary pollution. In recent years, hyperspectral remote sensing has offered a new way to estimate the concentrations of soil heavy metals. This study aims to invert soil Zn contents in an opencast coal mine from hyperspectral data using Random Forest (RF) and Continuous Wavelet Transform (CWT). A total of 111 soil samples were collected, and their Zn concentrations and spectral reflectance were measured. Savitzky-Golay (SG) smoothing, Continuum Removal (CR), and CWT were used to reduce noise in the spectral response characteristics of the soil samples.
Moreover, the characteristic bands were selected with the Boruta algorithm. Partial Least Squares Regression (PLSR) and RF were used to construct inversion models of soil Zn contents. Furthermore, leave-one-out cross-validation was carried out to examine the robustness of the optimal model. Finally, the spatial distribution of Zn concentrations was mapped by geographical interpolation using the optimal models. The results indicated that: 1) The CWT significantly enhanced the spectral responses and reduced the noise of the spectral data. 2) The Boruta algorithm retrieved the characteristic bands accurately and removed redundant spectral information. The number of characteristic bands varied with the CWT decomposition scale and the spectral measurement conditions. 3) The RF estimated the soil Zn contents better than the PLSR, and the CWT combined with RF achieved higher accuracy than the other combinations. In particular, the accuracy of the optimal field in-situ spectral inversion model (determination coefficients of 0.92 and 0.54 for the calibration and validation datasets, respectively) was lower than that of the optimal laboratory spectral inversion model (0.95 and 0.72, respectively). 4) The spatial distribution of soil Zn was significantly heterogeneous. Specifically, the high values were concentrated in the southwest and northeast of the study area. These findings can provide a strong reference for detecting soil heavy metal concentrations with hyperspectral remote sensing, particularly for soil pollution evaluation, land reclamation, and soil remediation in mining areas.
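
The validation step described above compares a calibration determination coefficient with one obtained by leave-one-out cross-validation. The following is a minimal sketch of that comparison, not the authors' implementation: the data are synthetic, a one-feature least-squares line stands in for the PLSR and RF models, and the names `fit_line`, `r_squared`, `loocv_predictions`, `feature`, and `zn` are hypothetical. Only the sample count of 111 is taken from the text.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x (a stand-in for PLSR/RF)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def loocv_predictions(x, y):
    """Predict each sample from a model fitted on all the other samples."""
    preds = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        preds.append(intercept + slope * x[i])
    return preds


# Synthetic data (assumption): one spectral feature per sample and a noisy
# linear relation to Zn content. These are not measurements from the study.
rng = random.Random(0)
n_samples = 111
feature = [rng.uniform(0.0, 1.0) for _ in range(n_samples)]
zn = [50.0 + 80.0 * f + rng.gauss(0.0, 10.0) for f in feature]

# Calibration R^2: the model is fitted and evaluated on the same samples.
intercept, slope = fit_line(feature, zn)
r2_cal = r_squared(zn, [intercept + slope * f for f in feature])

# Cross-validated R^2: every prediction comes from a model that never saw
# the sample being predicted.
r2_cv = r_squared(zn, loocv_predictions(feature, zn))

print(f"calibration R^2 = {r2_cal:.3f}, leave-one-out R^2 = {r2_cv:.3f}")
```

For a least-squares fit, the leave-one-out value cannot exceed the calibration value, which is why a large gap between the two (as in the field model, 0.92 versus 0.54) is usually read as a sign of overfitting or of noisier measurement conditions.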