COMPUTERS AND ELECTRONICS IN AGRICULTURE, vol.219, 2024 (SCI-Expanded)
Contamination of agricultural soils with trace metals is a concern because it poses potential long-term threats to water resources, aquatic species, and human health. Fast, accurate, and reliable methods are therefore needed to monitor the trace metal content of agricultural soils. This study compared the performance of different machine learning models (Artificial Neural Network, ANN; Deep Neural Network, DNN; Random Forest, RF; K-Nearest Neighbors, KNN; and Adaptive Boosting, AB) in estimating the heavy metal (Cu, Fe, Mn, and Zn) contents of soils on which intensive paddy farming has been practiced for years. Model stability was also investigated. Based on correlation analysis, selected soil physicochemical parameters (EC, pH, Na, K, N) and soil depth were used as covariates to improve estimation accuracy for soil heavy metals. Model performance was assessed with the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). Scatter plots, box plots, and Taylor diagrams were used for graphical comparison of model performance. With greater R² and lower RMSE values than the other models, the RF model gave the most accurate estimates for Cu (RMSE = 1.11 ppm, R² = 0.90), Fe (RMSE = 25.40 ppm, R² = 0.67), and Mn (RMSE = 9.05 ppm, R² = 0.59), and the ANN model for Zn (RMSE = 0.35 ppm, R² = 0.49). In addition, the AB model yielded more stable estimates for Cu contents, and the ANN models for the other heavy metals. The smallest change in RMSE between the training and testing datasets was 2.5% (AB) for Cu, 10.38% (ANN) for Fe, 21.35% (ANN) for Mn, and 6.79% (ANN) for Zn. Overfitting was observed in the RF model. Moreover, sensitivity analysis of the best and most stable models showed that EC, pH, and N in particular had a significant impact on the accumulation of Zn, Cu, Mn, and Fe in soils.
The better performance of the ANN models resulted from their better modeling of the complex nonlinear relationships between the heavy metal contents of soils and the covariates. Based on the present findings, it was concluded that artificial intelligence-based methods can be used reliably and successfully to predict the trace metal content of paddy fields.
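
The evaluation metrics named in the abstract (R², MAE, RMSE) and the train-to-test change in RMSE used to judge model stability can be illustrated with a short sketch. This is not the authors' code. The function names and the toy values are hypothetical, R² is taken here as 1 - SS_res/SS_tot, and the relative RMSE change is assumed to be |RMSE_test - RMSE_train| / RMSE_train × 100, since the abstract does not state either formula.

```python
import math


def mae(observed, predicted):
    """Mean absolute error, in the units of the data (e.g. ppm)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, in the units of the data (e.g. ppm)."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r2(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse_change_percent(rmse_train, rmse_test):
    """Relative train-to-test change in RMSE (assumed definition), in percent.

    A small value suggests a stable model; a large one hints at overfitting.
    """
    return abs(rmse_test - rmse_train) / rmse_train * 100.0


# Hypothetical observed and predicted metal contents (ppm), not study data.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print("MAE :", mae(observed, predicted))
print("RMSE:", rmse(observed, predicted))
print("R2  :", r2(observed, predicted))
# Hypothetical train/test RMSE pair for a stability check.
print("RMSE change (%):", rmse_change_percent(1.0, 1.025))
```

Under this assumed definition, a model whose test RMSE is far larger than its training RMSE shows a large percentage change, which is the pattern the abstract describes as overfitting.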