Asadollah, S. Babak H.S.; Jódar-Abellán, Antonio; Pardo Picazo, Miguel Ángel
Optimizing machine learning for agricultural productivity: A novel approach with RScv and remote sensing data over Europe
Agricultural Systems. 2024, 218: 103955. https://doi.org/10.1016/j.agsy.2024.103955
URI: http://hdl.handle.net/10045/142523
DOI: 10.1016/j.agsy.2024.103955
ISSN: 0308-521X (Print)
Abstract: CONTEXT: Accurate estimation of crop yield is crucial for developing effective global food security strategies, which can reduce hunger and support more sustainable development. However, predicting crop yields is a complex task, as it requires frequent monitoring of many weather and socio-economic factors over an extended period. Satellite remote sensing products have become a reliable source of climate-based variables. They are easier to obtain and provide detailed spatial and temporal coverage. OBJECTIVE: The aim of this study is to assess the effectiveness of applying a novel optimization algorithm, called Randomized Search cross validation (RScv), to various machine learning algorithms and to measure the resulting improvement in prediction accuracy. METHODS: Annual yields of four crops (Barley, Oats, Rye, and Wheat) were predicted across 20 European countries for 20 years (2000–2019). Two NASA missions, namely GPCP and GLDAS satellites, provided climate- and soil-based input variables. These variables served as inputs to four ensemble Machine Learning (ML) algorithms (Ada-Boost (AB), Gradient Boost (GB), Random Forest (RF) and Extra Tree (ET)), which are faster and more adoptable than classic AI algorithms. RESULTS AND CONCLUSIONS: The main results show that applying RScv improves the prediction ability of all ML models for the four crops. In particular, RScv-AB reaches the highest overall accuracy for predicting yields (R²max = 0.9). Spatial evaluation of the prediction errors shows that the proposed models tended toward underestimation.
An uncertainty analysis was also carried out, which shows that applying the ML algorithms yields higher uncertainty for Barley and lower uncertainty for Wheat. SIGNIFICANCE: Considering the robustness of the optimised ML models and the global coverage of remote sensing data, our current methodology demonstrates great transferability and can be applied in other regions across the globe and over longer time periods. In addition, this tool could help decision makers in various sectors to improve water allocation, deal with climate change effects and maintain sustainable agricultural development.
Keywords: Crop yield, Remote sensing, Machine learning, Randomized search, Agricultural prediction
Elsevier
info:eu-repo/semantics/article
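
Illustrative note: the abstract describes tuning ML models with randomized search and cross validation but gives no implementation details. The sketch below shows the general idea only, and it is not the authors' code. To stay dependency-free it swaps in a toy k-nearest-neighbour regressor for the ensemble models (AB, GB, RF, ET) and synthetic data for the GPCP/GLDAS inputs. The function names, the search space and the fold count are all assumptions made for illustration. scikit-learn's `RandomizedSearchCV` class implements the same general procedure for real estimators.

```python
import math
import random
import statistics


def knn_predict(train, x, k, power):
    """Predict y at x as a distance-weighted mean of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    # power = 0 gives uniform weights; larger values favour closer neighbours.
    weights = [1.0 / (abs(px - x) ** power + 1e-9) for px, _ in nearest]
    return sum(w * py for w, (_, py) in zip(weights, nearest)) / sum(weights)


def r2_score(y_true, y_pred):
    """Coefficient of determination, the accuracy metric quoted in the abstract."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def cv_score(data, params, n_folds=5):
    """Mean R² over n_folds interleaved folds for one hyperparameter setting."""
    scores = []
    for fold in range(n_folds):
        test = data[fold::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        preds = [knn_predict(train, x, **params) for x, _ in test]
        scores.append(r2_score([y for _, y in test], preds))
    return statistics.fmean(scores)


def randomized_search(data, space, n_iter=15, seed=0):
    """Sample n_iter random settings from space and keep the best CV score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = cv_score(data, params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic stand-in data (hypothetical): one predictor, one noisy response.
data_rng = random.Random(42)
xs = [data_rng.uniform(0.0, 10.0) for _ in range(120)]
data = [(x, 2.0 * x + 5.0 * math.sin(x) + data_rng.gauss(0.0, 0.5)) for x in xs]

# Hypothetical search space for the toy model.
space = {"k": [1, 2, 3, 5, 8, 13, 21], "power": [0, 1, 2]}

best_params, best_score = randomized_search(data, space)
print("best parameters:", best_params)
print("cross-validated R2:", round(best_score, 3))
```

Unlike an exhaustive grid search, the loop evaluates only `n_iter` randomly drawn settings, so the cost is fixed in advance regardless of how large the search space is.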