
Estimating and Understanding Crop Yields with Explainable Deep Learning in the Indian Wheat Belt

Climate change and the growing global population are two of several factors that challenge global food security. Producing more food under increasingly adverse conditions requires the thoughtful application of the sciences. Predicting crop yields, and knowing which variables drive those predictions, gives farmers, policy makers, and scientists a clearer picture of the current state of food production. The authors of this paper apply deep learning to satellite imagery to predict crop yields for districts in the Indian Wheat Belt.

Methodologies Used

To predict crop yields from satellite imagery with an explainable model, the authors use several models to make predictions and regression activation mapping (RAM) to explain one of them. A convolutional neural network (CNN) is coupled with RAM and compared to ridge regression (RR), random forest (RF), and a null model to show that, despite the explainability constraints, the CNN performs well relative to the baseline models. Regression activation mapping takes the inputs at time t and multiplies them by the weights of the CNN at time t. RAM tells us how heavily the model weighs a variable in its prediction: the more activated an input is at time t, the more it drives the prediction.

The time series data fed to these models is composed of 3 vegetation indices (NDVI, NDWI, NIRv) and 7 environmental variables (air temperatures, radiation, precipitation, day length, etc.). The three vegetation indices are slightly different ways of measuring, from satellite imagery, whether vegetation is growing in an area. The data covers 2001-2013 in the Indian Wheat Belt region, and the models are given between 1 and 10 of the input variables.

A model's predictions need to be compared with the ground truth using a goodness-of-fit measure. In this study the authors use the Nash-Sutcliffe efficiency (NSE) coefficient, which was created to assess the predictive power of hydrological models. However, NSE can be used for quantities other than discharge, such as nutrient loadings, temperature, and concentrations.
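To make the NSE concrete, here is a minimal sketch using its standard definition: one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean. The function name `nse` and the yield values are made up for illustration and are not taken from the paper.

```python
def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - sum((o - p)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model's predictions
    residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared error of always predicting the observed mean
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


# Hypothetical yields (illustrative numbers only)
observed = [2.0, 3.0, 4.0, 5.0]

perfect = nse(observed, observed)            # predictions match exactly
mean_only = nse(observed, [3.5] * 4)         # always predict the mean
worse_than_mean = nse(observed, [5.0, 4.0, 3.0, 2.0])

print(perfect, mean_only, worse_than_mean)
```

An NSE of 1 means a perfect fit, 0 means the model is no better than predicting the observed mean, and negative values mean it is worse than that.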

Key Takeaways

The CNN’s performance is similar to that of RR, RF, and the null model, except that it performs slightly better with more input variables and loses very little predictive accuracy in abnormal years. In 2012 wheat production was abnormally high, and RR, RF, and the null model have a significantly lower NSE for 2012 than for all years combined, whereas the CNN’s NSE decreases only slightly. In addition, the RAM of the CNN demonstrates that this model is explainable. The RAM shows that as the growing season progresses, the influence of each input variable on the model’s prediction fluctuates. This means, for example, that the timing of precipitation is important in the same way that the amount of precipitation is.
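The idea of reading a per-time-step contribution off a network can be sketched as follows. This is not the authors' implementation. It assumes a class-activation-mapping style setup in which the last convolutional layer's features are averaged over time and passed to a single linear output. The names `features`, `weights`, and `bias`, and all the numbers, are hypothetical.

```python
# Hypothetical last-layer features: 4 time steps x 2 filters (made-up values)
features = [
    [0.1, 0.4],
    [0.3, 0.2],
    [0.8, 0.1],
    [0.5, 0.6],
]
weights = [2.0, -1.0]  # hypothetical output-layer weights, one per filter
bias = 0.5             # hypothetical output-layer bias

# Activation map: weighted sum of the features at each time step
ram = [sum(f * w for f, w in zip(step, weights)) for step in features]

# Prediction under the assumed architecture: average the features over time,
# then apply the linear output layer
n_steps = len(features)
pooled = [sum(step[k] for step in features) / n_steps for k in range(len(weights))]
prediction = sum(p * w for p, w in zip(pooled, weights)) + bias

# The time step with the largest activation contributes most to the prediction
peak_step = ram.index(max(ram))
print(ram, prediction, peak_step)
```

Under these assumptions the prediction equals the mean of the activation map plus the bias, which is why the map can be read as a breakdown of the prediction across the growing season.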

Citation

Wolanin, A., Mateo-García, G., Camps-Valls, G., Gómez-Chova, L., Meroni, M., Duveiller, G., … & Guanter, L. (2020). Estimating and understanding crop yields with explainable deep learning in the Indian Wheat Belt. Environmental Research Letters, 15(2), 024019. https://iopscience.iop.org/article/10.1088/1748-9326/ab68ac/meta
