Statistical, data-driven methods are considered good alternatives to
process-based models
for the sub-national monitoring of cereal crop yields, since they can flexibly
handle large datasets and can be calibrated simultaneously to different areas.
Here, we assess the influence of several characteristics on the ability of
these methods to forecast cereal yields at the local scale. We look at two
diverse agro-climatic Italian regions and analyze the most relevant types of
cereal crops produced (wheat, barley, maize and rice). Models of different
complexity levels are built for all species by considering six meteorological
and remote sensing indicators as candidate predictive variables. Yield data at
three different spatial aggregation scales were retrieved from a comprehensive,
farm-level dataset over the period 2001–2015. Overall, our results suggest that
summer crops are more predictable than winter crops, irrespective of the model
considered, reflecting a more intricate relationship between the physiology of
winter cereals and weather patterns. At higher spatial resolutions, more
sophisticated modelling techniques resting on feature selection from multiple
indicators outperformed more parsimonious linear models. These gains, however,
vanished as data were further aggregated spatially, with the predictive ability
of all competing models converging at the agricultural district and province
levels. Feature-selection models tended to select more satellite-based than
meteorological indicators, with a preference for temperature indicators in
summer crops, whereas variables describing the water content of the soil/plant
were more often selected in winter crops. The selected features were, in
general, evenly distributed across the crop growing cycle.