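What matters in the new field of machine learning and satellite imagery-based poverty predictions? A review with relevance for potential downstream applications and development research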

Paper title

What matters in the new field of machine learning and satellite imagery-based poverty predictions? A review with relevance for potential downstream applications and development research

Paper authors

Hall, Olan; Dompae, Francis; Wahab, Ibrahim; Dzanku, Fred Mawunyo

Abstract

This paper reviews the state of the art in satellite and machine learning based poverty estimation. In the reviewed studies, the factors most strongly correlated with predictive performance for welfare are the number of pre-processing steps employed, the number of datasets used, the type of welfare indicator targeted, and the choice of AI model. As expected, studies that used hard indicators as targets predicted welfare better than those that targeted soft ones. Also as expected, the number of pre-processing steps and the number of datasets used each have a positive and statistically significant relationship with welfare estimation performance. More importantly, we find that combining machine learning (ML) and deep learning (DL) significantly increases predictive power, by as much as 15 percentage points compared to using either alone. Surprisingly, the spatial resolution of the satellite imagery (SI) used is important but not critical to performance: the relationship is positive but not statistically significant. Equally unexpected, we find no evidence of a statistically significant change in predictive performance over time. These findings have important implications for future research in this domain. For example, the effort and resources devoted to acquiring more expensive, higher-resolution SI will have to be reconsidered, given that medium-resolution imagery seems to achieve similar results. The increasingly popular approach of combining ML, DL, and transfer learning (TL), either concurrently or iteratively, might become a standard way of achieving better results.
