Contact person
Olof Mogren
Senior Researcher
At the RISE Learning Machines Seminar on June 12, 2025, we have the pleasure of listening to Markus Pettersson, Chalmers University of Technology, give his talk: Debiasing AI predictions for causal inference without fresh ground truth data.
This seminar is a collaboration between RISE and Climate AI Nordics – climateainordics.com.
When: June 12, 2025, 15:00 CET
Where: Online via Zoom.
Machine learning models trained on Earth observation data, particularly satellite imagery, have recently shown impressive performance in predicting household-level wealth indices, potentially addressing chronic data scarcity in global development research. While these predictions exhibit strong predictive power, they inherently suffer from shrinkage toward the mean, resulting in attenuated estimates of causal treatment effects and thus limiting their utility in policy evaluations. Existing debiasing methods, such as Prediction-Powered Inference (PPI), require additional fresh ground-truth data at the downstream causal inference stage, severely restricting their applicability in data-poor environments.
In this paper, we introduce and rigorously evaluate two novel correction methods—linear calibration correction and Tweedie's correction—that substantially reduce prediction bias without relying on newly collected labeled data. Our methods operate on out-of-sample predictions from pre-trained models, treating these models as black-box functions. Linear calibration corrects bias through a straightforward linear transformation derived from held-out calibration data, while Tweedie's correction leverages empirical Bayes principles to directly address shrinkage-induced biases by exploiting score functions derived from predicted outcomes.
Through analytical exercises and experiments using Demographic and Health Survey (DHS) data, we demonstrate that both proposed methods outperform existing data-free approaches, achieving significant reductions in attenuation bias and thus providing more accurate, actionable, and policy-relevant estimates. Our approach represents a generalizable, lightweight toolkit that enhances the reliability of causal inference when direct outcome measures are limited or unavailable.
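To make the attenuation problem and the idea behind linear calibration concrete, here is a minimal simulated sketch. It is not the speaker's implementation: the data-generating process, the shrinkage slope `k`, the noise level, and all function names are assumptions chosen for illustration. It also assumes that the correction regresses predictions on ground truth in a held-out calibration set and then inverts that line. Tweedie's correction is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate(n, tau, k=0.6, noise=0.3):
    """Simulate a binary treatment, true outcomes, and shrunken predictions.

    Everything here is hypothetical: the "model" is a stand-in that
    compresses the true outcome with slope k < 1 and adds noise.
    """
    treat = rng.integers(0, 2, size=n)
    y = tau * treat + rng.normal(size=n)
    y_hat = 0.2 + k * y + rng.normal(scale=noise, size=n)
    return treat, y, y_hat


def diff_in_means(outcome, treat):
    """Simple treatment-effect estimate: treated mean minus control mean."""
    return outcome[treat == 1].mean() - outcome[treat == 0].mean()


def fit_linear_calibration(y_true, y_pred):
    """Fit y_pred ~ a + k * y_true on held-out labeled data."""
    k, a = np.polyfit(y_true, y_pred, deg=1)  # returns [slope, intercept]
    return a, k


def apply_linear_calibration(y_pred, a, k):
    """Invert the fitted line to undo the shrinkage in the predictions."""
    return (y_pred - a) / k


tau = 1.0  # true treatment effect in the simulation

# Held-out calibration set: labels available here only.
_, y_cal, y_hat_cal = simulate(5_000, tau)
a, k = fit_linear_calibration(y_cal, y_hat_cal)

# Downstream study: only predictions are used, no fresh ground truth.
treat, _, y_hat = simulate(20_000, tau)
naive = diff_in_means(y_hat, treat)
corrected = diff_in_means(apply_linear_calibration(y_hat, a, k), treat)

print(f"estimated slope k: {k:.3f}")
print(f"naive effect:      {naive:.3f}")
print(f"corrected effect:  {corrected:.3f}")
```

In this toy setup the naive estimate is shrunk by roughly the factor `k`, while dividing by the calibrated slope brings it back toward the simulated `tau`. Real predictors need not shrink linearly, so this illustrates only the simplest case.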
Markus B. Pettersson is a PhD student working at the intersection of machine learning and Earth observation, with a focus on large-scale poverty mapping and its applications in development research. His work explores how satellite imagery and data-driven models can be used to estimate socioeconomic conditions in data-scarce regions, and how these maps can support causal analysis in policy and intervention design.