Cost-effective incentive allocation via structured counterfactual inference

Mar 1, 2020

Romain Lopez

Chenchen Li

Xiang Yan

Junwu Xiong

Michael I. Jordan

Yuan Qi

Le Song


Abstract

We address a practical problem ubiquitous in modern industry, in which a mediator tries to learn a policy for allocating strategic financial incentives to customers in a marketing campaign while observing only bandit feedback. In contrast to traditional policy optimization frameworks, we rely on a specific assumption about the reward structure and we incorporate budget constraints. We develop a new two-step method for solving this constrained counterfactual policy optimization problem. First, we cast the reward estimation problem as a domain adaptation problem with supplementary structure. Second, we use the resulting estimators to optimize the policy under the constraints. We establish theoretical error bounds for our estimation procedure and we show empirically that the approach leads to significant improvement on both synthetic and real datasets.
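
To make the second step of the abstract concrete, here is a minimal sketch of policy optimization under a budget constraint, given reward estimates that are assumed to be available already. It is an illustration only and not the authors' algorithm: the function name `allocate_with_budget`, the array layout, the per-customer average budget, and the use of Lagrangian relaxation with bisection are all assumptions made for this example.

```python
import numpy as np


def allocate_with_budget(reward_hat, costs, budget, n_iter=50):
    """Pick one incentive level per customer under an average-cost budget.

    Hypothetical helper (not from the paper).
    reward_hat: (n_customers, n_levels) estimated rewards, assumed given.
    costs:      (n_levels,) cost of each incentive level.
    budget:     maximum allowed average cost per customer.
    """
    reward_hat = np.asarray(reward_hat, dtype=float)
    costs = np.asarray(costs, dtype=float)
    if costs.min() > budget:
        raise ValueError("even the cheapest incentive level exceeds the budget")

    def best_actions(lam):
        # Greedy choice per customer for the penalized objective r - lam * c.
        return np.argmax(reward_hat - lam * costs, axis=1)

    # If the unconstrained greedy allocation is affordable, use it.
    actions = best_actions(0.0)
    if costs[actions].mean() <= budget:
        return actions

    # Grow the multiplier until the allocation becomes feasible.
    lo, hi = 0.0, 1.0
    while costs[best_actions(hi)].mean() > budget:
        lo, hi = hi, 2.0 * hi

    # Bisection: keep `hi` feasible and `lo` infeasible.
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if costs[best_actions(mid)].mean() > budget:
            lo = mid
        else:
            hi = mid
    return best_actions(hi)


# Toy usage with made-up numbers: 3 customers, 3 incentive levels.
reward_hat = [[0.0, 1.0, 2.0], [0.0, 1.0, 5.0], [0.0, 0.5, 0.6]]
costs = [0.0, 1.0, 2.0]
actions = allocate_with_budget(reward_hat, costs, budget=1.0)
```

The relaxation returns a feasible allocation but not necessarily the optimal one, since the underlying problem is a multiple-choice knapsack; the sketch only shows how a budget constraint changes the allocation compared with taking the highest estimated reward per customer.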

Type

Conference paper

Publication

AAAI Conference on Artificial Intelligence

Last updated on Aug 18, 2025
