Optimal dynamic treatment allocation

30.10.2017 16:45 - 17:45

In a treatment problem, the individuals to be treated often arrive gradually. Initially, when the first treatments are made, little is known about the effect of the treatments, but as more treatments are assigned the policy maker gradually learns about their effects by observing the outcomes. Thus, one faces a tradeoff between exploring the available treatments and exploiting the best treatment, i.e. administering it as often as possible, in order to maximize the cumulative welfare of all the assignments made. Furthermore, a policy maker may not only be interested in the expected effect of a treatment but also in its riskiness. We therefore allow the welfare function to depend on the first and second moments of the outcome distribution of each treatment. We propose a dynamic treatment policy that attains the minimax optimal regret relative to the unknown best treatment in this dynamic setting. We allow the data to arrive in batches, as, say, unemployment programs only start once a month or blood samples are only sent to the laboratory for analysis in batches. Furthermore, we show that minimax optimality does not come at the price of overly aggressive experimentation at the beginning of the treatment period, as we provide upper bounds on the expected number of times any suboptimal treatment is assigned. Next, we consider the case where the outcome of a treatment is only observed with delay, as it may take time for the treatment to work. Thus, a doctor faces a tradeoff between obtaining imprecise information quickly by measuring the outcome soon after the treatment is given and obtaining more precise information later, at the expense of less information being available for the individuals treated in the meantime.
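To make the explore-exploit tradeoff and the batched arrival of data concrete, here is a minimal sketch of a generic upper-confidence-bound allocation rule that assigns treatments in batches and only observes outcomes once a batch is complete. This is an illustration only, not the minimax-optimal policy discussed in the talk; the number of treatments, the batch size, the horizon, and the Gaussian outcome distributions are all hypothetical choices made for the example.

```python
# Illustrative UCB-style batched treatment allocation (not the speaker's policy).
import math
import random

K = 3                        # number of available treatments (assumed)
BATCH_SIZE = 10              # outcomes revealed only after each batch (assumed)
HORIZON = 500                # total number of individuals to treat (assumed)
true_means = [0.2, 0.5, 0.4] # unknown to the policy maker; used only to simulate outcomes

counts = [0] * K             # how often each treatment has been assigned
sums = [0.0] * K             # running sum of observed outcomes per treatment

def choose_treatment(t):
    """Pick the treatment with the highest upper confidence bound."""
    for k in range(K):
        if counts[k] == 0:
            return k         # try every treatment at least once
    ucb = [sums[k] / counts[k] + math.sqrt(2.0 * math.log(t + 1) / counts[k])
           for k in range(K)]
    return max(range(K), key=lambda k: ucb[k])

t = 0
while t < HORIZON:
    # Assign a whole batch before any of its outcomes are revealed,
    # mimicking data that arrives in batches (e.g. monthly program starts).
    batch = [choose_treatment(t + i) for i in range(min(BATCH_SIZE, HORIZON - t))]
    for k in batch:          # outcomes become available only after the batch
        outcome = random.gauss(true_means[k], 1.0)
        counts[k] += 1
        sums[k] += outcome
    t += len(batch)

print("assignments per treatment:", counts)
```

Because estimates are not updated within a batch, the rule commits to a whole batch of assignments on stale information, which is exactly the restriction imposed when data arrive in batches.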

Homepage of Anders Bredahl Kock

Location:
HS 7 OMP1