In-class exercise
Designate group members for each subtask (e.g., Task 1, or Task 2 question 1) so that everyone is prepared to take the lead on at least one subtask.
Task 1: Heckman selection formula (30min)
Think of the transformation of the selection equation
into one expressing selection in terms of propensity scores. Be explicit about what you condition on when expressing the selection rule. Then explain intuitively or formally why assuming normally distributed unobservables in
leads to the Heckman selection model, with the inverse Mills ratio controlling for selection bias.
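As a numerical sanity check on the normality argument, the following is a minimal sketch rather than the notebook's own code. It assumes a standard normal selection unobservable `V`, treatment whenever `V > -c`, and a propensity score `p = Phi(c)`; the names `V`, `c` and `p` are illustrative assumptions, not the sheet's notation. Under these assumptions the truncated mean `E[V | V > -c]` equals `phi(c) / Phi(c)`, the inverse Mills ratio, and the code compares that closed form with a midpoint-rule approximation.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1 (assumed distribution of V)


def inverse_mills(c: float) -> float:
    """phi(c) / Phi(c), i.e. E[V | V > -c] for V ~ N(0, 1)."""
    return STD_NORMAL.pdf(c) / STD_NORMAL.cdf(c)


def selection_control(p: float) -> float:
    """The same quantity written in terms of the propensity score p = Phi(c)."""
    return inverse_mills(STD_NORMAL.inv_cdf(p))


def truncated_mean_numeric(c: float, upper: float = 10.0, steps: int = 20_000) -> float:
    """Midpoint-rule approximation of E[V | V > -c]; the tail above `upper` is ignored."""
    lower = -c
    width = (upper - lower) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        v = lower + (i + 0.5) * width
        weight = STD_NORMAL.pdf(v) * width
        numerator += v * weight
        denominator += weight
    return numerator / denominator


# Hypothetical thresholds, chosen only for illustration.
for c in (-1.0, 0.0, 1.0):
    print(c, inverse_mills(c), truncated_mean_numeric(c))
```

Writing the control as a function of `p` alone is what makes the link to the propensity-score form of the selection rule: once normality is assumed, the selection correction depends on the conditioning variables only through `p`.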
Task 2: Graphical representation of selection and outcome equations (70min)
2.1 Policy-relevant treatment effect (PRTE)
Assume is an actual policy. Explain the PRTE for moving from to in the graph.
Hint: You can read the answer almost immediately off the graph.
Now assume there is another policy, denoted by , which increases treatment take-up to 40%. Calculate the PRTE for moving from to for both the Heckman and the linear selection models. Explain the difference between the two values. Do you think it is large or small?
Hint: This is the case of an “Additive PRTE”.
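One way to organise the calculation is sketched below, under assumptions that are not taken from the notebook. If everyone faces the same take-up rate under each policy and treatment is taken whenever the unobserved resistance `u` is at most that rate, the PRTE per newly treated person is the average of the MTE over `u` between the old and the new take-up rate. The MTE parameters and the 30% baseline take-up are hypothetical placeholders; only the 40% target comes from the exercise.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def mte_linear(u: float, intercept: float = 0.8, slope: float = -1.0) -> float:
    """Hypothetical linear MTE(u) = intercept + slope * u."""
    return intercept + slope * u


def mte_heckman(u: float, intercept: float = 0.3, slope: float = -0.4) -> float:
    """Hypothetical normal-selection MTE(u) = intercept + slope * Phi^{-1}(u)."""
    return intercept + slope * STD_NORMAL.inv_cdf(u)


def prte(mte, p_from: float, p_to: float, steps: int = 10_000) -> float:
    """Average of MTE(u) over u in (p_from, p_to), using the midpoint rule."""
    width = (p_to - p_from) / steps
    total = sum(mte(p_from + (i + 0.5) * width) for i in range(steps))
    return total * width / (p_to - p_from)


P_BASELINE = 0.30  # hypothetical baseline take-up, NOT the notebook's value
P_TARGET = 0.40    # take-up under the new policy, as stated in the exercise

print("linear :", prte(mte_linear, P_BASELINE, P_TARGET))
print("heckman:", prte(mte_heckman, P_BASELINE, P_TARGET))
```

Replacing the placeholder parameters with the notebook's fitted values shows how much of the gap between the two PRTEs comes purely from the functional form imposed on the MTE over the extrapolated range.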
Try to find one example each where using the Roy model for such an exercise (observing data on and with all assumptions on the instrument being fulfilled and extrapolating to ) seems reasonable and where it seems unreasonable. Explain your reasoning.
Be specific about the extrapolation problem itself — i.e., whether it is credible that the MTE for the newly induced compliers is well-approximated by the functional form of the selection model. Do not lean on violations of standard IV/exogeneity assumptions (SUTVA, exclusion, monotonicity) — assume those hold.
2.2 Selection on levels
Sketch how the average selection bias (see Table 1 in Mogstad et al.) is represented in the graphs when you only observe data with (i.e., forget about ).
Calculate it for the two Roy models in the notebook. Explain the difference between the two values. Do you think it is large or small?
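The calculation can be sketched as follows, assuming the average selection bias is the difference in mean untreated outcomes between treated and untreated, `E[Y0 | D=1] - E[Y0 | D=0]`, and that treatment is taken whenever the unobserved resistance `u` is at most the take-up rate `p`. Under those assumptions each group mean is an average of the untreated marginal response curve `m0(u)` over the relevant range of `u`. The curve parameters and `p = 0.3` are hypothetical, not the notebook's values.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def m0_linear(u: float, intercept: float = 1.0, slope: float = -0.6) -> float:
    """Hypothetical linear curve for E[Y0 | U = u]."""
    return intercept + slope * u


def m0_normal(u: float, intercept: float = 0.7, slope: float = -0.3) -> float:
    """Hypothetical normal-selection (Heckman-type) curve for E[Y0 | U = u]."""
    return intercept + slope * STD_NORMAL.inv_cdf(u)


def average_over(f, lo: float, hi: float, steps: int = 50_000) -> float:
    """Midpoint-rule average of f(u) for u between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) / steps


def selection_on_levels(m0, p: float) -> float:
    """E[Y0 | D=1] - E[Y0 | D=0] when D = 1 exactly for u <= p."""
    treated = average_over(m0, 0.0, p)    # low-resistance individuals
    untreated = average_over(m0, p, 1.0)  # high-resistance individuals
    return treated - untreated


P_OBSERVED = 0.30  # hypothetical take-up rate

print("linear:", selection_on_levels(m0_linear, P_OBSERVED))
print("normal:", selection_on_levels(m0_normal, P_OBSERVED))
```

In the graph, the two averages correspond to the mean height of the `m0` curve to the left and to the right of `p`. For a linear curve the difference reduces to minus half the slope, whatever the value of `p`.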
2.3 Selection on gains
Sketch how the average selection on the gain (see Table 1 in Mogstad et al.) is represented in the graphs when you only observe data with (i.e., forget about ).
Calculate it for the two Roy models in the notebook. Explain the difference between the two values. Do you think it is large or small?
Note: You may find it helpful to look at the LaLonde (1986) formulation as written down in the very beginning of Section 7 (Equivalence Failures) of Kline and Walters, which rules out selection on gains.
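A minimal sketch of the calculation, assuming the average selection on the gain is `E[Y1 - Y0 | D=1] - E[Y1 - Y0 | D=0]` and that treatment is taken whenever the unobserved resistance `u` is at most the take-up rate `p`. All curves and `p = 0.3` are hypothetical. The first example uses parallel marginal response curves, so the treatment effect is constant and selection on gains vanishes; the second lets the treated curve fall faster in `u`, so low-resistance individuals gain more.

```python
def average_over(f, lo: float, hi: float, steps: int = 10_000) -> float:
    """Midpoint-rule average of f(u) for u between lo and hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) / steps


def selection_on_gains(m0, m1, p: float) -> float:
    """Mean of MTE(u) = m1(u) - m0(u) among treated (u <= p) minus untreated (u > p)."""
    def mte(u: float) -> float:
        return m1(u) - m0(u)
    return average_over(mte, 0.0, p) - average_over(mte, p, 1.0)


# Hypothetical linear marginal response curves.
def m0(u: float) -> float:
    return 1.0 - 0.6 * u


def m1_parallel(u: float) -> float:
    return 1.5 - 0.6 * u  # same slope as m0: flat MTE, constant effect


def m1_steeper(u: float) -> float:
    return 1.9 - 1.4 * u  # MTE(u) = 0.9 - 0.8 * u, decreasing in u


P_OBSERVED = 0.30  # hypothetical take-up rate

print("parallel curves:", selection_on_gains(m0, m1_parallel, P_OBSERVED))
print("steeper m1     :", selection_on_gains(m0, m1_steeper, P_OBSERVED))
```

Graphically, selection on gains is the difference between the average vertical distance between the two curves to the left of `p` and to the right of `p`; with parallel curves that distance is the same everywhere.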
2.4 Your own example
Come up with an example (binary policy and binary treatment) where you would find the Roy model useful. What would you expect regarding selection on levels and on gains? Add a graph with a parametrisation of the Heckit or the linear function, which would imply this type of selection for some value of the instrument.