In-class exercise - Applied Microeconomics

Reference: key objects from Liu–Mogstad–Salvanes (2025)

Keep this section open while you work — the tasks refer back to it, so you should not need to flip to the PDF during group work.

Table 1: quantities of interest (complier moments)

Panel A — life-cycle potential outcomes. Identified by the Imbens–Rubin argument alone:

| Quantity | Estimand | $h(\mathbf{Y}_i(d))$ |
|---|---|---|
| Mean earnings | $E(Y_{it}(d) \mid D_i(1) > D_i(0))$ | $Y_{it}(d)$ |
| Variance of earnings | $\mathrm{var}(Y_{it}(d) \mid D_i(1) > D_i(0))$ | $(\tilde Y_{it}(d))^2$ |
| Mean employment | $E(H_{it}(d) \mid D_i(1) > D_i(0))$ | $H_{it}(d)$ |
| Autocovariance of employment | $E(H_{it}(d) H_{it-k}(d) \mid D_i(1) > D_i(0))$ | $H_{it}(d) H_{it-k}(d)$ |

Panel B — moments of potential log outcomes. Need the employment-selection correction on top of the Imbens–Rubin argument:

| Quantity | Estimand | $h(\mathbf{Y}_i(d))$ |
|---|---|---|
| Mean log earnings | $E(\log Y_{it}(d) \mid D_i(1) > D_i(0),\, H_{it}(d) = 1)$ | $\log Y_{it}(d)$ |
| Autocovariance of log earnings | $\mathrm{cov}(\log Y_{it}(d), \log Y_{it-k}(d) \mid D_i(1) > D_i(0),\, H_{it}(d) = H_{it-k}(d) = 1)$ | $\widetilde{\log Y}_{it}(d)\, \widetilde{\log Y}_{it-k}(d)$ |

(A tilde, as in $\tilde{Y}$, denotes the deviation of a variable from its mean.)

The parametric system (Section 4.1)

Earnings process — eq. (11):

$$
\log Y_{it}^*(d) = \underbrace{m_t(d)}_{\text{common life-cycle component}} + \underbrace{u_{it}(d) + v_{it}(d)}_{\text{idiosyncratic component}} \tag{1}
$$

with a permanent (random-walk) component and a transitory MA($q$) component:

$$
u_{it}(d) = u_{it-1}(d) + \phi_{r(d)}\, \alpha_i + \zeta_{it}(d), \qquad u_{i0}(d) = \phi_{0(d)}\, \alpha_i \tag{2}
$$

$$
v_{it}(d) = \sum_{j=0}^{q} \theta_{j(d)}\, \xi_{it-j}(d), \qquad \theta_{0(d)} = 1 \tag{3}
$$

$$
E(\zeta_{it}(d)) = E(\xi_{it}(d)) = E(\alpha_i) = 0, \qquad \mathrm{var}(\zeta_{it}(d)) = \sigma^2_{\zeta(d)}, \quad \mathrm{var}(\xi_{it}(d)) = \sigma^2_{\xi(d)} \tag{4}
$$

$\alpha_i$ is pre-determined, education-independent latent ability. $\phi_{0(d)}$ and $\phi_{r(d)}$ are its loadings on the initial level and on the growth rate of latent earnings — both education-specific, so education can make earnings more or less ability-dependent.

Employment — eq. (12):

$$
H_{it}(d) = \mathbb{I}\{g_{(d)}(t) + \phi_{h(d)}\, \alpha_i + \epsilon_{it}(d) \geq 0\} \tag{5}
$$

$g_{(d)}(t)$ is a polynomial in age (with a constant); $\phi_{h(d)}$ loads latent ability onto employment.

Measurement (ability test score) — eq. (13):

$$
M_i(d) = \alpha_{m(d)} + \phi_{m(d)}\, \alpha_i + \varepsilon_{m,i} \tag{6}
$$

$M_i(d)$ is the potential log IQ score and $\varepsilon_{m,i}$ is measurement error. The latent factor is fixed in scale and location by $\phi_{m(0)} = 1$ and $E(\alpha_i) = 0$.
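To see how eqs. (1)–(6) fit together, the system can be simulated. The following is a minimal sketch, not the LMS implementation: every parameter value, the normal distributions for $\alpha_i$, the shocks and $\epsilon_{it}(d)$, the quadratic forms chosen for $m_t(d)$ and $g_{(d)}(t)$, the geometric MA weights, and the function name `simulate_system` are assumptions made purely for illustration.

```python
import numpy as np

def simulate_system(d, n=5000, T=30, q=1, seed=0):
    """Simulate eqs. (1)-(6) for one education level d in {0, 1}.

    All numbers below are hypothetical illustration values, not LMS estimates.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical education-specific parameters, indexed by d.
    phi_0 = (0.5, 0.8)[d]      # ability loading on the initial level
    phi_r = (0.01, 0.03)[d]    # ability loading on growth
    phi_h = (0.6, 0.3)[d]      # ability loading on employment
    phi_m = (1.0, 1.2)[d]      # measurement loading; phi_m(0) = 1 fixes the scale
    alpha_m = (0.0, 0.1)[d]    # measurement intercept
    sd_zeta, sd_xi = 0.1, 0.2
    theta = 0.3 ** np.arange(q + 1)   # assumed MA weights, with theta_0 = 1

    t = np.arange(1, T + 1)
    alpha = rng.normal(0.0, 1.0, n)   # latent ability, E(alpha) = 0 (normality assumed)

    # Eq. (1): common life-cycle component m_t(d), assumed quadratic in t.
    m = 10.0 + 0.1 * d + 0.03 * t - 0.0005 * t**2

    # Eq. (2): u_t = u_{t-1} + phi_r * alpha + zeta_t, with u_0 = phi_0 * alpha.
    zeta = rng.normal(0.0, sd_zeta, (n, T))
    u = phi_0 * alpha[:, None] + np.cumsum(phi_r * alpha[:, None] + zeta, axis=1)

    # Eq. (3): v_t = sum_j theta_j * xi_{t-j}; column t-1+q of xi holds xi_t.
    xi = rng.normal(0.0, sd_xi, (n, T + q))
    v = sum(theta[j] * xi[:, q - j : q - j + T] for j in range(q + 1))

    log_y_star = m + u + v   # latent log earnings

    # Eq. (5): employment indicator; eps is assumed standard normal.
    g = 1.0 + 0.3 * d + 0.02 * t - 0.001 * t**2
    eps = rng.normal(0.0, 1.0, (n, T))
    H = (g + phi_h * alpha[:, None] + eps >= 0).astype(int)

    # Eq. (6): test score with measurement error (assumed normal).
    M = alpha_m + phi_m * alpha + rng.normal(0.0, 0.5, n)

    # Log earnings are observed only for the employed.
    log_y_obs = np.where(H == 1, log_y_star, np.nan)
    return {"alpha": alpha, "log_y_star": log_y_star, "H": H, "M": M,
            "log_y_obs": log_y_obs}

sim0, sim1 = simulate_system(d=0), simulate_system(d=1)
print("employment rate by d:", sim0["H"].mean(), sim1["H"].mean())
print("var of latent log earnings at t=1 and t=T (d=1):",
      sim1["log_y_star"][:, 0].var(), sim1["log_y_star"][:, -1].var())
```

Because both calls share a seed, the two runs use the same $\alpha_i$ draws, so they can be read as the two potential-outcome states of the same individuals. The shocks then coincide across $d$ as well, which is a simplification of this sketch rather than a feature of the model.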

Task 1: What does a Mincer regression actually identify? (30 min)

Recall the Mincer earnings equation:

$$
\log Y_i = \alpha + \rho \cdot S_i + \beta_1 \cdot X_i + \beta_2 \cdot X_i^2 + \varepsilon_i, \qquad X_i = \mathrm{age}_i - S_i - 6. \tag{7}
$$

1. Suppose we estimate $\rho$ by OLS on a Norwegian cross-section of, say, 45-year-old men in 1995. Write down every assumption under which $\hat\rho$ would equal “the causal return to schooling”, being concrete about the population, the outcome moment, and the part of the life cycle the estimate refers to.
2. Now suppose we instrument $S_i$ with $Z_i = \mathbb{I}[\text{post-reform}]$ (the LMS instrument) and run a Wald estimator on the same cross-section. Which of the assumptions from question 1 does the IV procedure relax, and which does it leave in place?
3. The LMS paper does not estimate a Mincer regression. Why not? Be specific about which economic question the Mincer parameter $\rho$ answers, and which question LMS are actually trying to answer.
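The contrast between OLS and the Wald estimator can be checked on simulated data. The sketch below is a toy example under assumed values: a homogeneous return `rho`, schooling of either 7 or 9 years, an unobserved `ability` term that raises both schooling and earnings, and an instrument that is independent of ability by construction. The experience terms are dropped because, at a single age, $X_i$ is an exact linear function of $S_i$. None of the numbers come from LMS.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
rho = 0.08                       # true (homogeneous) return, hypothetical

ability = rng.normal(size=n)     # unobserved by the econometrician
Z = rng.integers(0, 2, size=n)   # instrument, independent of ability by construction

# Schooling: 9 years if D = 1, else 7; D depends on ability and on Z.
D = (0.8 * ability + 1.0 * Z + rng.normal(size=n) > 0.5).astype(int)
S = 7 + 2 * D
log_y = 10.0 + rho * S + 0.3 * ability + rng.normal(0.0, 0.4, n)

# OLS slope of log_y on S (bivariate regression with a constant).
ols = np.cov(S, log_y)[0, 1] / np.var(S, ddof=1)

# Wald estimator: reduced form divided by first stage.
wald = (log_y[Z == 1].mean() - log_y[Z == 0].mean()) / (
    S[Z == 1].mean() - S[Z == 0].mean()
)
print(f"true rho = {rho:.3f}, OLS = {ols:.3f}, Wald = {wald:.3f}")
```

In this design OLS overstates `rho` because ability is omitted, while the Wald estimator recovers it. With heterogeneous returns the Wald estimand would be a complier-specific average instead, which the simulation deliberately rules out.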

Task 2: Imbens–Rubin on Table 1 of LMS (60 min)

LMS use one binary instrument $Z_i$ (compulsory schooling 7 vs 9 years) to identify an entire table of complier moments. Make this concrete.

2.1 Compliance types

Write down the three compliance types in the LMS setting, given monotonicity:

- always-takers ($a$): $D_i(0) = D_i(1) = 1$
- never-takers ($n$): $D_i(0) = D_i(1) = 0$
- compliers ($c$): $D_i(0) = 0, D_i(1) = 1$

1. In words, who is a complier in this setting? In a Norwegian municipality that adopted the reform, what does it mean for an individual to be a complier?
2. Why are there no defiers, and is monotonicity plausible for this instrument?
3. Express the complier share $\pi_c$, as identified by the first stage in equation (1) of LMS, in terms of conditional probabilities.
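One way to check an answer to the last question is to simulate data in which the compliance types are known. The sketch below assumes hypothetical type shares and a randomly assigned binary instrument, then recovers the shares from the two conditional treatment probabilities using the standard result under monotonicity. It is a toy check, not the LMS first stage.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
shares = {"a": 0.2, "c": 0.5, "n": 0.3}   # hypothetical type shares
types = rng.choice(list(shares), size=n, p=list(shares.values()))
Z = rng.integers(0, 2, size=n)            # randomly assigned instrument

# Treatment implied by type: always-takers 1, never-takers 0, compliers follow Z.
D = np.where(types == "a", 1, np.where(types == "n", 0, Z))

p1 = D[Z == 1].mean()   # estimates P(D = 1 | Z = 1) = pi_a + pi_c
p0 = D[Z == 0].mean()   # estimates P(D = 1 | Z = 0) = pi_a

pi_a_hat, pi_c_hat, pi_n_hat = p0, p1 - p0, 1.0 - p1
print(f"pi_a = {pi_a_hat:.3f}, pi_c = {pi_c_hat:.3f}, pi_n = {pi_n_hat:.3f}")
```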

2.2 Identifying complier moments

Write down explicitly how LMS identify, for an arbitrary function $h$,

$$
E[h(\mathbf{Y}_i(1)) \mid D_i(1) > D_i(0)] \tag{8}
$$

using the two equations (4) and (5) in the paper. Pick three concrete examples of $h$ that correspond to rows of Table 1 (Panel A) and state in words what each gives you.
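The identification result can be tried out numerically. The sketch below uses the standard Imbens–Rubin ratio form, in which the difference in $E[h(Y) \cdot \mathbb{I}(D = d) \mid Z]$ across instrument values is divided by the difference in $E[\mathbb{I}(D = d) \mid Z]$. The exact notation of the paper's equations (4) and (5) may differ. The function name `complier_moment`, the type shares, and all outcome distributions are hypothetical.

```python
import numpy as np

def complier_moment(hY, D, Z, d):
    """Estimate E[h(Y(d)) | complier] from observed h(Y), D and a binary Z."""
    w = D if d == 1 else 1 - D   # indicator that potential outcome d is observed
    num = (hY * w)[Z == 1].mean() - (hY * w)[Z == 0].mean()
    den = w[Z == 1].mean() - w[Z == 0].mean()
    return num / den

rng = np.random.default_rng(2)
n = 400_000
type_idx = rng.choice(3, size=n, p=[0.2, 0.5, 0.3])   # 0 = a, 1 = c, 2 = n
Z = rng.integers(0, 2, size=n)
D = np.where(type_idx == 0, 1, np.where(type_idx == 2, 0, Z))

# Hypothetical potential outcomes whose means differ by compliance type.
mu1 = np.array([3.0, 2.0, 1.0])[type_idx]
Y1 = mu1 + rng.normal(0.0, 1.0, n)          # compliers: mean 2.0, variance 1.0
Y0 = mu1 - 0.5 + rng.normal(0.0, 0.5, n)    # compliers: mean 1.5, variance 0.25
Y = np.where(D == 1, Y1, Y0)                # observed outcome

# h(Y) = Y gives the complier mean; h(Y) = Y^2 gives the second moment.
mean_c1 = complier_moment(Y, D, Z, d=1)
var_c1 = complier_moment(Y**2, D, Z, d=1) - mean_c1**2
mean_c0 = complier_moment(Y, D, Z, d=0)
var_c0 = complier_moment(Y**2, D, Z, d=0) - mean_c0**2
print(f"d=1: mean {mean_c1:.3f}, var {var_c1:.3f}")
print(f"d=0: mean {mean_c0:.3f}, var {var_c0:.3f}")
```

The variance is obtained from two applications of the same formula because the centered $h$ in Table 1 depends on the complier mean, which has to be estimated first.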

2.3 Why are log-earnings moments different? (Panel B vs Panel A)

In Table 1 of LMS, Panel A is on $\mathbf{Y}_i(d)$ — earnings including zeros, employment, and their variances. Panel B is on log earnings, conditional on employment.

1. Why does the Imbens–Rubin argument above directly identify Panel A but not Panel B?
2. Write down the bias term that appears if you naively apply the Panel A argument to $\log Y_{it}(d) \cdot \mathbb{I}[H_{it}(d) = 1]$.
3. What additional structure do LMS impose to identify Panel B? (See the text around eq. (5)–(7) and the start of Section 4.1.)
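The role of employment selection can be illustrated in isolation. The sketch below abstracts from the instrument entirely and assumes a constant effect of education on latent log earnings, together with an employment rule in which education weakens the link between ability and employment. All parameter values are hypothetical and chosen only to make the composition effect visible.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
alpha = rng.normal(size=n)   # latent ability
true_effect = 0.2            # hypothetical constant effect on latent log earnings

def latent_log_y(d):
    # Latent log earnings in state d; the ability loading 0.5 is an assumption.
    return 10.0 + true_effect * d + 0.5 * alpha + rng.normal(0.0, 0.3, n)

def employed(d):
    # Employment rule as in eq. (5), with a hypothetical intercept and loading.
    g, phi_h = ((0.0, 1.0), (1.0, 0.2))[d]
    return g + phi_h * alpha + rng.normal(size=n) >= 0

y0, y1 = latent_log_y(0), latent_log_y(1)
h0, h1 = employed(0), employed(1)

latent_gap = y1.mean() - y0.mean()              # effect on latent log earnings
employed_gap = y1[h1].mean() - y0[h0].mean()    # gap among the employed in each state
print(f"latent gap = {latent_gap:.3f}, gap among employed = {employed_gap:.3f}")
```

The two gaps differ because the employed under $d = 0$ are more strongly selected on ability than the employed under $d = 1$, so the comparison among the employed mixes the earnings effect with a change in who works.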

Task 3: The IV ↔ structural interface (45 min)

This is the heart of the session. Section 4 of LMS estimates a parametric earnings-and-employment system by minimum distance, matching model-implied moments $\mathbf{g}(\Omega)$ to data moments $\mathbf{s}$ :

$$
\min_\Omega \; (\mathbf{s} - \mathbf{g}(\Omega))^\prime (\mathbf{s} - \mathbf{g}(\Omega)). \tag{9}
$$

The data-side vector $\mathbf{s}$ is the IV-identified moment vector from Section 3. It contains 1,732 moments, which are matched to estimate 105 parameters.
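The mechanics of the minimum-distance step can be seen in a stripped-down version of the earnings process. The sketch below is an assumption-laden toy: it drops $m_t(d)$, the ability factor, and employment, sets $q = 0$, and uses ordinary sample covariances in place of IV-identified moments. The model moments are then linear in the three variance parameters, so the equally weighted criterion in eq. (9) reduces to least squares. All parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T = 50_000, 10
true = {"var_u0": 0.20, "var_zeta": 0.02, "var_xi": 0.05}   # hypothetical values

# Simulate y_t = u_t + v_t with u a random walk and v white noise (q = 0).
u0 = rng.normal(0.0, np.sqrt(true["var_u0"]), (n, 1))
u = u0 + np.cumsum(rng.normal(0.0, np.sqrt(true["var_zeta"]), (n, T)), axis=1)
y = u + rng.normal(0.0, np.sqrt(true["var_xi"]), (n, T))

# Data moments s: upper triangle of the sample covariance matrix.
rows, cols = np.triu_indices(T)
s = np.cov(y, rowvar=False)[rows, cols]

# Model moments g(Omega) = G @ Omega for periods t = 1..T:
# cov(y_t, y_t') = var_u0 + min(t, t') * var_zeta + 1{t = t'} * var_xi.
G = np.column_stack([
    np.ones(rows.size),
    np.minimum(rows, cols) + 1,
    (rows == cols).astype(float),
])

# Equally weighted minimum distance is least squares when g is linear in Omega.
omega_hat, *_ = np.linalg.lstsq(G, s, rcond=None)
print(dict(zip(true, omega_hat.round(4))))
```

Here 55 moments pin down 3 parameters. In the full LMS system $\mathbf{g}(\Omega)$ is not linear in $\Omega$, so a numerical optimizer would replace the least-squares call.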

3.1 Which IV moments enter where?

For each of the following structural objects, name the IV-identified moment from Section 3 that primarily pins it down:

| Structural object | Where in eq. (11)–(13) | Identifying moment |
|---|---|---|
| Common life-cycle component $m_t(d)$ | eq. (11), level | ? |
| Variance of permanent shocks $\sigma^2_{\zeta(d)}$ | eq. (11) | ? |
| Variance of transitory shocks $\sigma^2_{\xi(d)}$ | eq. (11) | ? |
| Factor loading on ability in growth rates $\phi_{r(d)}$ | eq. (11), $u_{it}$ | ? |
| Factor loading on ability in employment $\phi_{h(d)}$ | eq. (12) | ? |

Some boxes can be filled in with a single moment; others require a change (in $t$, in lag $k$, or across $d$). Be explicit.

3.2 What does the IV step not identify, even with the structural functional forms in place?

For each of the following economic quantities, say whether the IV step alone identifies it (with or without the structural-form assumptions), and what additional ingredient is needed if not:

1. The complier ATE on mean log earnings at age 50.
2. The variance of complier latent earnings $\mathrm{var}(\log Y_{it}^*(d) \mid c)$ at age 50, where latent means “would-be” earnings, defined also for individuals who are not employed.
3. The decomposition of complier earnings variance into risk (within individual) versus heterogeneity (between individuals of differing ability).
4. The willingness-to-pay for $D=1$ versus $D=0$, in units of annual consumption.
5. The counterfactual return to schooling under a regime that flattened the progressive Norwegian tax-transfer system.

3.3 What goes wrong if you skip the IV step?

Suppose a researcher took Section 4 of LMS but dropped the IV: estimate the parametric earnings process in eq. (11)–(13) directly on observed log earnings, using OLS on $\log Y_{it} = m_t(d) + u_{it}(d) + v_{it}(d)$ with $d$ being observed schooling.

1. What bias would this introduce in the estimated $m_t(1) - m_t(0)$?
2. Would the bias in $\phi_{r(1)}$ (the loading on ability in growth rates) have the same sign as the bias in $\phi_{r(0)}$, or a different one? Why?
3. Why does the IV step not fix the analogous selection problem on $H$, even though it fixes the selection on $S$?

Task 4: What does structure buy you? (15 min, open discussion)

Consider the following results from LMS Figure 4 and Table 3:

- $\phi_{r(1)} \approx 4.4 > \phi_{r(0)} \approx 1.2$: education raises the ability-driven heterogeneity in earnings growth.
- High-ability individuals have high employment regardless of education.
- Education improves employment prospects mostly for low-ability individuals.

1. Which of these three findings could you have learned from a pure IV exercise (any cleverness allowed within Section 3’s framework)? Which require Section 4?
2. Suppose you don’t believe the linear factor structure in eq. (11)–(13). What is your honest answer to “what is the return to schooling in this paper”? (Hint: you can still keep Panel A of Table 1.)
3. Bonus: where in the LMS analysis would you push back hardest, if you were a referee?