8 Causal inference: Other paths to observational identification

8.1 Fixed effects and differencing

8.2 DiD

Key issue: FE/DiD does not rule out a correlated dynamic unobservable, causing a bias

Helpful links from a Twitter thread:

Ok, let’s say you’ve neither written nor refereed a Diff-in-Diff paper in the last two yeasts. What are the key methodological papers I need to brush up on? #EconTwitter

Didn’t someone put together a list?
— Damon Jones (nomadj1s) May 20, 2020

Some other resources and discussion (unfold)

BB:

From that Twitter thread i also found this really useful google drive folder of recent developments.

This Freyaldenhoven paper might also be handy. And it looks like there is quite a lot by Susan Athey that we should look at too.

Re Athey, she has a paper with Imbens (2006) which proposes the change in changes approach which allows estimation of hetergoenous treatment effects in a “better” way than quantile DiD. The assumptions of CiC are pretty similar to those of DiD.

That method (CiC) is applied by this Assuncao paper to the Amazon. We actually looked at this paper at the journal club in LEEP and the point was raised that actually you could see the observed effect (reduced deforestation in targetted municipalities) simply owing to the fact there is very little other forest left to harvest.

On DiD more generally, I think it suffers from the flaw that we have to somehow establish pre-treatment parallel trends which is basically an issue of ‘accepting a null hypothesis’ which is clearly not possible with traditional statistical inference.

DR: Yes, the problem of diagnostic testing the assumption that the identification entirely relies on. Does CiC suffer from the same issue?

8.3 RD

8.4 Time-series-ish panel approaches to micro

See esp:

Wooldridge on distinct structures in panel data, First differences vs Fixed effects, etc.
Nickell bias and Arellano-Bond and related procedures
Diagnosing structural breaks… outside of Macroeconomics

Ben: Ferraro has a couple of paper which together i think basically just caution against just accepting panel regression approaches on the basis that the covariates are matched well. They are here and here

8.4.1 Lagged dependent variable and fixed effects –> ‘Nickel bias’

8.4.2 Age-period-cohort effects

From Columbia Public health, in the context of epidemiology:

Age effects are variations linked to biological and social processes of aging specific to individuals … unrelated to the time period or birth cohort to which an individual belongs.

We might want to further disentangle this (at least conceptually) into:

overall-age effects: effect of an individual, organization, or any “unit’s” age in years vs
‘tenure effects’: the effect of time spent participating (e.g., time in the labor force or as part of a political movement)

Period effects result from external factors that equally affect all age groups at a particular calendar time. … [e.g.] environmental, social and economic factors, e.g. war, famine, economic crisis.

Cohort effects are variations resulting from the unique experience/exposure of a group of subjects (cohort) as they move across time. The most commonly defined group in epidemiology is the birth cohort based on year of birth and it is described as difference in the risk of a health outcome based on birth year.

“identification problem” in APC… s due to the exact linear dependency among age, period, and cohort: Period – Age = Cohort; that is, given the calendar year and age, one can determine the cohort (birth year)

This apc package writeup looks promising both as a tool and an explanation

A standard APC model of the age-period-specific statistic (e.g., the average income) for age group \(i\) in period \(j\): (O’Brien et al, 2008)

\[Y_{ij} = \mu + \alpha_i + \pi_j + \chi_k + \epsilon_{ij} \] They refer to the “effect of the ith age group” \(\alpha_i\) but this need not be interpreted causally.

\(\pi_j\) captures period \(j\), \(\chi_k\) the kth cohort \(\mu\) the intercept, and \(\epsilon_{ij}\) the “random disturbance.”

[The] linear dependency among the variables representing age, period, and cohort, since knowing the values of any two of these variables allows one to know the value of the third … means that we cannot estimate the A – 1 age effects, the P − 1 period effects, and the A + P – 1 cohort effects in a single OLS model.

(I believe) this holds whether or not we restrict the age, period, or cohort effects to being linear in a time variable.

The above equation also depicts the “age–period–cohort-characteristic (APCC)” model, letting \(\chi_k\) represent “the value of [here, a single] cohort characteristic for the kth cohort.” In other words, we replace a set of simple cohort dummies with (I presume) a variable meant to capture the meaningful characteristic of the cohort leading to the aforementioned ‘cohort effects.’

" This formulation eliminates the linear dependency … since the cohorts’ values on the cohort characteristic are unlikely to be perfectly linearly related to the age and period."

They next depict an ‘age-period’ model:

\[Y_{ij} = \mu + \alpha_i + \pi_j + \epsilon_{ij}\]

Estimating this model leads to the ‘observed residuals’ \(e_{ij}\)

We could then examine the pattern of these residuals to see if they indicate an effect of cohort membership (after controlling for the effects of age and period).