


CAUSAL INFERENCE:

First printing, April 2018


Potential outcomes causal model

Matching and subclassification

Synthetic control



2 Wright’s graphical demonstration of the identification problem

3 Graphical representation of bivariate regression from y on x

and talent (horizontal axis). Top right figure: Star sample scatter plot of beauty and talent. Bottom left figure: Entire (stars and non-





with imputed test scores for all post-kindergarten ages [Krueger, 1999].

12 Switching of students into and out of the treatment arms between first and second grade [Krueger, 1999].

Lung cancer at autopsy trends


14 Smoking and Lung Cancer
15 Covariate distribution by job trainings and control.
16 Covariate distribution by job trainings and matched sample.

Lalonde [1986] Table 5(b)


20 (using PSID)





23 Histogram of propensity score by treatment status


24 Angrist and Lavy [1999] descriptive statistics
26 sample.

Average reading scores vs. enrollment size [Angrist and Lavy, 1999].


Reading score residual and class size function by enrollment count


[Angrist and Lavy, 1999].


32 Second stage regressions [Angrist and Lavy, 1999].

Sharp vs. Fuzzy RDD [van der Klaauw, 2002].





Simulated nonlinear data from Stata

38 Illustration of a boundary problem
39 Insurance status and age
40 Card et al. [2008] Table 1

Investigating the CPS for discontinuities at age 65 [Card et al., 2008]


Investigating the NHIS for the impact of Medicare on care and uti-


Mortality and Medicare [Card et al., 2009]







Example of outcome plotted against the running variable.


76 Table 3 from Cornwell and Trumbull [1994]

77 Table 2 from Draca et al. [2011]

82 Simple DD using sample averages

83 DD regression diagram

88 Comparison of Internet user and non-user groups

89 Theoretical predictions of abortion legalization on age profiles of gon-

92 Subset of coefficients (year-repeal interactions) for the DDD model, Table 3 of Cunningham and Cornwell [2013].

97 California cigarette sales vs synthetic California

98 Placebo distribution

103 Synthetic control graph: Differences between West Germany and Synthetic West Germany

108 Gap between actual Texas and synthetic Texas

109 Histogram of the distribution of ratios of post-RMSPE to pre-RMSPE.

List of Tables

1 Examples of Discrete and Continuous Random Processes.


2 Total number of ways to get a 7 with two six-sided dice.
4 Two way contingency table.



Simulated data showing the sum of residuals equals zero

9 parity



Yule regressions [Yule, 1899].


Post-treatment observed lifespans in years for surgery D = 1 versus chemotherapy D = 0.

14 Death rates per 1,000 person-years [Cochran, 1968]
15 Mean ages, years [Cochran, 1968].



Subclassification example.

18 Subclassification example of Titanic survival for large K

Training example with exact matching (including matched sample)


Another matching example (this time to illustrate bias correction)

22 Nearest neighbor matched sample
23 Nearest neighbor matched sample with fitted values for bias correction
Completed matching example with single covariate



Distribution of propensity score for treatment group.

26 Distribution of propensity score for CPS Control group.

29 OLS and 2SLS regressions of Log Quantity on Log Price with wave height instrument

32 Compared to what? Different cities

33 Compared to what? Before and After

To my son, Miles, one of my favorite people. I love you. You’ve tagged my head and heart.

I like to think of causal inference as the space between theory and estimation. It’s where we test primarily social scientific hypotheses in the wild. Some date the beginning of modern causal inference to Fisher [1935], Haavelmo [1943], Rubin [1974] or applied labor economics studies; but whenever you consider its start, causal inference is now a distinct field within econometrics. It’s sometimes listed as a lengthy chapter on “program evaluation” [Wooldridge, 2010], or given entire book-length treatments. To name just a few textbooks in the growing area, there’s Angrist and Pischke [2009], Morgan and Winship [2014], Imbens and Rubin [2015] and probably a half dozen others, not to mention numerous, lengthy treatments of specific strategies such as Imbens and Lemieux [2008] and Angrist and Krueger [2001]. The field is crowded and getting more crowded every year.

So why does my book exist? I believe there are some holes in the market, and this book is an attempt to fill them. For one, none of the materials out there at present are exactly what I need when I teach my own class on causal inference. When I teach that class, I use Morgan and Winship [2014], Angrist and Pischke [2009], and a bunch of other stuff I’ve cobbled together. No single book at present has everything I need or am looking for. Imbens and Rubin [2015] covers the potential outcomes model, experimental design, matching and instrumental variables, but does not contain anything about directed

synthetic control. Angrist and Pischke [2009] is very close, but does not include anything on synthetic control nor the graphical models that I find so useful. But maybe most importantly, Imbens and Rubin [2015], Angrist and Pischke [2009] and Morgan and Winship [2014]


Finally, this book is written for people very early in their careers, be it undergraduates, graduate students, or newly minted PhDs.

My hope is that this book can give you a jump start so that you don’t have to, like many of us had to, meander through a somewhat labyrinthine path to these methods.

This is probably because I remain deep down a teacher who cares about education. I love helping students discover; I love sharing in that discovery. And if someone is traveling the same windy path that I traveled, then why not help them by sharing what I’ve learned and now believe about this field? I could sell it, and maybe one day I will, but for the moment I’ve decided to give it away – at least, the first few versions.

The second reason, which supports the first, is something that Al Roth once told me. He had done me a favor, which I could never repay, and I told him that. To which he said:

3 I give a lot of thought to anything and everything that Roth says or has ever said actually.

life, and what they want to say about the life they were given to live when they look back on it. Economic models take preferences as given and unchanging [Becker, 1993], but I have found that figuring out one’s preferences is the hard work of being a moral person.

How I got here

“Started from the bottom now we’re here”
– Drake

and working as a qualitative research analyst doing market research, and slowly stopped writing poetry altogether.4
My job as a qualitative research analyst was eye opening in part because it was my first exposure to empiricism. My job was to do “grounded theory” – a kind of inductive approach to generating explanations of human behavior based on focus groups and in-depth interviews, as well as other ethnographic methods. I approached each project as an opportunity to understand why people did the things they did (even if what they did was buy detergent or pick a cable provider). While the job inspired me to develop my own theories

4 Rilke said you should quit writing poetry when you can imagine yourself living without it [Rilke, 2012]. I could imagine living without poetry, so I took his advice and quit. I have no regrets whatsoever. Interestingly, when I later found economics, I believed I would never be happy unless I was a professional economist doing research on the topics I found interesting. So I like to think I followed Rilke’s advice on multiple levels.

about human behavior, it didn’t provide me a way of falsifying those theories.

After passing my prelims, I took Mustard’s labor economics field class, and learned about the kinds of topics that occupied the lives of labor economists. These topics included the returns to education, inequality, racial discrimination, crime and many other fascinating and important topics. We read many, many empirical papers in that class, and afterwards I knew that I would need a strong background in econometrics to do the kind of empirical work I desired to do.

And since econometrics was the most important area I could ever learn, I decided to make it my main field of study. This led to me working with Christopher Cornwell, an econometrician at Georgia from whom I learned a lot. He became my most important mentor, as well as a coauthor and friend. Without him, I wouldn’t be where I am today.

Optimization Makes Everything Endogenous

“I gotta get mine, you gotta get yours”
– MC Breed


lation with other things. The reason we think so comes from what we learn from the potential outcomes model: a correlation, in order to be a measure of a causal effect, must be based on a choice that was made independently of the potential outcomes under consideration. Yet if the person is making some choice based on what she thinks is best, then it necessarily violates this independence condition. Economic theory predicts choices will be endogenous, and thus naive correlations are misleading.
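The independence condition can be made concrete with a small simulation. What follows is only a minimal sketch under stated assumptions, not anything taken from the text: the function name `simulate_selection`, the normal distributions, and the rule "take the treatment whenever your own gain is positive" are all hypothetical choices made for illustration. It compares the simple difference in mean outcomes under optimizing choice with the same comparison under coin-flip assignment.

```python
import random
import statistics


def simulate_selection(n=20000, seed=0):
    """Return (true ATE, naive gap under chosen treatment, naive gap under random treatment).

    All distributions and the choice rule are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Potential outcome without treatment, and each person's own gain from treatment.
    y0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    gain = [rng.gauss(1.0, 2.0) for _ in range(n)]
    # Potential outcome with treatment.
    y1 = [base + g for base, g in zip(y0, gain)]
    true_ate = statistics.fmean(gain)

    def naive_gap(d):
        # Simple difference in observed mean outcomes: treated minus untreated.
        treated = [y1[i] for i in range(n) if d[i] == 1]
        control = [y0[i] for i in range(n) if d[i] == 0]
        return statistics.fmean(treated) - statistics.fmean(control)

    # Optimizing choice: treatment depends directly on the potential outcomes.
    d_chosen = [1 if g > 0 else 0 for g in gain]
    # Random assignment: treatment is independent of the potential outcomes.
    d_random = [1 if rng.random() < 0.5 else 0 for _ in range(n)]
    return true_ate, naive_gap(d_chosen), naive_gap(d_random)


true_ate, gap_chosen, gap_random = simulate_selection()
print(f"true average effect:        {true_ate:.2f}")
print(f"naive gap, chosen treatment: {gap_chosen:.2f}")
print(f"naive gap, random treatment: {gap_random:.2f}")
```

In this setup only the people who benefit select into treatment, so the naive comparison overstates the average effect, while the coin-flip comparison lands close to it.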

One of the cornerstones of scientific methodologies is empirical analysis.5 By empirical analysis, I mean the use of data to test a theory or to estimate a relationship between variables. The first step in conducting an empirical economic analysis is the careful formulation of the question we would like to answer. In some cases, we would

5 It is not the only cornerstone, nor even necessarily the most important cornerstone, but empirical analysis has always played an important role in scientific work.

prediction is falsifiable insofar as we can evaluate, and potentially reject, the prediction with data.6 The economic model is the framework with which we describe the relationships we are interested in, the intuition for our results and the hypotheses we would like to test.7 After we have specified an economic model, we turn it into what is called an econometric model that we can estimate directly with data. One clear issue we immediately face is regarding the functional form

7 Economic models are abstract, not realistic, representations of the world.

George Box, the statistician, once quipped that “all models are wrong, but some are useful.”

To illustrate this idea, let’s begin with a basic economic model: supply and demand equilibrium and the problems it creates for estimating the price elasticity of demand. Policy-makers and business managers have a natural interest in learning the price elasticity of demand. Knowing this can help firms maximize profits and help governments choose optimal taxes, as well as determine the conditions under which quantity restrictions are preferred [Becker et al., 2006]. But the problem is that we do not observe demand curves, because demand curves are theoretical objects. More specifically, a demand curve is a collection of paired potential outcomes of price and quantity. We observe price and quantity equilibrium values, not the potential price and potential quantities along the entire demand curve. Only by tracing out the potential outcomes along a demand curve can we calculate the elasticity.
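To see why equilibrium data alone cannot deliver the elasticity, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: log-linear demand and supply curves, a demand elasticity of -1, a supply elasticity of +1, normally distributed shocks, and the hypothetical helper names `ols_slope` and `equilibrium_slope`. It generates equilibrium price and quantity pairs and regresses log quantity on log price.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope from a bivariate least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def equilibrium_slope(demand_sd, supply_sd, n=20000, seed=1,
                      demand_elasticity=-1.0, supply_elasticity=1.0):
    """Regress equilibrium log quantity on equilibrium log price.

    Assumed model (illustrative only, intercepts normalized to zero):
      demand: log q = demand_elasticity * log p + u
      supply: log q = supply_elasticity * log p + v
    """
    rng = random.Random(seed)
    log_p, log_q = [], []
    for _ in range(n):
        u = demand_sd * rng.gauss(0.0, 1.0)  # demand shifter
        v = supply_sd * rng.gauss(0.0, 1.0)  # supply shifter
        # Solve the two equations for the market-clearing price and quantity.
        p = (u - v) / (supply_elasticity - demand_elasticity)
        q = demand_elasticity * p + u
        log_p.append(p)
        log_q.append(q)
    return ols_slope(log_p, log_q)


# Both curves shift: the fitted slope is a blend of the two elasticities.
print(equilibrium_slope(demand_sd=1.0, supply_sd=1.0))
# Only supply shifts: the equilibria trace out the fixed demand curve.
print(equilibrium_slope(demand_sd=0.0, supply_sd=1.0))
```

When both curves move, the regression slope reflects a variance-weighted mix of the supply and demand elasticities rather than either one. When only supply moves, the observed pairs lie along a single demand curve, and the slope matches the assumed demand elasticity.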

To see this, consider this graphic from Philip Wright’s Appendix B [Wright, 1928], which we’ll discuss in greater detail later (Figure 2). The price elasticity of demand is the ratio of percentage changes in quantity to price for a single demand curve. Yet, when there are shifts in supply and demand, a sequence of quantity and price pairs emerges in history which reflects neither the demand curve nor the supply
