Wooldridge Source: A. Abadie (2003), “Semiparametric Instrumental Variable Estimation of Treatment Response Models,” Journal of Econometrics 113, 231-263. Professor Abadie kindly provided these data. He obtained them from the 1991 Survey of Income and Program Participation (SIPP). Data loads lazily.
data('k401ksubs')
A data.frame with 9275 observations on 11 variables:
e401k: =1 if eligible for 401(k)
inc: annual income, $1000s
marr: =1 if married
male: =1 if male respondent
age: in years
fsize: family size
nettfa: net total financial assets, $1000s
p401k: =1 if participate in 401(k)
pira: =1 if have IRA
incsq: inc^2
agesq: age^2
https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9781111531041
This data set can also be used to illustrate the binary response models, probit and logit, in Chapter 17, where, say, pira (an indicator for having an individual retirement account) is the dependent variable, and e401k [the 401(k) eligibility indicator] is the key explanatory variable.
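The binary response use described in the note can be sketched as follows. This is a minimal illustration, assuming the wooldridge package is installed; the set of control variables is an assumption chosen for the example, not a specification taken from the text.

```r
# Assumes the wooldridge package (which ships k401ksubs) is installed.
data("k401ksubs", package = "wooldridge")

# Dependent variable: pira (=1 if the respondent has an IRA).
# Key explanatory variable: e401k (=1 if eligible for a 401(k)).
# The other controls are an illustrative choice, not taken from the text.
rhs <- pira ~ e401k + inc + incsq + age + agesq + marr + fsize

# Fit the same specification with a logit and a probit link.
logit_fit  <- glm(rhs, data = k401ksubs, family = binomial(link = "logit"))
probit_fit <- glm(rhs, data = k401ksubs, family = binomial(link = "probit"))

# Coefficients side by side; the two models use different scales,
# so compare signs and significance rather than raw magnitudes.
cbind(logit = coef(logit_fit), probit = coef(probit_fit))
```

Calling summary(logit_fit) or summary(probit_fit) gives standard errors for the e401k coefficient.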
Used in Text: pages 166, 174, 223, 264, 283, 301-302, 340, 549
str(k401ksubs)
#> 'data.frame': 9275 obs. of 11 variables:
#> $ e401k : int 0 1 0 0 0 0 0 0 0 1 ...
#> $ inc : num 13.2 61.2 12.9 98.9 22.6 ...
#> $ marr : int 0 0 1 1 0 1 1 1 1 0 ...
#> $ male : int 0 1 0 1 0 0 0 0 0 1 ...
#> $ age : int 40 35 44 44 53 60 49 38 52 45 ...
#> $ fsize : int 1 1 2 2 1 3 5 5 2 1 ...
#> $ nettfa: num 4.57 154 0 21.8 18.45 ...
#> $ p401k : int 0 1 0 0 0 0 0 0 0 0 ...
#> $ pira : int 1 0 0 0 0 0 1 0 1 1 ...
#> $ incsq : num 173 3749 165 9777 511 ...
#> $ agesq : int 1600 1225 1936 1936 2809 3600 2401 1444 2704 2025 ...
#> - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"
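The format list defines incsq as inc^2 and agesq as age^2. A quick way to confirm that the stored columns match those definitions is sketched below, again assuming the wooldridge package is installed. The names inc_gap and age_gap are introduced here for illustration only.

```r
# Assumes the wooldridge package (which ships k401ksubs) is installed.
data("k401ksubs", package = "wooldridge")

# Largest absolute difference between each stored square and the
# square recomputed from the base column.
inc_gap <- max(abs(k401ksubs$incsq - k401ksubs$inc^2))
age_gap <- max(abs(k401ksubs$agesq - k401ksubs$age^2))

c(inc_gap = inc_gap, age_gap = age_gap)
```

Small nonzero gaps for incsq would be consistent with rounding in the stored values rather than a different definition.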