Wooldridge Source: J.J. Heckman, J.L. Tobias, and E. Vytlacil (2003), “Simple Estimators for Treatment Parameters in a Latent-Variable Framework,” Review of Economics and Statistics 85, 748-755. Professor Tobias kindly provided the data, which were obtained from the 1991 National Longitudinal Survey of Youth. All people in the sample are males age 26 to 34. For confidentiality reasons, I have included only a subset of the variables used by the authors. Data loads lazily.
data('htv')
A data.frame with 1230 observations on 23 variables:
wage: hourly wage, 1991
abil: ability measure, not standardized
educ: highest grade completed by 1991
ne: =1 if in northeast, 1991
nc: =1 if in north central, 1991
west: =1 if in west, 1991
south: =1 if in south, 1991
exper: potential experience
motheduc: highest grade, mother
fatheduc: highest grade, father
brkhme14: =1 if broken home, age 14
sibs: number of siblings
urban: =1 if in urban area, 1991
ne18: =1 if in NE, age 18
nc18: =1 if in NC, age 18
south18: =1 if in south, age 18
west18: =1 if in west, age 18
urban18: =1 if in urban area, age 18
tuit17: college tuition, age 17
tuit18: college tuition, age 18
lwage: log(wage)
expersq: exper^2
ctuit: tuit18 - tuit17
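The last three variables are derived from others in the list. A quick way to confirm the stated definitions is sketched below. It assumes the wooldridge package is installed, and it reports the size of any gap rather than testing exact equality, since the stored values may carry small rounding error.

```r
# Assumes the wooldridge package is installed; htv ships with it.
data("htv", package = "wooldridge")

# Largest absolute gap between each stored column and its stated definition.
# Values at or near zero indicate the definition holds up to storage rounding.
max(abs(htv$lwage - log(htv$wage)))
max(abs(htv$expersq - htv$exper^2))
max(abs(htv$ctuit - (htv$tuit18 - htv$tuit17)))
```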
https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9781111531041
Because an ability measure is included in this data set, it can be used as another illustration of including proxy variables in regression models. See Chapter 9. Also, one can try the IV procedure with the ability measure included as an exogenous explanatory variable.
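Both suggestions can be sketched with base R alone. The specification here is an assumption made for illustration, not taken from the text: lwage is regressed on educ, exper, and expersq, with abil added as the proxy, and ctuit is used as the instrument for educ. The two-stage fit is done by hand with lm(), so its second-stage standard errors are not valid for inference; a dedicated IV routine would be needed for that.

```r
# Assumes the wooldridge package is installed; htv ships with it.
data("htv", package = "wooldridge")

# Proxy-variable illustration: the wage equation without and with abil.
ols_base  <- lm(lwage ~ educ + exper + expersq, data = htv)
ols_proxy <- lm(lwage ~ educ + abil + exper + expersq, data = htv)
coef(ols_base)["educ"]
coef(ols_proxy)["educ"]

# Manual two-stage least squares with abil treated as exogenous.
# Using ctuit as the instrument for educ is an illustrative assumption.
first_stage  <- lm(educ ~ ctuit + abil + exper + expersq, data = htv)
htv$educ_hat <- predict(first_stage, newdata = htv)
second_stage <- lm(lwage ~ educ_hat + abil + exper + expersq, data = htv)

# Point estimate only; the reported standard errors from this lm() fit
# ignore the first-stage estimation and should not be used.
coef(second_stage)["educ_hat"]
```

Comparing the educ coefficient across the two OLS fits shows how much of the estimated return to education shifts once the ability proxy is included.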
Used in Text: pages 550, 628
str(htv)
#> 'data.frame': 1230 obs. of 23 variables:
#> $ wage : num 12.02 8.91 15.51 13.33 11.07 ...
#> $ abil : num 5.03 2.04 2.48 3.61 2.64 ...
#> $ educ : int 15 13 15 15 13 18 13 12 13 12 ...
#> $ ne : int 0 1 1 1 1 1 1 0 1 1 ...
#> $ nc : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ west : int 1 0 0 0 0 0 0 0 0 0 ...
#> $ south : int 0 0 0 0 0 0 0 1 0 0 ...
#> $ exper : int 9 8 11 6 15 8 13 14 9 9 ...
#> $ motheduc: int 12 12 12 12 12 12 13 12 10 14 ...
#> $ fatheduc: int 12 10 16 12 15 12 12 12 12 12 ...
#> $ brkhme14: int 0 1 0 0 1 0 0 1 1 0 ...
#> $ sibs : int 1 4 2 1 2 2 5 4 3 1 ...
#> $ urban : int 1 1 1 1 1 1 1 0 1 1 ...
#> $ ne18 : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ nc18 : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ south18 : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ west18 : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ urban18 : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ tuit17 : num 7.58 8.6 7.31 9.5 7.31 ...
#> $ tuit18 : num 7.26 9.5 7.31 10.16 7.31 ...
#> $ lwage : num 2.49 2.19 2.74 2.59 2.4 ...
#> $ expersq : int 81 64 121 36 225 64 169 196 81 81 ...
#> $ ctuit : num -0.323 0.904 0 0.663 0 ...
#> - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"