Wooldridge Source: J.J. Heckman, J.L. Tobias, and E. Vytlacil (2003), “Simple Estimators for Treatment Parameters in a Latent-Variable Framework,” Review of Economics and Statistics 85, 748-755. Professor Tobias kindly provided the data, which were obtained from the 1991 National Longitudinal Survey of Youth. All people in the sample are males age 26 to 34. For confidentiality reasons, I have included only a subset of the variables used by the authors. Data loads lazily.

data('htv')

Format

A data.frame with 1230 observations on 23 variables:

  • wage: hourly wage, 1991

  • abil: ability measure, not standardized

  • educ: highest grade completed by 1991

  • ne: =1 if in northeast, 1991

  • nc: =1 if in north central, 1991

  • west: =1 if in west, 1991

  • south: =1 if in south, 1991

  • exper: potential experience

  • motheduc: highest grade completed by mother

  • fatheduc: highest grade completed by father

  • brkhme14: =1 if broken home, age 14

  • sibs: number of siblings

  • urban: =1 if in urban area, 1991

  • ne18: =1 if in northeast, age 18

  • nc18: =1 if in north central, age 18

  • south18: =1 if in south, age 18

  • west18: =1 if in west, age 18

  • urban18: =1 if in urban area, age 18

  • tuit17: college tuition, age 17

  • tuit18: college tuition, age 18

  • lwage: log(wage)

  • expersq: exper^2

  • ctuit: tuit18 - tuit17
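
The last three columns in the list are derived from other columns. A minimal sketch for checking those definitions, assuming the wooldridge package is installed (the object name `checks` is illustrative only):

```r
# Assumption: the wooldridge package is installed and provides htv.
data("htv", package = "wooldridge")

# Largest absolute gap between each stored column and its stated definition.
checks <- c(
  lwage   = max(abs(htv$lwage - log(htv$wage)), na.rm = TRUE),
  expersq = max(abs(htv$expersq - htv$exper^2), na.rm = TRUE),
  ctuit   = max(abs(htv$ctuit - (htv$tuit18 - htv$tuit17)), na.rm = TRUE)
)
print(checks)
```

Values at or near zero would indicate that the stored columns match their definitions up to rounding.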

Notes

Because this data set includes an ability measure, it can serve as another illustration of using proxy variables in regression models (see Chapter 9). One can also try the IV procedure with the ability measure included as an exogenous explanatory variable.
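
Both ideas in this note can be sketched with base R alone. The specification below is an assumption made for illustration, not taken from this page: the controls are limited to exper and expersq, and ctuit is used as the instrument for educ. The two stages are run by hand with lm(), so the second-stage standard errors are not valid; a dedicated IV routine would be needed for inference.

```r
# Assumption: the wooldridge package is installed and provides htv.
data("htv", package = "wooldridge")

# Proxy-variable illustration: compare the educ coefficient
# without and with abil as a regressor.
ols_base  <- lm(lwage ~ educ + exper + expersq, data = htv)
ols_proxy <- lm(lwage ~ educ + exper + expersq + abil, data = htv)
print(c(base = unname(coef(ols_base)["educ"]),
        proxy = unname(coef(ols_proxy)["educ"])))

# Manual two-stage least squares with abil treated as exogenous.
# Using ctuit as the instrument for educ is an assumption of this sketch.
first_stage <- lm(educ ~ ctuit + abil + exper + expersq, data = htv)

# Work on a copy so the packaged data set is left unchanged.
dat <- htv
dat$educ_hat <- predict(first_stage, newdata = htv)

# Second stage: replace educ with its first-stage fitted values.
second_stage <- lm(lwage ~ educ_hat + abil + exper + expersq, data = dat)
print(coef(second_stage)["educ_hat"])
```

Inspecting summary(first_stage) before relying on the second stage shows whether the assumed instrument is actually related to educ once the other regressors are held fixed.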

Used in Text: pages 550, 628

Examples

 str(htv)
#> 'data.frame':	1230 obs. of  23 variables:
#>  $ wage    : num  12.02 8.91 15.51 13.33 11.07 ...
#>  $ abil    : num  5.03 2.04 2.48 3.61 2.64 ...
#>  $ educ    : int  15 13 15 15 13 18 13 12 13 12 ...
#>  $ ne      : int  0 1 1 1 1 1 1 0 1 1 ...
#>  $ nc      : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ west    : int  1 0 0 0 0 0 0 0 0 0 ...
#>  $ south   : int  0 0 0 0 0 0 0 1 0 0 ...
#>  $ exper   : int  9 8 11 6 15 8 13 14 9 9 ...
#>  $ motheduc: int  12 12 12 12 12 12 13 12 10 14 ...
#>  $ fatheduc: int  12 10 16 12 15 12 12 12 12 12 ...
#>  $ brkhme14: int  0 1 0 0 1 0 0 1 1 0 ...
#>  $ sibs    : int  1 4 2 1 2 2 5 4 3 1 ...
#>  $ urban   : int  1 1 1 1 1 1 1 0 1 1 ...
#>  $ ne18    : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ nc18    : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ south18 : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ west18  : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ urban18 : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ tuit17  : num  7.58 8.6 7.31 9.5 7.31 ...
#>  $ tuit18  : num  7.26 9.5 7.31 10.16 7.31 ...
#>  $ lwage   : num  2.49 2.19 2.74 2.59 2.4 ...
#>  $ expersq : int  81 64 121 36 225 64 169 196 81 81 ...
#>  $ ctuit   : num  -0.323 0.904 0 0.663 0 ...
#>  - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"