Wooldridge Source: J. Mullahy (1997), “Instrumental-Variable Estimation of Count Data Models: Applications to Models of Cigarette Smoking Behavior,” Review of Economics and Statistics 79, 586-593. Professor Mullahy kindly provided the data. Data loads lazily.

data('smoke')

Format

A data.frame with 807 observations on 10 variables:

  • educ: years of schooling

  • cigpric: state cig. price, cents/pack

  • white: =1 if white

  • age: in years

  • income: annual income, $

  • cigs: cigs. smoked per day

  • restaurn: =1 if rest. smk. restrictions

  • lincome: log(income)

  • agesq: age^2

  • lcigpric: log(cigpric)
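
The three derived columns (lincome, agesq, lcigpric) can be checked against their source columns. The helper below is a minimal sketch: the name check_derived and the tolerance are hypothetical, and the tolerance is deliberately loose on the assumption that the stored logs may have been rounded to single precision.

```r
# Hypothetical helper: compare each derived column with its definition.
# `tol` is an assumed tolerance, not a documented property of the data.
check_derived <- function(df, tol = 1e-6) {
  c(
    lincome  = isTRUE(all.equal(df$lincome, log(df$income), tolerance = tol)),
    agesq    = isTRUE(all.equal(as.numeric(df$agesq), df$age^2, tolerance = tol)),
    lcigpric = isTRUE(all.equal(df$lcigpric, log(df$cigpric), tolerance = tol))
  )
}

# Usage, only if the wooldridge package is installed:
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("smoke", package = "wooldridge")
  print(check_derived(smoke))
}
```

A FALSE entry would only mean the stored column differs from the formula by more than the chosen tolerance.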

Notes

If you want to do a “fancy” IV version of Computer Exercise C16.1, you could estimate a reduced form count model for cigs using the Poisson regression methods in Section 17.3, and then use the fitted values as an IV for cigs. Presumably, this would be for a fairly advanced class.
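
The two steps described above can be sketched in base R. This is an illustration under stated assumptions, not the textbook's solution: the structural equation (lincome on cigs, educ, age, agesq) and the choice of lcigpric and restaurn as excluded instruments are assumptions about the exercise, and fancy_iv is a hypothetical name. The function returns point estimates only; valid standard errors would need a dedicated IV routine.

```r
# Sketch only: the model specification is assumed, not taken from this page.
fancy_iv <- function(df) {
  # Step 1: reduced-form Poisson regression of cigs on the exogenous
  # variables, including the assumed excluded instruments.
  rf <- glm(cigs ~ educ + age + agesq + lcigpric + restaurn,
            family = poisson, data = df)
  df$cigs_hat <- fitted(rf)  # fitted conditional mean of cigs

  # Step 2: just-identified IV with cigs_hat as the instrument for cigs.
  X <- model.matrix(~ cigs + educ + age + agesq, data = df)      # regressors
  Z <- model.matrix(~ cigs_hat + educ + age + agesq, data = df)  # instruments
  y <- df$lincome

  beta <- solve(crossprod(Z, X), crossprod(Z, y))  # (Z'X)^{-1} Z'y
  setNames(drop(beta), colnames(X))
}

# Usage, only if the wooldridge package is installed:
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("smoke", package = "wooldridge")
  print(fancy_iv(smoke))
}
```

Using the fitted values as an instrument, rather than substituting them for cigs in the second stage, keeps the IV estimator consistent even if the Poisson mean is misspecified, provided the instruments are valid.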

Used in Text: pages 183, 288-289, 298, 301, 578, 627

Examples

 str(smoke)
#> 'data.frame':	807 obs. of  10 variables:
#>  $ educ    : num  16 16 12 13.5 10 6 12 15 12 12 ...
#>  $ cigpric : num  60.5 57.9 57.7 57.9 58.3 ...
#>  $ white   : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ age     : int  46 40 58 30 17 86 35 48 48 31 ...
#>  $ income  : int  20000 30000 30000 20000 20000 6500 20000 30000 20000 20000 ...
#>  $ cigs    : int  0 0 3 0 0 0 0 0 0 0 ...
#>  $ restaurn: int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ lincome : num  9.9 10.3 10.3 9.9 9.9 ...
#>  $ agesq   : int  2116 1600 3364 900 289 7396 1225 2304 2304 961 ...
#>  $ lcigpric: num  4.1 4.06 4.05 4.06 4.07 ...
#>  - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"