Wooldridge Source: O. Baser and E. Pema (2003), “The Return of Publications for Economics Faculty,” Economics Bulletin 1, 1-13. Professors Baser and Pema kindly provided the data. Data loads lazily.

data('big9salary')

Format

A data.frame with 786 observations on 30 variables:

  • id: person identifier

  • year: 92, 95, or 99

  • salary: annual salary, $

  • pubindx: publication index

  • totpge: standardized total article pages

  • assist: =1 if assistant professor

  • assoc: =1 if associate professor

  • prof: =1 if full professor

  • chair: =1 if department chair

  • top20phd: =1 if Ph.D. from top 20 dept.

  • yearphd: year Ph.D. obtained

  • female: =1 if female

  • osu: =1 if Ohio State U.

  • iowa: =1 if U. Iowa

  • indiana: =1 if Indiana U.

  • purdue: =1 if Purdue U.

  • msu: =1 if Michigan State U.

  • minn: =1 if U. Minnesota

  • mich: =1 if U. Michigan

  • wisc: =1 if U. Wisconsin

  • illinois: =1 if U. Illinois

  • y92: =1 if year == 92

  • y95: =1 if year == 95

  • y99: =1 if year == 99

  • lsalary: log(salary)

  • exper: years since first teaching job

  • expersq: exper^2

  • pubindxsq: pubindx^2

  • pubindx0: =1 if pubindx == 0

  • lpubindx: log(pubindx) if pubindx > 0
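
Several columns above are documented as transforms of others (`lsalary`, `expersq`, `pubindxsq`, `lpubindx`, and the year dummies). The following is a minimal sketch of how one might check those definitions against the data, assuming the wooldridge package is installed. The object names `year_ok`, `gap`, and `pos` are illustrative and not part of the package. Small nonzero gaps may simply reflect storage precision.

```r
library(wooldridge)
data("big9salary")

# Year dummies should agree with the year variable (NA comparisons are skipped).
year_ok <- with(big9salary,
                all(y92 == (year == 92),
                    y95 == (year == 95),
                    y99 == (year == 99),
                    na.rm = TRUE))

# Largest absolute gap between each stored transform and its recomputed value.
# Rows with missing values (for example, missing salary) are ignored.
gap <- with(big9salary, c(
  lsalary   = max(abs(lsalary - log(salary)), na.rm = TRUE),
  expersq   = max(abs(expersq - exper^2), na.rm = TRUE),
  pubindxsq = max(abs(pubindxsq - pubindx^2), na.rm = TRUE)
))

# lpubindx is documented only for pubindx > 0, so compare on that subset.
pos <- subset(big9salary, pubindx > 0)
gap["lpubindx"] <- max(abs(pos$lpubindx - log(pos$pubindx)), na.rm = TRUE)

year_ok
gap
```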

Source

https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9781111531041

Notes

This is an unbalanced panel data set: up to three years of data are available for each faculty member, but some members have fewer than three. It is not clear that a fixed effects (FE) or first-differencing (FD) analysis makes sense here. Those approaches remove unobserved heterogeneity, which in this case includes faculty intelligence, talent, and motivation, and so they arguably control for too much. Presumably these factors enter into the publication index, and it is hard to think we want to hold the main factors driving productivity fixed when trying to measure the effect of productivity on salary. Pooled OLS regression with “cluster robust” standard errors seems more natural. On the other hand, if we want to measure the return to having a degree from a top 20 Ph.D. program, then we would want to control for factors that cause selection into a top 20 program. Unfortunately, this variable does not change over time, and so FD and FE are not applicable.
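
The pooled OLS approach with cluster-robust standard errors mentioned above could be sketched as follows. This is an illustration only: the regression specification is a hypothetical choice made for the example, not one taken from the source, and it assumes the sandwich and lmtest packages are installed alongside wooldridge. Standard errors are clustered on `id` because each faculty member can appear in up to three years.

```r
library(wooldridge)
library(sandwich)  # vcovCL() for cluster-robust covariance matrices
library(lmtest)    # coeftest() for inference with a supplied covariance

data("big9salary")

# How unbalanced is the panel? Tabulate the number of rows per faculty member.
obs_per_id <- table(table(big9salary$id))
obs_per_id

# Keep rows with complete data on the variables used below
# (salary, and hence lsalary, is missing for some person-years).
vars <- c("lsalary", "pubindx", "exper", "expersq",
          "top20phd", "female", "y95", "y99", "id")
dat <- big9salary[complete.cases(big9salary[, vars]), ]

# Pooled OLS; y92 is the omitted base year. Hypothetical specification.
pooled <- lm(lsalary ~ pubindx + exper + expersq + top20phd + female + y95 + y99,
             data = dat)

# Covariance matrix clustered by faculty member, then the coefficient table.
vc <- vcovCL(pooled, cluster = ~ id)
coeftest(pooled, vcov. = vc)
```

Clustering on `id` allows the errors for the same person to be correlated across years, which the usual OLS standard errors ignore.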

Used in Text: not used

Examples

str(big9salary)
#> 'data.frame': 786 obs. of 30 variables:
#>  $ id       : int 101 101 101 102 102 102 103 103 103 104 ...
#>  $ year     : int 92 95 99 92 95 99 92 95 99 92 ...
#>  $ salary   : int NA NA 107100 79420 88239 100450 87450 96831 108290 NA ...
#>  $ pubindx  : num 30.5 31 40.5 33.5 33.9 ...
#>  $ totpge   : num 92.7 107.2 186.5 127.5 133 ...
#>  $ assist   : int 0 0 0 0 0 0 0 0 0 1 ...
#>  $ assoc    : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ prof     : int 1 1 1 1 1 1 1 1 1 0 ...
#>  $ chair    : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ top20phd : int 0 0 0 0 0 0 1 1 1 1 ...
#>  $ yearphd  : int 73 73 73 76 76 76 61 61 61 91 ...
#>  $ female   : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ osu      : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ iowa     : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ indiana  : int 1 1 1 1 1 1 1 1 1 1 ...
#>  $ purdue   : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ msu      : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ minn     : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ mich     : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ wisc     : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ illinois : int 0 0 0 0 0 0 0 0 0 0 ...
#>  $ y92      : int 1 0 0 1 0 0 1 0 0 1 ...
#>  $ y95      : int 0 1 0 0 1 0 0 1 0 0 ...
#>  $ y99      : int 0 0 1 0 0 1 0 0 1 0 ...
#>  $ lsalary  : num NA NA 11.6 11.3 11.4 ...
#>  $ exper    : int 19 22 26 16 19 23 31 34 38 1 ...
#>  $ expersq  : int 361 484 676 256 361 529 961 1156 1444 1 ...
#>  $ pubindxsq: num 933 959 1636 1125 1149 ...
#>  $ pubindx0 : num 0 0 0 0 0 0 0 0 0 0 ...
#>  $ lpubindx : num 3.42 3.43 3.7 3.51 3.52 ...
#>  - attr(*, "time.stamp")= chr "22 Jan 2013 14:09"