Wooldridge Source: I took a random sample of data reported in the May 6, 1991 issue of Businessweek. Data loads lazily.

data('ceosal1')

Format

A data.frame with 209 observations on 12 variables:

  • salary: 1990 salary, thousands $

  • pcsalary: percent change salary, 89-90

  • sales: 1990 firm sales, millions $

  • roe: return on equity, 88-90 avg

  • pcroe: percent change roe, 88-90

  • ros: return on firm's stock, 88-90

  • indus: =1 if industrial firm

  • finance: =1 if financial firm

  • consprod: =1 if consumer product firm

  • utility: =1 if transport. or utilities

  • lsalary: natural log of salary

  • lsales: natural log of sales
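
As a quick check on the two log columns, they can be rebuilt from the level variables with log(). The sketch below is only illustrative: it uses the first five salary and sales values copied from the str() listing under Examples, and the name ceo_head is made up here, not something the package provides.

```r
# First five rows of salary and sales, copied from the str() listing
# of ceosal1 in the Examples section.
ceo_head <- data.frame(
  salary = c(1095, 1001, 1122, 578, 1368),      # thousands of dollars
  sales  = c(27595, 9958, 6126, 16246, 21783)   # millions of dollars
)

# Rebuild the log columns from the level variables (natural log).
ceo_head$lsalary <- log(ceo_head$salary)
ceo_head$lsales  <- log(ceo_head$sales)

# Round for display so the values can be compared with the listing.
print(round(ceo_head, 2))
```

The rounded lsalary and lsales values should agree with the first five entries that str() reports for those columns.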

Notes

This kind of data collection is relatively easy for students just learning data analysis, and the findings can be interesting. A good term project is to have students collect a similar data set using a more recent issue of Businessweek, and to find additional variables that might explain differences in CEO compensation. My impression is that the public is still interested in CEO compensation. An interesting question is whether the list of explanatory variables included in this data set now explains less of the variation in log(salary) than it used to.
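
One way to put a number on "how much of the variation in log(salary) is explained" is the R-squared from a linear regression. The sketch below is a rough illustration, not the specification used in the text: the helper salary_r2 is hypothetical, and the choice of lsales and roe as regressors is an assumption made for the example. It runs on ceosal1 only if the wooldridge package is installed.

```r
# Hypothetical helper (not part of the wooldridge package): regress
# log(salary) on log(sales) and roe, then return the R-squared, i.e. the
# share of the variation in lsalary that the regressors account for.
salary_r2 <- function(df) {
  fit <- lm(lsalary ~ lsales + roe, data = df)
  summary(fit)$r.squared
}

# Apply it to ceosal1 when the wooldridge package is available.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("ceosal1", package = "wooldridge")
  print(salary_r2(ceosal1))
}
```

Students who collect a more recent sample with the same column names could pass it to the same helper and compare the two R-squared values.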

Used in Text: pages 32, 35-36, 39, 159-160, 218-219, 260-261, 263, 685, 692-693

Examples

 str(ceosal1)
#> 'data.frame':	209 obs. of  12 variables:
#>  $ salary  : int  1095 1001 1122 578 1368 1145 1078 1094 1237 833 ...
#>  $ pcsalary: int  20 32 9 -9 7 5 10 7 16 5 ...
#>  $ sales   : num  27595 9958 6126 16246 21783 ...
#>  $ roe     : num  14.1 10.9 23.5 5.9 13.8 ...
#>  $ pcroe   : num  106.4 -30.6 -16.3 -25.7 -3 ...
#>  $ ros     : int  191 13 14 -21 56 55 62 44 37 37 ...
#>  $ indus   : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ finance : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ consprod: int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ utility : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ lsalary : num  7 6.91 7.02 6.36 7.22 ...
#>  $ lsales  : num  10.23 9.21 8.72 9.7 9.99 ...
#>  - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"