Wooldridge Source: I took a random sample of data reported in the May 6, 1991 issue of Businessweek. Data loads lazily.
data('ceosal1')
A data.frame with 209 observations on 12 variables:
salary: 1990 salary, thousands $
pcsalary: percent change salary, 89-90
sales: 1990 firm sales, millions $
roe: return on equity, 88-90 avg
pcroe: percent change roe, 88-90
ros: return on firm's stock, 88-90
indus: =1 if industrial firm
finance: =1 if financial firm
consprod: =1 if consumer product firm
utility: =1 if transport. or utilities
lsalary: natural log of salary
lsales: natural log of sales
https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9781111531041
This kind of data collection is relatively easy for students just learning data analysis, and the findings can be interesting. A good term project is to have students collect a similar data set using a more recent issue of Businessweek, and to find additional variables that might explain differences in CEO compensation. My impression is that the public is still interested in CEO compensation. An interesting question is whether the list of explanatory variables included in this data set now explains less of the variation in log(salary) than it used to.
Used in Text: pages 32, 35-36, 39, 159-160, 218-219, 260-261, 263, 685, 692-693
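The question in the notes above, how much of the variation in log(salary) the listed variables explain, can be approached by fitting a regression and reading off its R-squared. The following is a minimal sketch, not the textbook's specification: the helper name `log_salary_r2`, the choice of regressors, and the use of `indus` as the omitted reference group are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the wooldridge package): regress log salary
# on firm characteristics and return the R-squared of the fit.
# Assumption: indus is left out so it serves as the reference industry group.
log_salary_r2 <- function(df) {
  fit <- lm(lsalary ~ lsales + roe + ros + finance + consprod + utility,
            data = df)
  summary(fit)$r.squared
}

# Run on ceosal1 only if the wooldridge package is installed.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("ceosal1", package = "wooldridge")
  print(log_salary_r2(ceosal1))
}
```

Applying the same helper to a newly collected data set with the same column names would give a comparable R-squared, which is one way to frame the term project described above.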
str(ceosal1)
#> 'data.frame': 209 obs. of 12 variables:
#> $ salary : int 1095 1001 1122 578 1368 1145 1078 1094 1237 833 ...
#> $ pcsalary: int 20 32 9 -9 7 5 10 7 16 5 ...
#> $ sales : num 27595 9958 6126 16246 21783 ...
#> $ roe : num 14.1 10.9 23.5 5.9 13.8 ...
#> $ pcroe : num 106.4 -30.6 -16.3 -25.7 -3 ...
#> $ ros : int 191 13 14 -21 56 55 62 44 37 37 ...
#> $ indus : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ finance : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ consprod: int 0 0 0 0 0 0 0 0 0 0 ...
#> $ utility : int 0 0 0 0 0 0 0 0 0 0 ...
#> $ lsalary : num 7 6.91 7.02 6.36 7.22 ...
#> $ lsales : num 10.23 9.21 8.72 9.7 9.99 ...
#> - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"
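As a quick sanity check on the variable descriptions, the sketch below confirms that `lsalary` and `lsales` are the natural logs of `salary` and `sales`. It uses only the first five values printed by `str()` above, so it runs without the package. The printed values are rounded, which is why the comparison uses a tolerance; the 0.01 threshold is an assumption chosen to match that rounding.

```r
# First five values as displayed in the str() output above.
salary  <- c(1095, 1001, 1122, 578, 1368)
sales   <- c(27595, 9958, 6126, 16246, 21783)
lsalary <- c(7, 6.91, 7.02, 6.36, 7.22)
lsales  <- c(10.23, 9.21, 8.72, 9.7, 9.99)

# str() shows rounded values, so compare with a tolerance rather than exactly.
salary_ok <- all(abs(log(salary) - lsalary) < 0.01)
sales_ok  <- all(abs(log(sales) - lsales) < 0.01)
print(c(salary_ok = salary_ok, sales_ok = sales_ok))
```

With the full data frame loaded, the same idea can be applied to every row, for example with `all.equal(log(ceosal1$salary), ceosal1$lsalary)`, although the stored logs may have been rounded and could need a tolerance there as well.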