Wooldridge Source: Collected by G. Mark Holmes, a former MSU undergraduate, for a term project. The salary data were obtained from the New York Times, April 11, 1993. The baseball statistics are from The Baseball Encyclopedia, 9th edition, and the city population figures are from the Statistical Abstract of the United States. Data loads lazily.
data('mlb1')
A data.frame with 353 observations on 47 variables:
salary: 1993 season salary
teamsal: team payroll
nl: =1 if national league
years: years in major leagues
games: career games played
atbats: career at bats
runs: career runs scored
hits: career hits
doubles: career doubles
triples: career triples
hruns: career home runs
rbis: career runs batted in
bavg: career batting average (scaled by 1000, so 289 means .289)
bb: career walks
so: career strike outs
sbases: career stolen bases
fldperc: career fielding percentage (scaled by 1000, so 989 means .989)
frstbase: =1 if first base
scndbase: =1 if second base
shrtstop: =1 if shortstop
thrdbase: =1 if third base
outfield: =1 if outfield
catcher: =1 if catcher
yrsallst: years as all-star
hispan: =1 if hispanic
black: =1 if black
whitepop: white pop. in city
blackpop: black pop. in city
hisppop: hispanic pop. in city
pcinc: city per capita income
gamesyr: games per year in league
hrunsyr: home runs per year
atbatsyr: at bats per year
allstar: percent of years an all-star
slugavg: career slugging average (scaled by 100, so 46 means .460)
rbisyr: rbis per year
sbasesyr: stolen bases per year
runsyr: runs scored per year
percwhte: percent white in city
percblck: percent black in city
perchisp: percent hispanic in city
blckpb: black*percblck
hispph: hispan*perchisp
whtepw: white*percwhte, where white = 1 - black - hispan (white is not stored as a column)
blckph: black*perchisp
hisppb: hispan*percblck
lsalary: log(salary)
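Several of the later columns are simple transformations of earlier ones. As a minimal sketch, the snippet below rebuilds a few of them for the first three players shown in the str() listing on this page. The data frame `first3` is a hypothetical helper that only holds values copied from that listing. The formulas (per-year rates as career totals divided by `years`, city shares as one group's population over the sum of the three groups) are inferred from the printed values and are an assumption, not a statement of how the original file was built.

```r
# First three players, values copied from the str(mlb1) listing on this page.
# `first3` is a hypothetical helper object, not part of the package.
first3 <- data.frame(
  salary   = c(6329213, 3375000, 3100000),
  years    = c(12, 8, 5),
  games    = c(1705, 918, 751),
  atbats   = c(6705, 3333, 2807),
  hits     = c(1939, 863, 840),
  doubles  = c(320, 156, 148),
  triples  = c(67, 38, 18),
  hruns    = c(231, 73, 46),
  yrsallst = c(9, 2, 0),
  black    = c(0, 1, 0),
  whitepop = 5772110,   # city-level values, identical for these players
  blackpop = 1547725,
  hisppop  = 893422
)

# Assumed formulas for the derived columns (inferred, see the note above).
rebuilt <- within(first3, {
  lsalary  <- log(salary)                      # natural log of salary
  gamesyr  <- games / years                    # games per year in league
  hrunsyr  <- hruns / years                    # home runs per year
  allstar  <- 100 * yrsallst / years           # percent of years an all-star
  bavg     <- 1000 * hits / atbats             # batting average, scaled by 1000
  slugavg  <- 100 * (hits + doubles + 2 * triples + 3 * hruns) / atbats
  percblck <- 100 * blackpop / (whitepop + blackpop + hisppop)
  blckpb   <- black * percblck                 # interaction term
})

# Compare with the corresponding columns in the str() listing.
derived <- c("lsalary", "gamesyr", "hrunsyr", "allstar",
             "bavg", "slugavg", "percblck", "blckpb")
print(round(rebuilt[, derived], 2))
```

With the package loaded, the same expressions can be evaluated on the full `mlb1` data frame and compared against the stored columns, for example with `all.equal()` and a tolerance that allows for rounding in the source file.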
https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9781111531041
The baseball statistics are career statistics through the 1992 season. Players whose race or ethnicity could not be easily determined were not included. City population and racial composition figures for Montreal and Toronto in 1993 should not be too difficult to obtain, and comparable data for more recent players are also fairly easy to collect.
Used in Text: pages 143-149, 165, 244-245, 262
str(mlb1)
#> 'data.frame': 353 obs. of 47 variables:
#> $ salary : num 6329213 3375000 3100000 2900000 1650000 ...
#> $ teamsal : num 38407380 38407380 38407380 38407380 38407380 ...
#> $ nl : int 1 1 1 1 1 1 1 1 1 1 ...
#> $ years : int 12 8 5 8 12 17 4 10 4 3 ...
#> $ games : int 1705 918 751 1056 1196 2032 394 432 223 156 ...
#> $ atbats : int 6705 3333 2807 3337 3603 7489 1293 1005 491 434 ...
#> $ runs : int 1076 407 370 405 437 1136 179 78 37 45 ...
#> $ hits : int 1939 863 840 816 928 2145 303 240 118 116 ...
#> $ doubles : int 320 156 148 143 19 270 51 35 16 16 ...
#> $ triples : int 67 38 18 18 16 142 13 5 5 0 ...
#> $ hruns : int 231 73 46 107 124 40 37 13 1 10 ...
#> $ rbis : int 836 342 355 421 541 574 141 95 29 59 ...
#> $ bavg : num 289 259 299 245 258 286 234 239 240 267 ...
#> $ bb : int 619 137 341 306 316 416 77 39 23 18 ...
#> $ so : int 948 582 228 653 725 1098 358 140 62 48 ...
#> $ sbases : int 314 133 41 15 32 660 67 1 6 6 ...
#> $ fldperc : int 989 968 994 971 977 987 965 990 963 971 ...
#> $ frstbase: int 0 0 1 0 0 0 0 0 0 0 ...
#> $ scndbase: int 1 0 0 0 0 0 0 0 0 0 ...
#> $ shrtstop: int 0 1 0 0 0 0 0 0 1 0 ...
#> $ thrdbase: int 0 0 0 1 0 0 0 0 0 0 ...
#> $ outfield: int 0 0 0 0 1 1 1 0 0 1 ...
#> $ catcher : int 0 0 0 0 0 0 0 1 0 0 ...
#> $ yrsallst: int 9 2 0 0 0 2 0 0 0 0 ...
#> $ hispan : int 0 0 0 0 0 0 1 0 1 0 ...
#> $ black : int 0 1 0 0 1 1 0 0 0 1 ...
#> $ whitepop: num 5772110 5772110 5772110 5772110 5772110 ...
#> $ blackpop: num 1547725 1547725 1547725 1547725 1547725 ...
#> $ hisppop : num 893422 893422 893422 893422 893422 ...
#> $ pcinc : int 18840 18840 18840 18840 18840 18840 18840 18840 18840 18840 ...
#> $ gamesyr : num 142.1 114.8 150.2 132 99.7 ...
#> $ hrunsyr : num 19.25 9.12 9.2 13.38 10.33 ...
#> $ atbatsyr: num 559 417 561 417 300 ...
#> $ allstar : num 75 25 0 0 0 ...
#> $ slugavg : num 46 39.4 41.4 39.4 37.5 ...
#> $ rbisyr : num 69.7 42.8 71 52.6 45.1 ...
#> $ sbasesyr: num 26.17 16.62 8.2 1.88 2.67 ...
#> $ runsyr : num 89.7 50.9 74 50.6 36.4 ...
#> $ percwhte: num 70.3 70.3 70.3 70.3 70.3 ...
#> $ percblck: num 18.8 18.8 18.8 18.8 18.8 ...
#> $ perchisp: num 10.9 10.9 10.9 10.9 10.9 ...
#> $ blckpb : num 0 18.8 0 0 18.8 ...
#> $ hispph : num 0 0 0 0 0 ...
#> $ whtepw : num 70.3 0 70.3 70.3 0 ...
#> $ blckph : num 0 10.9 0 0 10.9 ...
#> $ hisppb : num 0 0 0 0 0 ...
#> $ lsalary : num 15.7 15 14.9 14.9 14.3 ...
#> - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"
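Beyond inspecting the structure, the data lend themselves to salary regressions. The following is a sketch under stated assumptions: it assumes the wooldridge package is installed, and the choice of regressors (`years`, `gamesyr`, `bavg`, `hrunsyr`, `rbisyr`) is illustrative rather than a reproduction of any particular table in the text. It fits a model for `lsalary` with and without the three performance measures and compares the two fits with an F-test. The object names `restricted` and `unrestricted` are hypothetical.

```r
# Assumes the wooldridge package is installed; the block is skipped otherwise.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  data("mlb1", package = "wooldridge")

  # Unrestricted model: log salary on experience, playing time and
  # three performance measures (illustrative specification).
  unrestricted <- lm(lsalary ~ years + gamesyr + bavg + hrunsyr + rbisyr,
                     data = mlb1)

  # Restricted model: the three performance measures are excluded.
  restricted <- lm(lsalary ~ years + gamesyr, data = mlb1)

  # F-test of the joint hypothesis that bavg, hrunsyr and rbisyr
  # all have zero coefficients.
  print(anova(restricted, unrestricted))
}
```

Comparing nested `lm` fits with `anova()` keeps the test tied to the two models as written, so changing the set of excluded variables only requires editing the restricted formula.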