Wooldridge Source: Data on NCAA men’s basketball teams, collected by Weizhao Sun for a senior seminar project in sports economics at Michigan State University, Spring 2017. He used various sources, including www.espn.com and www.teamrankings.com/ncaa-basketball/rpi-ranking/rpi-rating-by-team. Data loads lazily.

data('ncaa_rpi')

Format

A data.frame with 336 observations on 14 variables:

  • team: Name

  • year: Year

  • conference: Conference

  • postrpi: Post Rank

  • prerpi: Preseason Rank

  • postrpi_1: Post Rank 1 yr ago

  • postrpi_2: Post Rank 2 yrs ago

  • recruitrank: Recruits Rank

  • wins: Number of games won

  • losses: Number of games lost

  • winperc: Winning Percentage

  • tourney: Tournament dummy

  • coachexper: Coach Experience

  • power5: PowerFive Dummy

Source

http://www.cengage.com/c/introductory-econometrics-a-modern-approach-7e-wooldridge

Notes

This is a nice example of how multiple regression analysis can be used to determine whether rankings compiled by experts – the so-called pre-season RPI in this case – provide additional information beyond what we can obtain from widely available databases. A simple and interesting question is whether, once the previous year’s post-season RPI is controlled for, the pre-season RPI – which is supposed to add information on recruiting and player development – helps to predict performance (such as win percentage or making it to the NCAA men’s basketball tournament). For the binary outcome that indicates making it to the NCAA tournament, a probit or logit model can be used in courses that introduce more advanced methods. There are some other interesting variables, such as coaching experience, that can be included, too.

Used in Text: not used

Examples

str(ncaa_rpi)
#> 'data.frame': 336 obs. of 14 variables:
#>  $ team       : chr "Boston College" "Boston College" "Boston College" "Boston College" ...
#>  $ year       : chr "2003-2004" "2009-2010" "2012-2013" "2015-2016" ...
#>  $ conference : chr "ACC" "ACC" "ACC" "ACC" ...
#>  $ postrpi    : int 19 115 114 249 104 41 187 126 1 2 ...
#>  $ prerpi     : int 37 53 223 102 90 34 119 76 11 12 ...
#>  $ postrpi_1  : int 44 67 244 161 125 37 115 107 10 3 ...
#>  $ postrpi_2  : int 55 131 65 204 176 23 55 63 4 7 ...
#>  $ recruitrank: int 97 300 69 69 88 10 46 79 55 24 ...
#>  $ wins       : int 24 15 16 7 10 21 13 17 31 35 ...
#>  $ losses     : int 10 16 17 25 18 11 18 14 6 5 ...
#>  $ winperc    : num 70.6 48.4 48.5 21.9 35.7 ...
#>  $ tourney    : int 1 0 0 0 0 1 0 0 1 1 ...
#>  $ coachexper : int 23 27 28 25 28 34 21 24 29 35 ...
#>  $ power5     : int 1 1 1 1 1 1 1 1 1 1 ...
#>  - attr(*, "time.stamp")= chr "21 Dec 2018 17:07"
#>  - attr(*, "label.table")= list()
#>  - attr(*, "expansion.fields")= list()
#>  - attr(*, "byteorder")= chr "LSF"
#>  - attr(*, "orig.dim")= int [1:2] 336 14
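
The question raised in the Notes (whether the pre-season RPI adds predictive information once last year's post-season RPI is controlled for) might be explored along the following lines. This is only a sketch: it assumes the wooldridge package is installed, and the particular choice of regressors and the object names fit_lm and fit_probit are illustrative rather than taken from the textbook.

```r
# Sketch only: assumes the wooldridge package is installed.
library(wooldridge)
data('ncaa_rpi')

# Linear model: does prerpi help predict winning percentage once
# last season's post-season rank (postrpi_1) is controlled for?
# (Regressor choice is illustrative, not prescribed by the source.)
fit_lm <- lm(winperc ~ prerpi + postrpi_1, data = ncaa_rpi)
summary(fit_lm)

# Probit for the binary tournament outcome, adding coaching experience
# as one of the "other interesting variables" mentioned in the Notes.
fit_probit <- glm(tourney ~ prerpi + postrpi_1 + coachexper,
                  family = binomial(link = "probit"),
                  data = ncaa_rpi)
summary(fit_probit)
```

In both fits, the coefficient on prerpi and its t (or z) statistic indicate whether the expert ranking carries information beyond postrpi_1. Replacing link = "probit" with link = "logit" gives the logit version.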