Mth 111 Mathematics as a Human Pursuit Lab  -  Correlation and Golfing

Download the golf.xls file

A correlation is a relationship between two variables (x, y). We can represent this relationship by producing a scatter plot. Normally one of the variables is considered to be the independent (or explanatory) variable, and the other the dependent (or response) variable.

The correlation coefficient can be used to measure the strength and direction of a linear relationship between two variables.
In Excel, the CORREL function is used to calculate this number.
The correlation coefficient r is always between -1 and 1.
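
For reference, the usual sample (Pearson) correlation coefficient for n data pairs can be written as below. The symbols n, x_i, y_i, \bar{x}, and \bar{y} are notation introduced here for illustration; they do not come from the lab files.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the means of the x and y values. Values of r near -1 or 1 indicate a strong linear relationship, and values near 0 indicate a weak one.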

1.  Construct the scatter plot of Golf Handicaps vs. Stock Rates.

2.   Does there appear to be a correlation?

3.   Check by calculating the coefficient of correlation.
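
If you would like to check the Excel result outside the spreadsheet, the calculation can be sketched in Python as follows. This is a minimal sketch, not part of the lab: the function name pearson_r is hypothetical, and the numbers are made up for illustration and are NOT taken from golf.xls.

```python
import math

def pearson_r(xs, ys):
    """Return the sample correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Made-up illustration values, NOT the golf data.
handicaps = [5.0, 10.0, 15.0, 20.0, 25.0]
stock_ratings = [80.0, 70.0, 65.0, 50.0, 40.0]

r = pearson_r(handicaps, stock_ratings)
print(round(r, 3))
```

To use it with the lab data, replace the two lists with the handicap and stock rating columns from your spreadsheet.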
 

  • From the article review in CHANCE News 7.0. A more complete version of the criticism of the article can be found at Duffers Need Not Apply.
  •  "Duffers Need Not Apply; Data Show that Good Golfers Make the Best
    C.E.O.'s"  by Adam Bryant. The New York Times, 31 May 1998, Section 3, p. 1.

    Investment compensation consultant Graef Crystal carried out this study for the Times. It purports to find a strong correlation between the stock performance of major companies and the golfing prowess of their chief executives. Crystal obtained data on C.E.O.'s golf handicaps from the journal Golf Digest, and used his own data on the stock market performance of 51 Fortune 500 companies. He created a "stock performance ranking" which gave each company a score based on how investors who held their stock did over a three-year period, with 100 being the highest rating and 0 the lowest.
     

    Crystal identified the following seven points on his list as outliers and removed them from the analysis; this procedure is described in the article as scientific sifting.

    Name                    Company                 Handicap    Stock rating
    Scott_G._McNealy        Sun_Microsystems        3.2         97
    William_H._Gates        Microsoft               23.9        95
    Sanford_I._Weill        Travelers_Group         18          95
    Frank_V._Cahouet        Mellon_Bank             22          92
    William_C._Steere_Jr.   Pfizer                  34          89
    Donald_B._Marron        Paine_Webber            25          89
    Christopher_B._Galvin   Motorola                11.7        3
     

    4.  Remove these from your data set in Excel (highlight the cells, then choose Delete and Shift cells up). Check the coefficient of correlation as you remove each name.
     
    The correlation coefficient between stock rate and handicap for the remaining data points is -0.414. This value is not reported in the article, but Crystal is quoted as saying: "For all the different factors I've tested as possible links to predicting which C.E.O.'s are going to perform well or poorly, this is certainly the oddest -- but also the strongest -- I've seen. There's got to be something there."

    The article raises a number of questions for discussion relating to data snooping and the treatment of alleged outliers. For the full dataset, the correlation between handicap and stock rating is only -0.042! There are also issues of response bias; it turns out that when Golf Digest asked C.E.O.'s of the 300 largest Fortune 500 corporations for their golf handicaps, only 72 replied. (Of these, Crystal used the 51 for which he had corporate data.) Finally, while the article clearly presents the findings quite seriously, some of Crystal's commentary sounds tongue-in-cheek. He says of C.E.O.'s, "...if they can get their handicap down to 4, why not just pay them an extra 20 million bucks?"
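
The effect of removing points after looking at the data can be explored with a small Python sketch. Everything in it is an assumption made for illustration: the data are made up and are NOT the golf data, and the greedy snoop helper is hypothetical. It is not Crystal's procedure, which the article does not spell out; it only shows how far a near-zero correlation can move when the points to discard are chosen with the result in mind.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def snoop(xs, ys, n_drop):
    """Greedily drop n_drop points, each time removing the point
    whose removal makes r as negative as possible."""
    xs, ys = list(xs), list(ys)
    for _ in range(n_drop):
        best_i = min(
            range(len(xs)),
            key=lambda i: pearson_r(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]),
        )
        del xs[best_i], ys[best_i]
    return xs, ys

# Made-up values with almost no linear relationship (NOT the golf data).
x = [2, 5, 8, 11, 14, 17, 20, 23, 26, 29]
y = [60, 35, 75, 20, 55, 40, 80, 30, 65, 45]

r_before = pearson_r(x, y)
kept_x, kept_y = snoop(x, y, 3)
r_after = pearson_r(kept_x, kept_y)

print("r with all points:      ", round(r_before, 3))
print("r after dropping 3 pts: ", round(r_after, 3))
```

Comparing the two printed values gives a feel for why a correlation reported only after "sifting" should be read with caution.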