Retrieve Fantasy Football Stats using ESPN’s API: Part 3

Welcome to Part 3 of our series on how to scrape fantasy football data from ESPN’s API. Let’s go ahead and recap what we’ve done so far:

  • set up a main class where we are making API calls to ESPN using the RestClient gem
  • built a data_source class to store all of our data and global variables
  • created a player_helper class to extract some logic out of our main class
  • parsed through one week of data for a league and output in CSV format

We’re a good way through retrieving our data and storing it in a format that is ready for output. We still have a little bit more data that we would like to capture, so let’s go ahead and knock that out now.

It’s nice that we have data such as basic player data (name, position, etc.), projected stats and actual stats. It would be great, however, for us to get even more granular statistics for each player and week. In fantasy football, our players gain points by accumulating stats such as gaining yards and scoring touchdowns. Wouldn’t it be nice to break down this data to see where our players’ points are coming from on a weekly or seasonal basis?

Let’s start by seeing where this data is located in our API response. Last time we were already looking inside the response data at the following location:

data['teams']['roster']['entries']['playerPoolEntry']['player']['stats']

Now we’re going to dig a little deeper into this same data, and append another ['stats'] to the end of the above location. If we look inside here, we will see a hash with a bunch of seemingly random numbers as keys, and more random numbers as values. These keys actually correspond to certain statistical categories. There is no way to know this without digging through the data with a little trial and error, but luckily I will provide the keys we’re going to use.

So let’s define a new hash inside our data_source file to map out what these keys correspond to.

STAT_KEYS = {
  'pass_attempts' => '0',
  'completions' => '1',
  'pass_yards' => '3',
  'pass_tds' => '4',
  'interceptions' => '20',
  'rush_attempts' => '23',
  'rush_yards' => '24',
  'rush_tds' => '25',
  'receptions' => '41',
  'receiving_yards' => '42',
  'receiving_tds' => '43'
}
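
To get a feel for what this mapping buys us, here is a minimal sketch that labels a raw stats hash. The values in RAW_STATS are made up for illustration (they are not real ESPN output), and label_raw_stats is a hypothetical helper that is not part of our project.

```ruby
# Same mapping as in our data_source file.
STAT_KEYS = {
  'pass_attempts' => '0',
  'completions' => '1',
  'pass_yards' => '3',
  'pass_tds' => '4',
  'interceptions' => '20',
  'rush_attempts' => '23',
  'rush_yards' => '24',
  'rush_tds' => '25',
  'receptions' => '41',
  'receiving_yards' => '42',
  'receiving_tds' => '43'
}.freeze

# Hypothetical sample of the inner 'stats' hash for one player and week.
# The numbers are invented; '58' stands in for any key we have not mapped.
RAW_STATS = { '23' => 18.0, '24' => 92.0, '25' => 1.0, '58' => 3.0 }.freeze

# Hypothetical helper: swap ESPN's numeric keys for readable names and
# skip any key that is missing from STAT_KEYS.
def label_raw_stats(raw)
  names_by_id = STAT_KEYS.invert # e.g. '24' => 'rush_yards'
  raw.each_with_object({}) do |(id, value), labeled|
    name = names_by_id[id]
    labeled[name] = value if name
  end
end

p label_raw_stats(RAW_STATS)
```

Only the three rushing stats survive here, because '58' has no entry in our mapping.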

For the sake of simplicity, we will ignore defensive stats and kicker stats (who cares about kickers anyway, right?!). Now, if we remember back to last time, we have our get_stats method inside the player_helper class. We have to do a little searching in there based on the statSourceId and the scoringPeriodId. The entry we find there is the same one that we’ll want to pull our detailed stats from. So, let’s add another method to the player_helper class to retrieve these detailed stats.

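def get_detailed_stats(player)
  result = {}
  # The inner 'stats' hash, keyed by ESPN's numeric stat IDs
  stats = player['stats']
  STAT_KEYS.each_pair do |key, value|
    # Use ESPN's number when it returned one; otherwise plug in a 0
    if stats && stats.key?(value)
      result[key] = stats[value]
    else
      result[key] = '0'
    end
  end
  result
end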

The purpose here is to take the STAT_KEYS hash, and swap out the numerical values with the actual corresponding player statistic. We could just return the values without the keys to signify what they are, but if we want to do any work with these down the line then they’ll already be nice and organized. The logic here is fairly straightforward.
1. Take the data we already have and look inside the 2nd ‘stats’ key.
2. Take each numerical value from our STAT_KEYS and check to see if that particular key is returned for that player.
3a. If we do have that key, we’ll store that value in a hash with the key from our STAT_KEYS hash.
3b. If we do NOT have that key, we’ll simply plug in a 0. The reason we look for every stat and plug in 0’s is that we want to have the same exact output for each player so that we can plug it neatly into a spreadsheet.

We’ll plug in the call to this method inside our get_stats method like so:

def get_stats(stats_array, week)
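  actual = ''
  projected = ''
  # Declared before the loop so it is still in scope after the block;
  # defaults to all 0's if no actual entry exists for this week
  details = get_detailed_stats({})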
  stats_array.each do |stat|
    if stat['scoringPeriodId'] == week
      if stat['statSourceId'] == 0
        actual = stat['appliedTotal']
        details = get_detailed_stats(stat)
      elsif stat['statSourceId'] == 1
        .....

We don’t need to do this for the projections for now, although we could parse those out more neatly in a similar fashion if we desired. The output for a single player should look something like this:
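
Since the exact numbers depend on your league, here is a minimal sketch using a made-up stat entry for a running back. SAMPLE_STAT is hypothetical data shaped like the entries described above, not real ESPN output; the method is a condensed version of the same get_detailed_stats logic.

```ruby
# Same mapping as in our data_source file.
STAT_KEYS = {
  'pass_attempts' => '0',
  'completions' => '1',
  'pass_yards' => '3',
  'pass_tds' => '4',
  'interceptions' => '20',
  'rush_attempts' => '23',
  'rush_yards' => '24',
  'rush_tds' => '25',
  'receptions' => '41',
  'receiving_yards' => '42',
  'receiving_tds' => '43'
}.freeze

# Condensed version of the method from our player_helper class.
def get_detailed_stats(player)
  result = {}
  stats = player['stats']
  STAT_KEYS.each_pair do |key, value|
    result[key] = stats && stats.key?(value) ? stats[value] : '0'
  end
  result
end

# Hypothetical "actual" entry for a running back in week 1 (invented numbers).
SAMPLE_STAT = {
  'scoringPeriodId' => 1,
  'statSourceId' => 0,
  'appliedTotal' => 17.2,
  'stats' => { '23' => 18.0, '24' => 92.0, '25' => 1.0, '41' => 2.0, '42' => 20.0 }
}.freeze

details = get_detailed_stats(SAMPLE_STAT)
details.each { |name, value| puts "#{name}: #{value}" }

# The same values as they would appear in a CSV row:
puts details.values.join(',') # prints 0,0,0,0,0,18.0,92.0,1.0,2.0,20.0,0
```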

We can see here that there are a lot of 0’s plugged in, and that’s ok. Most players will only accumulate a few stats that are actually relevant to their position. Now, we will want to return this data to our main class along with the projected and actual values that we are already sending. Let’s send the whole hash for now, and we can include the logic of how to deal with that down the line. The last line of our get_stats method should now look like this:

{actual: actual, projected: projected, details: details}

When we hop back to our main class, our stats variable will now include these broken down player stats. Last time we were storing all of our data in a comma delimited string named result. To append this data to our string, we can simply loop through our stats[:details] hash and append each value to our result string. This loop should appear immediately after our result variable is populated.

stats[:details].each_value do |value|
  result << value.to_s + ','
end
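
To try this step on its own, here is a small sketch. The starting string and the details hash are placeholders I made up, since the real ones come from the loops in our main class, and append_details is a hypothetical wrapper used only so the snippet can run standalone.

```ruby
# Hypothetical wrapper around the loop above so it can run by itself.
def append_details(result, details)
  details.each_value do |value|
    result << "#{value},"
  end
  result
end

# Placeholder row prefix and stats; the real values come from the main class.
# The unary plus makes the string literal mutable so << can modify it.
row = +'Some Owner,1,'
append_details(row, { 'rush_yards' => 92.0, 'rush_tds' => 1.0 })

puts row # prints Some Owner,1,92.0,1.0,
```

Note that each value brings its own trailing comma, so anything appended afterwards lands in the next column.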

Now the rest of our logic can stay the same, and we have a nice comma delimited string to output to a file. Now we can finally hop outside of all of our nested loops, and write the code to output the data. Feel free to name your output file whatever you choose. In this case I’ve decided on simply calling it “output” because I’m not very creative.

If you are familiar with file opening modes, we have a few to choose from here. We can use either “w”, “w+”, “a” or “a+”. (If you are not familiar, “w” will write after truncating the destination to size 0 if it already exists, and “a” will append to the end of the file if it already exists. The “+” just changes the mode to read-write instead of write only.) Since we don’t need to read the file, and appending on every run would build up a massive CSV file over time, we should be fine with choosing “w”. So our output code will read as follows:
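
If you want to see the difference between the two modes for yourself, here is a minimal sketch that writes to a throwaway file in a temporary directory. write_twice is a hypothetical helper for this demonstration only and is not part of our project.

```ruby
require 'tmpdir'

# Open the same path twice with the given mode, writing one line each time,
# then return the file's final contents. The temporary directory is removed
# automatically when the block ends.
def write_twice(mode)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'output.csv')
    File.open(path, mode) { |f| f.puts 'first run' }
    File.open(path, mode) { |f| f.puts 'second run' }
    File.read(path)
  end
end

p write_twice('w') # "second run\n" (the first line was truncated away)
p write_twice('a') # "first run\nsecond run\n" (both lines kept)
```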

File.open(ROOT + '/output.csv', 'w') do |f|
  output.each do |string|
    f << "#{string}\n"
  end
  # No f.close needed: File.open closes the file when the block ends
end

(In my case this will write a file called “output.csv” to my root directory. You may choose to store your output in a different location.)

And voila! We should now have a nice CSV file with all of our week 1 data. If you open this in Excel, then it will recognize the CSV format for you and move all of your data into the proper columns.

This looks great, but we don’t have any column headers! In order to prevent the need for adding these in every single time, let’s go back and add one last piece of data to our data_source file to easily store our column headers.

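# A plain array of strings rather than %w(...), because %w splits on
# whitespace and would break 'Pass Attempts' into two separate headers
OUTPUT_HEADERS = [
  'Owner',
  'Week',
  'Season',
  'Position',
  'FName',
  'LName',
  'Starter?',
  'Actual',
  'Projected',
  'PlayerID',
  'Pass Attempts',
  'Completions',
  'Pass Yards',
  'Pass TDs',
  'INTs',
  'Rush Attempts',
  'Rush Yards',
  'Rush TDs',
  'Receptions',
  'Rec. Yds',
  'Rec. TDs'
]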

Then when we declare our output variable in our main file we can simply add one extra line:

output = []
output << OUTPUT_HEADERS.join(',')
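
One cheap sanity check before opening the file in Excel is to confirm that every row has as many fields as the header line. Here is a minimal sketch of that idea, where HEADERS is a shortened stand-in for OUTPUT_HEADERS, the sample rows are invented, and mismatched_rows is a hypothetical helper. It uses a naive split on commas, so it assumes no field contains a comma itself.

```ruby
# Shortened stand-in for OUTPUT_HEADERS.
HEADERS = ['Owner', 'Week', 'Rush Yards', 'Rush TDs'].freeze

# Hypothetical helper: return the rows whose field count differs from the
# header line. The -1 limit keeps trailing empty fields, so a trailing
# comma counts as one extra (empty) column.
def mismatched_rows(output)
  expected = output.first.split(',', -1).length
  output.drop(1).reject { |row| row.split(',', -1).length == expected }
end

output = []
output << HEADERS.join(',')
output << 'Some Owner,1,92.0,1.0'  # 4 fields, matches the header
output << 'Other Owner,1,40.0,0,'  # trailing comma adds an empty 5th field

p mismatched_rows(output)
```

Any row this returns will not line up with the column headers in the spreadsheet.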

This about wraps up the Ruby code we need to write for pulling data. The way we have organized and written our code should allow for further expansion if you want to add new features on your own. The work now shifts over to the Excel side to actually make use of this data and create some pretty graphs and charts. What you want to do with this data is up to you, but here are a few examples of what I have made personally.

  • Breaking down each individual statistic by owner
  • Difference between actual and projected points per owner over the course of the season
  • Season overview of points scored above or below projections

Developing these is a great way to get familiar with VLOOKUPs and other fun formulas. I think going into too much detail is outside the scope of this post, but most of the data isn’t too hard to put together with a little help from Google.

I hope you’ve enjoyed this series of posts, learned a little something, and hopefully have been able to use this code for your own purposes. If you have any questions or issues with any of the code mentioned here, feel free to contact me through the Contact Us section of the site.
