Retrieve Fantasy Football Stats using ESPN’s API: Part 2

Hello again and welcome to part two of our tutorial on how to scrape data from ESPN’s fantasy football API using Ruby. Last time we left off with our basic connection to ESPN, and we had retrieved some solid data. Let’s continue to pull more data and parse it.

First, we have a little bit of cleanup. There are some global variables sitting around that we’d like to get rid of, and we’re also going to be adding static data to reference. So let’s create a data module to house these objects and name it DataSource. We can start by moving our SWID, S2, and league ID (if applicable) variables into this file and assigning them as constants instead of global variables.
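As a rough sketch, data_source.rb might start out like the following. Every value is a placeholder, and the constant name LEAGUE_ID is an assumption; use whatever credentials and naming you settled on in part 1.

```ruby
# data_source.rb
# Placeholder values only: substitute the cookie values and league ID
# from your own ESPN account.
module DataSource
  SWID = '{YOUR-SWID-COOKIE}'
  S2 = 'YOUR-ESPN-S2-COOKIE'
  LEAGUE_ID = '0000000' # hypothetical name; include only if applicable
end
```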

Now that we are working with more than one file, we’ll need to pull these files into main.rb. Since we expect to have only one directory, we can keep this simple and add just our root directory to the load path. Let’s create a constant in main.rb called ROOT_DIR that will look like this:

ROOT_DIR = File.join(File.dirname(__FILE__))

Then we can add that to our load path with this statement:
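A minimal sketch of such a statement, assuming Ruby's built-in `$LOAD_PATH` global (the `include?` guard is an optional addition that avoids duplicate entries):

```ruby
# Directory containing the current file.
ROOT_DIR = File.join(File.dirname(__FILE__))

# Put the root directory at the front of the load path so that
# `require 'data_source'` can find data_source.rb there.
$LOAD_PATH.unshift(ROOT_DIR) unless $LOAD_PATH.include?(ROOT_DIR)
```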


Now we’ll easily be able to pull any files we create in our Root path. Finally we’ll want to require our DataSource module like so:

require 'data_source'
include DataSource

We could loop through our root directory and require every .rb file, but this might be overkill for now. Now that we have access to our DataSource file, we can remove those ugly global variables and update the references to them in our code.

Now we’re ready to start looping through each week to pull down all the various statistics that we’re looking for. The general flow of our code will be the following:

  1. Make an API call for each week of the season to pull in the data. In this case, we will use 2019.
  2. Loop through each team that played that week.
  3. Loop through each player on that team’s roster and parse out their stats.
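The steps above can be sketched as three nested loops. The `fetch_week` method here is a hypothetical stand-in that returns a tiny hand-made response instead of calling ESPN, so the skeleton can be run on its own.

```ruby
# Hypothetical stand-in for the real API call; returns invented data.
def fetch_week(_season, _week)
  { 'teams' => [{ 'id' => 1, 'roster' => { 'entries' => [{ 'playerId' => 42 }] } }] }
end

seen = []
(1..2).each do |week|                          # step 1: one call per week
  data = fetch_week('2019', week)
  data['teams'].each do |team|                 # step 2: each team that week
    team['roster']['entries'].each do |entry|  # step 3: each rostered player
      seen << [week, team['id'], entry['playerId']]
    end
  end
end

puts seen.inspect
```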

Simple enough, right? So, let’s look at the data that we pulled down in part 1 and see what is relevant to us. For now, we will be concerned with the teams key in our hash. The teams key is structured like so:
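Pieced together from the keys used in the rest of this post, a heavily trimmed sketch of one element of the teams array looks roughly like the following. All values are invented placeholders, and the real response carries many more fields.

```ruby
# One trimmed-down team entry; every value is made up for illustration.
team = {
  'id' => 2,
  'roster' => {
    'entries' => [
      {
        'playerId' => 1234567,
        'lineupSlotId' => 0,
        'playerPoolEntry' => {
          'player' => {
            'firstName' => 'Example',
            'lastName' => 'Player',
            'defaultPositionId' => 1,
            'stats' => [
              { 'scoringPeriodId' => 1, 'statSourceId' => 0, 'appliedTotal' => 18.4 },
              { 'scoringPeriodId' => 1, 'statSourceId' => 1, 'appliedTotal' => 16.9 }
            ]
          }
        }
      }
    ]
  }
}

# Digging down to a single field follows the nesting shown above.
first_entry = team['roster']['entries'].first
puts first_entry['playerPoolEntry']['player']['lastName']
```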

This may seem a little messy but I’ll point out some relevant data as we walk through this. Most of the actual stats will come from the data in that hash, but we’ll also pull a few pieces from the playerPoolEntry. As mentioned above, our first step will be to loop through each week and make an API call that applies to that week. Let’s make two new variables to specify the weeks we want to look at and the applicable season. For testing purposes, we’ll just look at week 1 for now:

weeks = *(1..1)
season = '2019'
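For reference, the splat on the first line expands a range into an array, which a standalone sketch with a wider range can confirm:

```ruby
# The splat expands the range into an array; (1..4).to_a gives the same result.
weeks = *(1..4)
puts weeks.inspect # => [1, 2, 3, 4]
```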

If you aren’t familiar with the * syntax, it is Ruby’s splat operator, which here expands the specified range into an array. So in this case it will just create an array of [1], but we can easily expand this later once we’re ready to pull the data for all weeks. We will also want to declare an array called output where we will store all of our data as it is parsed. Now we can set up our loop to iterate through each week:

output = []

weeks.each do |week|
  # Rebuild the URL for each week by interpolating the season and week.
  url = "#{season}/segments/0/leagues/1009412?view=mMatchup&view=mMatchupScore&scoringPeriodId=#{week}"
  response = RestClient::Request.execute(
    :url => url,
    :headers => {
      'cookies': {'swid': SWID,
                  'espn_s2': S2}
    },
    :method => :get
  )

  # Parse the JSON response body into a Ruby hash.
  data = JSON.parse(response)

In the above code, we’ll need to redefine the URL for our API call for each week. We can interpolate the season and week variables into the URL string to accomplish this. Then we will perform a GET call and parse out the JSON to turn it into a hash. At this point we should have our data for week 1. This will be followed by our next loop which will parse the players for each team. We will iterate through each object in the teams array from the response body:

data['teams'].each do |team|

Now we should be at a point to start pulling out individual pieces of data. The first item we’ll collect is the team ID, or the very first item in the team hash.

This ID will correspond to a team in your league. To find out which team is which, you will have to look at the URL for each team when you are on the ESPN site. To do this you can simply go to the standings page and click through each team.

Here you can see the team ID is set to 2.

This next step is optional, depending on whether you care about having actual names for each team, but I recommend adding another constant to your DataSource module to map the IDs for each team:
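A sketch of what such a constant might look like, assuming string keys so that a lookup by `team['id'].to_s` works. The IDs and names are placeholders for your own league.

```ruby
module DataSource
  # Maps ESPN team IDs (as strings) to owner names. Placeholder data.
  OWNERS = {
    '1' => 'Owner One',
    '2' => 'Owner Two',
    '3' => 'Owner Three'
  }.freeze
end

puts DataSource::OWNERS['2'] # => Owner Two
```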

So if you have added this, we can write the line:

owner = OWNERS[team['id'].to_s]

(If you did not add an OWNERS constant then simply write team['id'].to_s)

Now we get to add (you guessed it) another nested loop! Is this the best way to write this code? No, it is not. We typically want to minimize our cyclomatic complexity, and the saying goes “flat is better than nested”. So while this isn’t ideal, we can get our code working properly now and refactor later to extract some functionality into methods. We can keep a lookout as we go forward for places where we can reduce our code complexity and improve readability when we get around to refactoring. But I digress.

Our next loop will be through each roster entry. The data we will collect for each player is as follows:

  1. firstName
  2. lastName
  3. playerId – a unique ID given to each player
  4. lineupSlotId – An ID that signifies which position corresponds to the given player
  5. defaultPositionId
  6. actual points scored
  7. points the player was projected to score

Some of this data we can simply take, and some of it we will have to use to parse out more data. Let’s start with the easy ones. The top of our code block will look like this:

team['roster']['entries'].each do |entry|
  fname = entry['playerPoolEntry']['player']['firstName']
  lname = entry['playerPoolEntry']['player']['lastName']
  player_id = entry['playerId']
  slot = entry['lineupSlotId']

This is fairly straightforward as far as data gathering. On the next line we will want to grab the player’s position code. Since this code doesn’t actually tell us anything useful, we’ll have to map out what these codes represent in our DataSource module. The player codes we’ll use are as follows:

'1' => 'QB',
'2' => 'RB',
'3' => 'WR',
'4' => 'TE',
'16' => 'D/ST',
'5' => 'K'

Then we can reference this constant just like we did for our team Owners.

position = POSITION_CODES[entry['playerPoolEntry']['player']['defaultPositionId'].to_s]
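Put together, the constant and the lookup might be sketched as follows. The `position_for` helper and its `fetch` fallback to 'UNKNOWN' are additions for illustration, not something the code in this post does.

```ruby
module DataSource
  # Position codes as listed above, keyed by the ID converted to a string.
  POSITION_CODES = {
    '1' => 'QB',
    '2' => 'RB',
    '3' => 'WR',
    '4' => 'TE',
    '16' => 'D/ST',
    '5' => 'K'
  }.freeze
end

# Hypothetical helper: returns 'UNKNOWN' for any code not in the table.
def position_for(default_position_id)
  DataSource::POSITION_CODES.fetch(default_position_id.to_s, 'UNKNOWN')
end

puts position_for(16) # => D/ST
```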

We also have to get a little creative with the slot codes that we already grabbed. The slot code doesn’t really tell us much other than if a player is in your starting lineup or on your bench. Luckily this is pretty straightforward. Any number that is less than 9, exactly 16, or exactly 17 represents a starter, and anything else is a bench player. This can be evaluated like so:

starter = (slot < 9 || slot == 17 || slot == 16) ? 'true' : 'false'
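If the rule ends up being used in more than one place, it could be pulled into a small predicate method. This is only a sketch; the name `starter?` is hypothetical, and it returns a boolean rather than the 'true'/'false' strings used in the line above.

```ruby
# True for slot IDs below 9 and for slots 16 and 17, per the rule described above.
def starter?(slot)
  slot < 9 || [16, 17].include?(slot)
end

puts starter?(0)  # => true
puts starter?(20) # => false
```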

Great, now we have a bunch of general info about our given player. Now we want to pull their projected and actual stats, but this requires us to iterate over the stats key from our data. These loops are getting a little out of hand, so let’s stop being lazy and create a new module to help us out. Since we’ll mostly be using this module for parsing player data, let’s call it PlayerHelper (player_helper.rb). We can go ahead and require this at the top of our main.rb file the same way we did with our DataSource. Then we’ll add a method into the PlayerHelper called get_stats.

There are a few entries in the stats array that we are looking at, but we only really care about the entry that corresponds to our given week. We also will need our stats array to parse from. So our method declaration will look like this:

def get_stats(stats_array, week)

Now we will need to use a bit of logic to find the correct entry. First we need to find the entry with the corresponding week in the scoringPeriodId field. Then inside that entry we will need to check the statSourceId. If that ID is a 0, then that is the player’s actual stats. If it is a 1, then that entry represents the player’s projected stats. When we have assigned our actual and projected values, we can return a hash with an actual value and a projected value. So our final method code will look like this:

def get_stats(stats_array, week)
  actual = ''
  projected = ''
  stats_array.each do |stat|
    # Only consider entries for the requested week.
    if stat['scoringPeriodId'] == week
      if stat['statSourceId'] == 0
        actual = stat['appliedTotal']
      elsif stat['statSourceId'] == 1
        projected = stat['appliedTotal']
      end
    end
  end
  {actual: actual, projected: projected}
end

And the method call from main.rb will look like this:

stats = get_stats(entry['playerPoolEntry']['player']['stats'], week)
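To check the method in isolation, here is a sketch that runs an equivalent of get_stats against a hand-made stats array. The numbers are invented, and this variant swaps the nested conditionals for `next` and `case`, which is a stylistic choice rather than the version described above.

```ruby
def get_stats(stats_array, week)
  actual = ''
  projected = ''
  stats_array.each do |stat|
    # Skip entries that belong to other weeks.
    next unless stat['scoringPeriodId'] == week

    case stat['statSourceId']
    when 0 then actual = stat['appliedTotal']    # actual points
    when 1 then projected = stat['appliedTotal'] # projected points
    end
  end
  { actual: actual, projected: projected }
end

# Invented sample data: two entries for week 1, one for week 2.
sample_stats = [
  { 'scoringPeriodId' => 1, 'statSourceId' => 0, 'appliedTotal' => 18.4 },
  { 'scoringPeriodId' => 1, 'statSourceId' => 1, 'appliedTotal' => 16.9 },
  { 'scoringPeriodId' => 2, 'statSourceId' => 1, 'appliedTotal' => 15.0 }
]

stats = get_stats(sample_stats, 1)
puts stats.inspect
```

Only the two week 1 entries contribute to the result; the week 2 projection is ignored.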

That should give us a pretty good list of data to start with. Now let’s think ahead for a minute. Where should we store all of our data when we’re done retrieving it? It would be nice to create our own database, but that’s probably overkill for the moment, not to mention a lot of extra work. We could definitely put it all in a spreadsheet, too, but then we’d have to pull in some extra gems and add more logic. So let’s just stick with a good old CSV for now, which is just comma delimited fields that we can always import into a spreadsheet later. To do this, we can add all of our data so far to one big string:

result = "#{owner},#{week},#{season},#{position},#{fname},#{lname},#{starter},#{stats[:actual]},#{stats[:projected]},#{player_id},"

It’s not the prettiest thing in the world, but it will work for now. Finally, we can add this result string to the output array that we created earlier:

output << result
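As an alternative sketch, the same row can be built from an array with `join`, which keeps the field list easier to scan. The sample values are invented, and the trailing comma from the string version is deliberately dropped here.

```ruby
# Invented sample values standing in for one parsed roster entry.
owner = 'Owner Two'
week = 1
season = '2019'
position = 'QB'
fname = 'Example'
lname = 'Player'
starter = 'true'
stats = { actual: 18.4, projected: 16.9 }
player_id = 1234567

fields = [owner, week, season, position, fname, lname, starter,
          stats[:actual], stats[:projected], player_id]
result = fields.join(',')

output = []
output << result
puts output.join("\n")
# => Owner Two,1,2019,QB,Example,Player,true,18.4,16.9,1234567
```

Note that a plain join does no quoting, so a name containing a comma would shift the columns.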

If we let our program iterate all the way through for week 1, then we should have output that looks similar to this:

Not bad for a day’s work!
Let’s review what we’ve accomplished up to this point:

  1. We created a new DataSource module that we can move our global variables into and establish constants that help us map our data.
  2. We’ve created logic that will loop through and collect all of our basic player data.
  3. We created another new module, PlayerHelper, that we can extract logic into going forward to keep main.rb clean.
  4. We’ve identified a few places where we can go back and refactor to clean up our existing code.

One more takeaway is that the API does not return our data in a straightforward shape. We have to go pretty deep into our data objects to find what we need, which is typical of web services that return lots of data. It is another reminder that we need to keep our code well organized, or none of this will make much sense to our future selves or to anyone else reading it.

I hope that you’ve found this post helpful and are able to follow along. For part three, we will look at pulling some additional player data and outputting our results into spreadsheets.
