Fantasy Football - Footballguys Forums


scraping fantasy football sites...

Follows Closely

First Question:

I am looking for a way to get current rosters from all the major fantasy football sites out there: Yahoo, ESPN, MFL, CBS… MyFantasyLeague has an XML feed, but the others do not provide an API for gathering this information, so you have to scrape. Is there anything out there that I could use to get this information?

Second Question:

After a quick search, I have found nothing to help with the first question. What I am thinking about developing is a service where you pass in your league URL and it returns an XML document containing current rosters, etc. For some sites, such as Yahoo private leagues, you would also have to supply a username and password. Does this service interest anyone?
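To make the idea concrete, here is a rough sketch of what the returned XML document might look like. The element and attribute names (`league`, `team`, `player`, `position`), the function name, and the example URL are all made up for illustration; nothing here is an existing format or service.

```python
import xml.etree.ElementTree as ET


def build_roster_xml(league_url, rosters):
    """Build an XML string from rosters.

    rosters maps a team name to a list of (player_name, position) tuples.
    The element names are hypothetical, chosen only for this sketch.
    """
    root = ET.Element("league", {"url": league_url})
    for team_name, players in rosters.items():
        team = ET.SubElement(root, "team", {"name": team_name})
        for player_name, position in players:
            player = ET.SubElement(team, "player", {"position": position})
            player.text = player_name
    return ET.tostring(root, encoding="unicode")


# Placeholder data standing in for whatever a site-specific scraper returns.
sample = {"Team A": [("Player One", "QB"), ("Player Two", "RB")]}
xml_doc = build_roster_xml("http://example.com/league/123", sample)
print(xml_doc)
```

Each site would need its own scraper feeding this function, so the XML layer stays the same no matter where the rosters came from.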

 
I've spent years learning scraping and spidering and trying to develop stuff that works. Although it appears simple, it's very tricky, and I wouldn't recommend going down that road.

Most of what you want to glean can be had fairly easily by converting a cell phone WAP page into XML. Most sites offer one.
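As a rough illustration of the WAP route: WML pages are XML, so an ordinary XML parser can often read them directly. The page below is invented for the example (real sites will lay out their cards differently), and the "POS: Name" line format and the output element names are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical WML roster page; a real page would be fetched from the site.
WML_PAGE = """<wml>
  <card id="roster" title="My Team">
    <p>QB: Player One</p>
    <p>RB: Player Two</p>
    <p>WR: Player Three</p>
  </card>
</wml>"""


def wml_to_roster_xml(wml_text):
    """Convert a WML card of 'POS: Name' paragraphs into a small roster XML string."""
    card = ET.fromstring(wml_text).find("card")
    roster = ET.Element("roster", {"team": card.get("title", "")})
    for paragraph in card.findall("p"):
        text = (paragraph.text or "").strip()
        position, sep, name = text.partition(":")
        if not sep:
            continue  # skip lines that do not look like 'POS: Name'
        player = ET.SubElement(roster, "player", {"position": position.strip()})
        player.text = name.strip()
    return ET.tostring(roster, encoding="unicode")


print(wml_to_roster_xml(WML_PAGE))
```

This only works when the page really is well-formed XML; a sloppy page would still need a more forgiving parser.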

 
I agree that it can be tricky, especially the spider components. However, getting the roster information is very simple. In most cases one regular expression will do the trick. Actually, the biggest hurdle I foresee is all the sites that require authentication. Yahoo, for example, has a complicated authentication process, with multiple redirects and multiple cookie checks. I usually parse/scrape the "print" view, which is usually smaller and simpler than the standard bloated HTML. Parsing the WAP page is a good idea that I did not think of; WAP is usually more XML compliant. Thanks.
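Here is a minimal sketch of both points, the single regular expression and the cookie handling. The "print" view markup, the `pos` and `name` class names, and the helper names are invented for illustration; every site's HTML differs, and a real login flow such as Yahoo's would need site-specific requests on top of this.

```python
import re
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical "print" view markup; real pages will not match this exactly.
PRINT_VIEW = """
<table>
<tr><td class="pos">QB</td><td class="name">Player One</td></tr>
<tr><td class="pos">RB</td><td class="name">Player Two</td></tr>
</table>
"""

# One pattern that captures the position and player name from each row.
ROW_RE = re.compile(
    r'<td class="pos">(?P<pos>[^<]+)</td>\s*<td class="name">(?P<name>[^<]+)</td>'
)


def parse_roster(html):
    """Return a list of (position, player_name) tuples found in the HTML."""
    return [
        (m.group("pos").strip(), m.group("name").strip())
        for m in ROW_RE.finditer(html)
    ]


def make_cookie_opener():
    """Build a URL opener that keeps cookies across requests and redirects."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


print(parse_roster(PRINT_VIEW))
```

The opener from `make_cookie_opener()` would be used for both the login request and the later roster request, so cookies set during the redirects carry over. The regex is brittle by design: it breaks as soon as the site changes its markup, which is part of why scraping is tricky to maintain.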
 
