Fantasy Football - Footballguys Forums

Data Guys, Aggregate!

Dinsy Ejotuz

Footballguy
We've had several threads on specific stats Qs, Excel functions, etc., but I thought it might be useful to have a catch-all thread where all that is aggregated. I know there are a lot of us whose work touches on at least some of the same things...

R, SQL, Excel, Tableau, Google Analytics, Stats, Optimization, Simulation, Web Scraping, Visualization etc.

If it dies on the vine... nothing ventured and all that.

Thought I'd start this thread after watching a nice tutorial on the dplyr package in R. It's been out for a couple of years now, but if you haven't used it before, it's a fair bit tidier and a lot more intuitive to learn than some of the older base-R data manipulation functions. Highly recommend.

 
I really need to take some time to learn how to use R. Pretty weak on stats too. Great at getting and managing very large datasets.

 
We have a lot of Tableau usage at my company - it's a great tool for visualization and self-service against virtually any data set.

 
I really need to take some time to learn how to use R. Pretty weak on stats too. Great at getting and managing very large datasets.
We should partner. I'm your mirror image -- decent in stats/R and a great intuition for analysis, but playing catch up on the database and data manipulation side.

 
We have a lot of Tableau usage at my company - it's a great tool for visualization and self-service against virtually any data set.
I haven't used it a lot, but it seems to be great for descriptive visualizations. Not really a full analysis package, but really intuitive and a lot cheaper than something like SAS's visualization product.

 
We have a lot of Tableau usage at my company - it's a great tool for visualization and self-service against virtually any data set.
My company has a huge push for Tableau too, but it is so very rare to find an implementation that isn't very slow. That's one reason we chose SSRS to build our new reporting site.

 
Used Tableau at my old company. Just started a new job and they have MicroStrategy. I'm not familiar with it. Pros and cons?

 
I really need to take some time to learn how to use R. Pretty weak on stats too. Great at getting and managing very large datasets.
We should partner. I'm your mirror image -- decent in stats/R and a great intuition for analysis, but playing catch up on the database and data manipulation side.
Yeah, are you going to have time to do your rookie analysis this offseason?

 
We have a lot of Tableau usage at my company - it's a great tool for visualization and self-service against virtually any data set.
My company has a huge push for Tableau too, but it is so very rare to find an implementation that isn't very slow. That's one reason we chose SSRS to build our new reporting site.
Are you using live connections? We've found Tableau extracts to work well, but it depends on the dataset. Performance in most of these tools is typically a function of either your EDW, your design, or the size of your data. I don't see SSRS and Tableau as apples to apples.

 
We have a lot of Tableau usage at my company - it's a great tool for visualization and self-service against virtually any data set.
I haven't used it a lot, but it seems to be great for descriptive visualizations. Not really a full analysis package, but really intuitive and a lot cheaper than something like SAS's visualization product.
Definitely not a full analytics package - I'd compare SSRS and SAS to each other but neither can touch Tableau on visuals or ease of use, IMO.

 
Used Tableau at my old company. Just started a new job and they have MicroStrategy. I'm not familiar with it. Pros and cons?
Former MicroStrategy developer here - it's a rock-solid enterprise reporting tool. Very stable, but the complaint has been that you need IT to make the most out of it. Tableau was making big inroads with their tool, so MicroStrategy came out with their comparable tool - MicroStrategy Desktop. These tools are a better comparison, and there are a lot of companies that roll these two tools out as complementary - we do that and it works well. Tableau can't do enterprise reporting in the conventional sense.

 
We have a lot of Tableau usage at my company - it's a great tool for visualization and self-service against virtually any data set.
My company has a huge push for Tableau too, but it is so very rare to find an implementation that isn't very slow. That's one reason we chose SSRS to build our new reporting site.
Are you using live connections? We've found Tableau extracts to work well, but it depends on the dataset. Performance in most of these tools is typically a function of either your EDW, your design, or the size of your data. I don't see SSRS and Tableau as apples to apples.
I'm speaking strictly as a user with regard to Tableau, since many different groups within the company have server sites set up for reporting, and we also use external vendors' Tableau sites. But, generally, these are all very large datasets I am talking about.

In the SSRS vs Tableau example, the main goal was to replace 5,000 or so canned reports that were pretty barebones but instantaneous. The project was about cost savings rather than visualization, and speed was a major concern coming from cached PDFs. Would like to use Tableau as a complement eventually, but that would mean the company would have to quit laying off the best developers.

 
I really need to take some time to learn how to use R. Pretty weak on stats too. Great at getting and managing very large datasets.
We should partner. I'm your mirror image -- decent in stats/R and a great intuition for analysis, but playing catch up on the database and data manipulation side.
Yeah, are you going to have time to do your rookie analysis this offseason?
Heh. I finally started trying to turn my practicum into something digestible for DLF just this week. Hopefully I'll have that done by the time the combine is over and will be able to write some again this year.

 
Edited to add: I generally do a mix of analytics and programming, been moving away from the former and towards the latter over the past year but there's a lot of overlap. Pretty strong in Excel and SQL, and literally just the other day started messing with the pandas library in Python.

 
Used Tableau at my old company. Just started a new job and they have MicroStrategy. I'm not familiar with it. Pros and cons?
Former MicroStrategy developer here - it's a rock-solid enterprise reporting tool. Very stable, but the complaint has been that you need IT to make the most out of it. Tableau was making big inroads with their tool, so MicroStrategy came out with their comparable tool - MicroStrategy Desktop. These tools are a better comparison, and there are a lot of companies that roll these two tools out as complementary - we do that and it works well. Tableau can't do enterprise reporting in the conventional sense.
Good to know. Tableau is very limited still on the data manipulation side. I find doing calculations at the source is better and dumping the data into Tableau gives it less to "think" about which increases the performance. And trying to change the data in an extract is a pain in the ###. You basically have to fix it at the source and recreate the extract.

But the visualizations are nice and so far solid in that small changes don't break the whole chart. I'll have to see what version of MicroStrategy they have. Seems like they haven't used it in a bit. I might be able to convince them to either upgrade it or get Tableau.

 
I'll keep tuning in here because I'm basically self-taught on all my skills. Learned from a few guys at my previous job and went from there on how to build data warehouses, how to write macros, etc. I was able to use that knowledge to get a much better paying job, but now I'm on my own. They expect me to solve their data issues, which I think I can. It will be interesting.

 
Good to know. Tableau is very limited still on the data manipulation side. I find doing calculations at the source is better and dumping the data into Tableau gives it less to "think" about which increases the performance. And trying to change the data in an extract is a pain in the ###. You basically have to fix it at the source and recreate the extract.
Agree. Couple of thoughts about this:

- As a best practice, always push calculations to the lowest level possible

- If you are building for others and not doing analysis - build against smaller datasets that are live and not extracts - only move to extract at the end when you have to for performance

- Tableau is adding more data wrangling in version 10 which should help
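
To illustrate pushing calculations down to the source, here is a minimal sketch in Python using the standard-library sqlite3 module. The `sales` table, its columns, and the sample rows are hypothetical; the only point is that a GROUP BY at the source hands the visualization tool a handful of summary rows instead of every detail row, which gives it less to "think" about.

```python
import sqlite3

# Hypothetical in-memory source table standing in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("East", 50.0), ("West", 75.0), ("West", 25.0), ("West", 10.0)],
)

# Option A: pull every detail row and aggregate on the client side.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
client_totals = {}
for region, amount in rows:
    client_totals[region] = client_totals.get(region, 0.0) + amount

# Option B: push the calculation to the source and fetch only the summary.
source_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)

print(len(rows), "detail rows vs", len(source_totals), "summary rows")
print(client_totals == source_totals)
```

Both options give the same totals; the difference is how many rows cross the wire and how much work is left for the front-end tool.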

We've started looking at tools like Datameer or Alteryx for data manipulation against Hadoop. I know very little about them, but they meet a need that most of the major BI players can't meet today.

 
The one thing I would stress to anyone who is either new to this tech or looking to get into it: while it's great to know specific tools, and that will obviously be a requirement for most jobs, the best path to long-term success is to understand the concepts. That allows you to pick up different tools as they come and go - and they do, often.

We talk with Gartner regularly, and they've said a few times that now is not a great time to make a huge investment, especially not in an unproven vendor. The speed at which things are changing and the number of vendors in the space and entering the space can be overwhelming.

 
I'm just getting into this after spending nearly a decade in a technology hamster wheel. I've found edX and Microsoft Virtual Academy to be both awesome and free sources of introductory classes in a wide variety of subject matter.

 
I'm just getting into this after spending nearly a decade in a technology hamster wheel. I've found edX and Microsoft Virtual Academy to be both awesome and free sources of introductory classes in a wide variety of subject matter.
Agreed. A good class on edX is Analytics Edge, though it takes a fair amount of time. Gives good practice in R.

 
I'm just getting into this after spending nearly a decade in a technology hamster wheel. I've found edX and Microsoft Virtual Academy to be both awesome and free sources of introductory classes in a wide variety of subject matter.
Agreed. A good class on edX is Analytics Edge, though it takes a fair amount of time. Gives good practice in R.
Will have to look into that. I've gotten by as the "Excel Guru" in our group with some limited self-taught VBA/SQL stuff - really need to expand my horizons a bit so I can be somewhat familiar with the other programs vs. being able to say I have heard about them.

 
Here's a mathy question...

If you have a normal curve with mean zero and standard deviation of one, what's the weighted mean absolute error (distance from the mean) of all possible points on the curve?

ETA... would it be the points on the curve where the area under the curve was 50%? i.e. the z-score for 25%? Something like .675?

Integrate the normal density from 0->x so that the area is .25?
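
A quick numerical check may help here. The sketch below (Python, standard library only) assumes the question is asking for the expected absolute distance from the mean, E|Z|, for a standard normal Z. Under that assumption the closed form is sqrt(2/pi), roughly 0.798. The .675 figure is a different quantity: the median absolute deviation, which is the 75th percentile of Z, where half of the probability mass lies within that distance of the mean. The function name `numeric_mean_abs_dev` and its step settings are illustrative choices.

```python
import math
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Mean absolute deviation: E|Z| = sqrt(2/pi) for a standard normal.
mean_abs_dev = math.sqrt(2.0 / math.pi)

# Median absolute deviation: the x with P(|Z| <= x) = 0.5,
# which by symmetry is the 75th percentile of Z.
median_abs_dev = std_normal.inv_cdf(0.75)


def numeric_mean_abs_dev(steps=20_000, limit=10.0):
    """Midpoint-rule integral of |x| * pdf(x) over [-limit, limit]."""
    width = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * width
        total += abs(x) * std_normal.pdf(x) * width
    return total


print(f"mean absolute deviation   ~ {mean_abs_dev:.4f}")
print(f"numerical check           ~ {numeric_mean_abs_dev():.4f}")
print(f"median absolute deviation ~ {median_abs_dev:.4f}")
```

So the mean and the median of the absolute distance differ: the weighted mean sits near 0.80, while the 50%-area point sits near 0.67.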

 
Only somewhat related, but a little surprised that Tableau's stock has been cut in half with all the love they get in the enterprise space.

 
Only somewhat related, but a little surprised that Tableau's stock has been cut in half with all the love they get in the enterprise space.
I've heard IBM is going to try and kill them. I haven't followed it, but talked with some people on their sales team a few months ago.

 
So what is everyone doing with analytics? I have a master's from Northwestern in predictive analytics, am a CPA, and have a BS in finance and risk management... How are you all using it? I feel like the majority in the field are CIS guys who may or may not get the big picture stuff. Just want a feel for the world here.

 
I've been in I/T for 20 years (mostly infrastructure, now focused on HPC) and Hadoop is all the rage in our environment right now. Everyone seems to want to use it, but I have yet to see a killer application in our specific work area (R&D).

 
So we have MicroStrategy 10 which is all HTML based as opposed to a developer program. Seems easy enough but I'm still learning it. Going to be a bit to grasp this much like when I first did Tableau.

 
Interesting stat I heard at an Amazon AWS event.  MLB's next gen analytics produces about 16 Terabytes of data per game.

Such a long way from sitting in the stands keeping track of the game yourself with your dad on paper.

 
Greg Russell said:
Interesting stat I heard at an Amazon AWS event.  MLB's next gen analytics produces about 16 Terabytes of data per game.

Such a long way from sitting in the stands keeping track of the game yourself with your dad on paper.
That just sounds ridiculous. It's probably the graphics that take up the majority of the data usage.

 
That just sounds ridiculous. It's probably the graphics that take up the majority of the data usage.
They didn't mention the breakdown, but I think the same group handles the online broadcasts, so quite possible. They did say things like ball location were being recorded by the millisecond. I imagine they probably have the same kind of info on the players as the NFL does.

 
They didn't mention the breakdown, but I think the same group handles the online broadcasts, so quite possible. They did say things like ball location were being recorded by the millisecond. I imagine they probably have the same kind of info on the players as the NFL does.
Yeah, I'm not sure that it is graphics, but they are recording geospatial data of everything all the time.  Where every player is all the time, just like the ball.  

 
I've heard IBM is going to try and kill them. I haven't followed it, but talked with some people on their sales team a few months ago.
I talk with the Tableau sales team weekly and I keep asking when someone is going to buy them. They maintain they won't sell (for now - my take). They are still ahead (IMO) in true self-service analytics. What they have to decide is how they keep their edge - will they go full enterprise to truly compete with SAP, IBM and MicroStrategy, or will they remain true to the visualization mantra they've always sold themselves on?

I think they have some time still, as they are still ahead of Qlik, Spotfire, DOMO and others, but they will need something in the next few years to stay relevant long term. Look for them to come out with something like MicroStrategy PRIME or AWS SPICE in the next year or so.

 
GIS DBA guy here.  I just processed 34,000,00 GPS points and have worked with a ton of crowdsourced data.  It's astounding how much data is collected every day IMO.

 
Anyone have much experience with Alteryx as a self-service transformation tool?  Would be interested in any tips or gotchas anyone might have. Saw you mentioned them, AAABatteries, not sure how deep you've gotten with it yet.

 
I used to spend my days buried in SQL doing analytics but moved to the business side a few years ago to take point on our MicroStrategy products. My new boss loves what Hadoop brings to the table, so I suspect I'll get deeper on it shortly as well.

I'm happy to help answer questions about MicroStrategy or SQL where I can.

 
Anyone have much experience with Alteryx as a self-service transformation tool?  Would be interested in any tips or gotchas anyone might have. Saw you mentioned them, AAABatteries, not sure how deep you've gotten with it yet.
I don't - we had them in to do a demo but it didn't go too much further for now. Will update if I get into it.

 
Curious for those using Hadoop if you are doing any analytics/reporting directly off of it or if you stage the data. We are going live this year with our first project where Hadoop is an integral part of the solution, and I made the call to stage the data in Redshift. I don't have confidence you can get the performance you need when using a tool like Tableau. And I don't want to extract all that data into something proprietary where I'm limited in how I can access it.

 
Curious for those using Hadoop if you are doing any analytics/reporting directly off of it or if you stage the data. We are going live this year with our first project where Hadoop is an integral part of the solution, and I made the call to stage the data in Redshift. I don't have confidence you can get the performance you need when using a tool like Tableau. And I don't want to extract all that data into something proprietary where I'm limited in how I can access it.
We have a few in house, but I'm frankly a tad less familiar with them than others. I'd answer this with two questions:

1. How many attributes are you providing the end user for the analytics/reporting aspect? Hadoop will not solve performance issues centered on ad-hoc needs to any great degree, because there is too much variability to consider in design.

2. Who's your end user? If the user is an Analyst who's used to waiting a bit for an answer what is the intent of leveraging Hadoop? If this is someone who wants push-button I presume it would be solely for the analytics application?

 
We have some going directly out of Hadoop, though there is a range of use cases. For reports, KPIs, etc., that don't need to be as responsive, it isn't an issue. Spotfire, which is our main in-house analytic tool, can apparently also help with some caching for some of the cases.

We have another situation, though, where performance is more of an issue, wanting second or sub-second responsiveness, and staging the data somewhere was looking like a likely possibility. However, we have some promising results from initial tests using Hawq instead of Hive with it. Still some checking to do on it, though.

 
So our company ended up going with Qlik over MicroStrategy and Tableau. I am actually excited about the possibilities from what I've been able to grasp so far. Any Qlik users out here?

 
