> Cheap mapping software? Just need a program that can overlay zip/postal codes on a map.

Sales Territory Map is less than $100 and will do a fine job at simple maps.
> Sales Territory Map is less than $100 and will do a fine job at simple maps.

Got a referral link?
> We have a few in house, but I'm frankly a tad less familiar with them than others. I'd answer this with two questions:
> 1. How many attributes are you providing the end user for the analytics/reporting aspect? Hadoop will not solve performance issues centric to ad-hoc needs to any great degree because there is too much variability to consider in design.
> 2. Who's your end user? If the user is an analyst who's used to waiting a bit for an answer, what is the intent of leveraging Hadoop? If this is someone who wants push-button, I presume it would be solely for the analytics application?

1. Typically a lot - our usual approach in the past was large models where the users could select whatever dimensions they wanted in a tool like MicroStrategy. We have the opportunity to scale it back, but not always.
> We have some going directly out of Hadoop, though there is a range of use cases. For reports, KPIs, etc. that don't need to be as responsive, it isn't an issue. Spotfire, which is our main in-house analytic tool, can apparently also help with some caching for some of the cases.
> Have another situation though where performance is more of an issue - we want second or sub-second responsiveness - and staging the data somewhere was looking like a likely possibility. However, we have some promising results from initial tests using Hawq instead of Hive with it. Still some checking to do on it though.

We have some folks evaluating Hawq - based on some of my reading, my biggest concerns are performance and the limitations we may encounter.
> So our company ended up going with Qlik over MicroStrategy and Tableau. I am actually excited about the possibilities from what I've been able to grasp so far. Any Qlik users out here?

We evaluated it against Tableau - we found Tableau a little easier to use and a little further down the road for enterprise at the time (3 years ago). I haven't kept up with Qlik's newer versions, but Gartner still gives Tableau the lead and our users have been happy, so we haven't really questioned the decision. Qlik was a good tool though, so you should be happy.
> Cheap mapping software? Just need a program that can overlay zip/postal codes on a map.

QGIS is free.
> I'm starting to move from mostly SQL and PL/SQL into Hadoop and analytics.

Need a new job?
> So our company ended up going with Qlik over MicroStrategy and Tableau. I am actually excited about the possibilities from what I've been able to grasp so far. Any Qlik users out here?

We use them all... along with SAS, Pentaho, OBIEE, and now we're implementing a Hadoop data lake.
> Have any of you guys looked at DOMO? We did a pretty exhaustive eval on them - they have an interesting product. I loved some features (built-in collaboration) and hated others (ingesting all your data in their solution). They frame themselves as business management software and not just an analytics tool. Anybody test it out and/or using it?

Currently looking to replace Qlik with DOMO... not sure where they are in the process.
> We have some folks evaluating Hawq - based on some of my reading, my biggest concerns are performance and the limitations we may encounter.
> Curious why you guys went with Spotfire - cost?

I'm not sure of the reasoning for it; I wasn't involved in this arena at the time. Will see if I can find out, though I'm not even sure where the decision was made.
> Need a new job?

Heh, feels like I got a completely new one in a lot of ways. So much new stuff to learn, but this is such a great area to get into, I think.
> Cheap mapping software? Just need a program that can overlay zip/postal codes on a map.

Tableau can do this IIRC. Fairly sure there's a package in R that can do the same, but that's a big learning curve unless you already know how to use R.
> Tableau can do this IIRC. Fairly sure there's a package in R that can do the same, but that's a big learning curve unless you already know how to use R.

Tableau does it out of the box. Other software I've seen asks for Lat and Long.
> any tips for a finance guy looking to transition to an analytics role? I majored in math and econ, have a master's in finance, and have worked ~4 years in finance. Very good in Excel and solid in VBA, some light usage of SQL; I've taken a couple of SAS workshops, but there just aren't opportunities in my current role to get practice with it.
> Online courses? Any particular places to look for analytics roles in my area?

bump for the morning crowd.
> bump for the morning crowd.

I brought on a guy from JPEG Morgan who took a position for a year and then went to Macy's, but the initial move had to be a pay cut. Retail analysis.
Do we have any ~serious R users in this joint?
I've been working in R most days for the last four months now. To the point where I'm moving out of the "Hey! I know how to get answers!" phase and working to really understand data structures and how to use what's being called the Tidyverse.
More or less readr, tidyr, dplyr, purrr, magrittr, ggplot, shiny, rvest and probably a couple other packages I'm forgetting.
I'd like to be able to stop dropping my R output in Excel and keep it all in R all the way through to the final product -- but that's a stretch right now.
Anyone else use/want to use these?
> any tips for a finance guy looking to transition to an analytics role? ...

Still in the stages of finalizing, but it's looking like I made this happen.
> Disappointed the title isn't "Data Guys, Aggregate!"

How about you aggregate that $2300 and settle up your debts?
> Still in the stages of finalizing, but it's looking like I made this happen.

What training/classes did you do?
GregR said:
> I've got a Hive view, 6 columns, something like:
> ID ... DATE ... Value A ... Value B ... Value C ... Value D
> Where the combo of ID and Date is unique.
> For each ID I would like to retrieve, for each of A/B/C/D, the most recent Date on which each had a value (i.e. was not NULL), and what that value was. So I'd end up with 9 columns... ID, Date A, Value A, Date B, Value B, etc.
> I could do each of A/B/C/D in separate subqueries and join them back together. But is there a more performant way to do it, one that can return what I want in a single pass through the data instead of accessing the original view 4 times?

Is the need recurring or just once? If recurring, why not set up another view?
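For reference, the "separate subqueries joined back together" baseline GregR describes might look something like this in HiveQL. This is only a sketch: the view name `v` and the column names `id`, `dt`, `a`, `b`, `c`, `d` are placeholders, not names from the thread.

```sql
-- Baseline: one aggregation per column, joined back together.
-- Each column costs extra scans of v (one for the max date, one to fetch the
-- value at that date) -- the repeated reads GregR wants to avoid.
SELECT ids.id,
       la.dt AS date_a, la.a AS value_a,
       lb.dt AS date_b, lb.b AS value_b   -- ...same pattern for C and D
FROM (SELECT DISTINCT id FROM v) ids
LEFT JOIN (
    SELECT v.id, v.dt, v.a
    FROM v
    JOIN (SELECT id, MAX(dt) AS max_dt
          FROM v WHERE a IS NOT NULL GROUP BY id) ma
      ON v.id = ma.id AND v.dt = ma.max_dt
) la ON ids.id = la.id
LEFT JOIN (
    SELECT v.id, v.dt, v.b
    FROM v
    JOIN (SELECT id, MAX(dt) AS max_dt
          FROM v WHERE b IS NOT NULL GROUP BY id) mb
      ON v.id = mb.id AND v.dt = mb.max_dt
) lb ON ids.id = lb.id;
-- ...plus two more LEFT JOINs like these for C and D
```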
cap'n grunge said:
> Finance guy that does the TPS reports. Sounds like we are going to have the opportunity to use Tableau in our area eventually. Most of our reports are from the GL and created in Excel using an OLAP cube and SSAS. Would also like to hone my tech skills. Any recommendations appreciated.

MicroStrategy now offers their Tableau-like local-install client, called Desktop, for free. It was a direct shot at Tableau, but it also lets anyone play with a locally installed BI tool that'll give you an idea of what Tableau will be like.
> Is the need recurring or just once? If recurring, why not set up another view?

It is going to end up being a view, yes. I'm trying to write the query for the view.
cap'n grunge said:
> Finance guy that does the TPS reports. Sounds like we are going to have the opportunity to use Tableau in our area eventually. ...

Tableau has nice visualization, but it doesn't sound like you'll be designing its reports. Much won't change for you.
> It is going to end up being a view, yes. I'm trying to write the query for the view.

Could you unpivot to ID, Value Type, and Date; rank by descending Date grouped by ID and Value Type; then pivot back out the top-ranked rows?
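That suggestion could be sketched in HiveQL roughly as below - again assuming a hypothetical view `v` with columns `id`, `dt`, `a`, `b`, `c`, `d`. Hive's `stack()` UDTF does the unpivot, `ROW_NUMBER()` the ranking, and a conditional aggregation the re-pivot.

```sql
-- Unpivot A-D into (metric, val) rows, drop NULLs, keep the most recent row
-- per (id, metric), then pivot the survivors back out to one row per id.
-- Note: stack() needs a..d to share a comparable type; CAST them if they don't.
SELECT id,
       MAX(CASE WHEN metric = 'A' THEN dt  END) AS date_a,
       MAX(CASE WHEN metric = 'A' THEN val END) AS value_a,
       MAX(CASE WHEN metric = 'B' THEN dt  END) AS date_b,
       MAX(CASE WHEN metric = 'B' THEN val END) AS value_b,
       MAX(CASE WHEN metric = 'C' THEN dt  END) AS date_c,
       MAX(CASE WHEN metric = 'C' THEN val END) AS value_c,
       MAX(CASE WHEN metric = 'D' THEN dt  END) AS date_d,
       MAX(CASE WHEN metric = 'D' THEN val END) AS value_d
FROM (
    SELECT id, dt, metric, val,
           ROW_NUMBER() OVER (PARTITION BY id, metric ORDER BY dt DESC) AS rn
    FROM (
        SELECT id, dt, t.metric, t.val
        FROM v
        LATERAL VIEW stack(4, 'A', a, 'B', b, 'C', c, 'D', d) t AS metric, val
    ) unpivoted
    WHERE val IS NOT NULL
) ranked
WHERE rn = 1
GROUP BY id;
```

The appeal is that the unpivot itself is a single scan of the view; the window function and the final GROUP BY then operate on the already-unpivoted rows.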
Getting the query to work isn't the problem. I'm trying to see if there is a more performant way to do it than I would have in Oracle with a bunch of GROUP BYs to get the max dates. Actually, I know that RANK() is more performant than GROUP BY in Hive for this. What I'm asking is whether there's a way to improve performance so that instead of reading my big data set 4 times - once to get the max date for each of A/B/C/D - it could process all four in one pass through the data.
Or maybe it will already do that behind the scenes, I don't know. I haven't looked at an execution plan yet; I was just kind of mapping out the query in my head so far.
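On the single-pass question: one idiom sometimes used in Hive (not mentioned in the thread, so treat it as an assumption and verify it against a real execution plan) is conditional aggregation over structs. Hive compares structs field by field, so `MAX` over a `struct(dt, value)` returns the pair with the latest date, and a single GROUP BY scan can handle all four columns at once. Same placeholder names `v`, `id`, `dt`, `a`..`d` as above.

```sql
-- One scan of v. For each column, aggregate only the rows where the value is
-- non-NULL (CASE with no ELSE yields NULL, which MAX ignores); MAX over
-- struct(dt, x) picks the struct with the greatest dt.
-- struct() names its fields col1, col2 by default.
SELECT id,
       sa.col1 AS date_a, sa.col2 AS value_a,
       sb.col1 AS date_b, sb.col2 AS value_b,
       sc.col1 AS date_c, sc.col2 AS value_c,
       sd.col1 AS date_d, sd.col2 AS value_d
FROM (
    SELECT id,
           MAX(CASE WHEN a IS NOT NULL THEN struct(dt, a) END) AS sa,
           MAX(CASE WHEN b IS NOT NULL THEN struct(dt, b) END) AS sb,
           MAX(CASE WHEN c IS NOT NULL THEN struct(dt, c) END) AS sc,
           MAX(CASE WHEN d IS NOT NULL THEN struct(dt, d) END) AS sd
    FROM v
    GROUP BY id
) latest;
```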
> What training/classes did you do?

I took the intro and intermediate courses for R on edx.org. I also took a Python course that MIT offered on the same website. My knowledge of these programs is still on the beginner side.
> I took the intro and intermediate courses for R on edx.org. I also took a Python course that MIT offered on the same website. My knowledge of these programs is still on the beginner side.
> I would say the biggest factors in securing this position were a) interviewing skills, b) a friend who works closely with the hiring manager and put in a strong recommendation for me, and c) selling the hiring manager on my general intelligence, ability to learn quickly, and passion for D&A.

That's usually 95% of the hire. Then you just hope whatever they're asking is within your realm of learning quickly. At least that's been my experience. It usually is, as the hiring people like to make the job description overly complicated.
> Getting the query to work isn't the problem. I'm trying to see if there is a more performant way to do it than I would have in Oracle with a bunch of GROUP BYs to get the max dates. ...

Have you tried using a materialized view, or populating a GTT to pre-collect your data and then querying that object at runtime for improved performance?
> Could you unpivot to ID, Value Type, and Date; rank by descending Date grouped by ID and Value Type; then pivot back out the top-ranked rows?

That might be worth a try, will take a look at it. Thanks!
Current available jobs in Football Operations:

» Football Analytics Coordinator - The Houston Texans (Houston, TX)
Football Operations: Statistics
Reports to: Director of Football Information Systems
Education/Experience:
- Bachelor's or master's degree in Data Science, Analytics or other analytical field preferred.
- Two (2) years of analytical and technical experience.

Skills Required:
- High-level proficiency in statistical programming languages R and/or Python.
- Demonstrated strong mathematical and computational acumen.
- Experience in writing SQL queries and reports utilizing SQL Server 2008 or later strongly preferred.
- Working knowledge in data discovery and new data acquisition.
- Working knowledge of scripting languages and of working with large data sets.
- Demonstrated working knowledge of statistics and commonly used statistical and analytical tools, econometrics, data visualization and football analytics.
- Interest in football and familiarity with football terminology.
- Proficiency in use of all Microsoft Office software applications with high-level proficiency in Excel.
- Strong organizational and time management skills with ability to prioritize and manage multiple tasks in a high-energy environment.
- Effective verbal, presentation and written communication skills.
- Strong interpersonal skills and the ability to create and maintain solid working relationships at all levels across the organization.
- Possess excellent attention to detail and an ability to produce high-quality, accurate work within designated deadlines.
- Ability to maintain confidential and/or proprietary information.
- Ability and internal drive to demonstrate a winning attitude and a strong work ethic in the performance of all job responsibilities.

Basic Function:
- Responsible for providing direct analytical support to Football Operations personnel.

Job Function (duties and responsibilities):
- Provide Football Operations users with easily digestible summaries of internal data warehouse information.
- Develop new methods/tools to answer business intelligence questions for sports science, football operations and coaching departments.
- Conduct ad-hoc research and create analytical reports for sports science, football operations and coaches.
- Import, analyze, verify, and draw useful conclusions from non-documented data.
- Perform data ETL (extract, transform and load) and quality control work.
- Create integrations with sports science third-party application program interfaces (APIs).
- Research, recommend and implement business intelligence solutions with the goal of making data more user friendly for a variety of internal clients.
- Formulate creative and insightful internal metrics to gauge a variety of football data points.
- Perform various other tasks that may be assigned from time to time by the Director of Football Information Systems and the General Manager and Executive Vice President, Football Operations.
- Position requires routine face-to-face personal interaction with other Club personnel; therefore, job responsibilities must be physically performed in the Club offices and not in a telecommuting manner.

Travel Requirements:
- Domestic U.S. travel associated with team road games and Training Camp practices as may be requested or required.
Note: When you apply for this job online, you will be required to answer the following questions:
1. Do you have a Bachelor's and/or Master's degree in data science, analytics, applied mathematics, or other analytical field?
2. Are you proficient with Microsoft SQL 2008 or later?
3. Are you proficient with statistical programming languages R and/or Python?
> Anyone looking for an analytics job in the NFL?

Wish I was that confident in my skills. Would love to work for an NFL team.
> not sure this is the right place for this, but what the hell.
> i am looking for a program (or code suggestion) that can re-calculate financial histories. there could be hundreds of transactions with charges, penalties, interest and payments. the penalties and interest were assessed at specific rates, and the amounts assessed depended on the running balance of the various components at the time. problem is that the rates need to be changed retroactively, which is really annoying to deal with because all the payments have to be re-allocated in accordance with the adjusted rate. so i am trying to figure out a more automated way than individually recalculating all of the transactions.
> any thoughts? does quickbooks or something similar have a function where you can do that (especially if you can import)?

Not 100% sure what you're looking for, but this sounds like something that would need to be coded.
> So I've been working with what amounts to our data lake infrastructure and ingestion group. It's mainly a keep-it-running and create-data-pipelines group. I'm more of a data guy, though, and as one of the only people on the team with much domain knowledge, I'd kind of carved out a niche working projects where I was the technical interface with the business users, doing transformations and data integration, though that's not our focus. I am going to try to make a change back to a more data-centric role, as I still often feel like a data fish in a pond of CS majors. We are really growing our data analytics positions, so I'm tooling up for that. Taking some Python courses at present via Coursera; I can get the first couple free via work.
> Anyone built up their statistics skills for data science, and have any recommendations there?

What exactly is a data lake? I keep hearing it mentioned in some large, very slow project at my company, but nobody seems to know what they are actually building.