A Pulse of Data Science
I joined a @GreaterIBM TweetChat today on the subject of Data Science. Here are some of the highlights.
We started with a definition.
@JamesKobielus, IBM’s Big Data Evangelist, defined data science as “the process of exploring, modeling, & mining data sets using stat methods to find non-obvious patterns” while @KirkDBorne pointed out that it is a “methodology & skillset that extracts information from data, knowledge from info” and a scientific process to avoid biases and pitfalls in the ocean of big data. He added it’s “many different things to diff people, but for me it is combo of math+CS+Stats+Domain+Science.”
@thomasdeutsch pointed out that “IDC: 2013 – 2020 90% of IT growth from #cloud, #bigdata, #mobile and #analytics. All disruptive = #datascience needed.”
Why is data science growing?
James stated that data science is growing rapidly because “advanced analytics development/modeling skills are key to unlocking #bigdata biz value”. My take was that we have more data to deal with, it costs less to store it all, and information underlies most elements of company strategy.
As @daviottenheimer pointed out, “#datascience job postings increased 15,000% last years and will grow 18.7% next 10 years. 2nd only to game dev”. So what skills are necessary to fill these postings?
James tweeted “a biz profession is 1 part stat modeling, 1 part SME, 1 part app dev. Exploratory and/or operational apps.” My input was “Behaviors and judgment are key to good #datascience. Then skills.”
We then explored some “hot areas” within data science.
Thomas tweeted that “#bigdata probably hottest specialty for #datascience right now but I’m not sure it should be. Much low hanging fruit still.” @sumeetkad provided a nice list of hot areas with “#datascience specialty: visualization, uncertainty modeling, data warehousing, pattern recognition & high performance computing.”
Cognitive computing was also mentioned, as was machine learning.
I chimed in with “Hadoop data/predictive analytics”.
The next question posed was “What type of #skills, education, and aptitude are needed to become a #datascientist?”
I said creativity was huge, noting that it should not be lost in the search for logical greatness.
Thomas answered with “Thinking like a #datascience professional starts with asking good questions.”
James noted that a good business-oriented #datascientist needs to be able to collaborate, follow procedures, document work, and adopt standards.
In terms of hard skills, R was mentioned. Other hard skills that came up in earlier answers apply here as well, such as Hadoop, statistical modeling and big data architecture.
The chat was a good place to take the pulse of data science, and I hope I have shared that with you here. Data science is worth figuring out now. I’ll close with one of my tweets: “Early adopters who figure it out will reap the majority of the gains from #datascience. Now is that time.”
This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet. I’ve been compensated to contribute to this program, but the opinions expressed in this post are my own and don’t necessarily represent IBM’s positions, strategies or opinions.