Get More Out of Hadoop by Building Your User Community
If your company is standing up Hadoop, you’ll want to get the most out of its rich data. That work doesn’t end with the first user or the first user group. As with data warehouses, users will eventually fall into multiple categories. Although the Hadoop user base is often data-science heavy at first, Hadoop builders should cultivate users across the enterprise so that the valuable data they are making available actually gets used.
Four categories will make up your Hadoop user community, and each one interacts with Hadoop in a specific way.
Data Scientists
Scientists with the statistical and applied mathematical expertise to analyze data for insights — the ability to extract signal from noise — are more critical than ever in Hadoop environments. Most Hadoop projects, of course, involve processing big data to find a relatively minute amount of signal.
Hadoop’s ability to rapidly crunch through enormous amounts of data makes it economically feasible to extract these insights. In addition, data scientists may discover patterns that aren’t evident in smaller data sets.
These members of your team investigate the value of various big data sources, which in Hadoop environments means mastering a wider range of tools and analytic techniques.
Creating queries and guiding machine-learning algorithms, they discover data patterns and relationships that could potentially be useful for BI or for building predictive or descriptive analytic models. They determine which data looks interesting enough to justify further analysis and build logical views (e.g., Hive tables) on top of the data to facilitate queries by themselves and other users.
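As a rough illustration of what building a logical view on top of the data can look like, the sketch below uses Python to assemble a HiveQL `CREATE EXTERNAL TABLE` statement, which lays a schema over files already sitting in HDFS without moving them. The helper `build_external_table_ddl`, the table name `web_clicks`, its columns, and the path `/data/raw/web_clicks` are all hypothetical and chosen only for illustration. The script only generates the statement text; running it against a cluster would be a separate step.

```python
def build_external_table_ddl(table, columns, location, delimiter=","):
    """Return a HiveQL statement that lays a table schema over existing files.

    `columns` is a list of (name, hive_type) pairs. All names passed in are
    assumed to be trusted, already-validated identifiers.
    """
    column_defs = ",\n  ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n"
        f"  {column_defs}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


# Hypothetical clickstream data: table name, columns, and path are illustrative.
ddl = build_external_table_ddl(
    table="web_clicks",
    columns=[("user_id", "STRING"), ("url", "STRING"), ("event_time", "BIGINT")],
    location="/data/raw/web_clicks",
)
print(ddl)
```

Once a table like this exists, other users can query it by name (for example, `SELECT url, COUNT(*) FROM web_clicks GROUP BY url`) without needing to know where or how the underlying files are stored.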
For the rest of the article, please see Upside.com.