NoSQL: Not Just for Big Data
Advocates of NoSQL solutions have positioned them as supporting the class of data we’ve come to know as “big data.” Big data is data at a scale of volume, velocity and variety where NoSQL projects provide price-performance benefits over traditional platforms for storing it. This positioning pigeonholes NoSQL, with both good and bad consequences.
On the one hand, big data has become quite important and is playing a larger role in enterprise information strategies. Though it still comprises a relatively small portion of most Global 2000 information technology budgets, it is a fast-growing category that demands attention. It is certainly getting attention from the investment community, which is frequently a harbinger of IT spend. On the negative side, the positioning discourages any thought of NoSQL for non-big (little?) data.
The latter point is almost completely valid, since the value proposition grows rapidly with higher volumes of data to manage, higher velocity of data to ingest or analyze and greater variety of data, especially new data types such as sensor, social and web clickstream data. There is no function I have come across in study and practice that NoSQL solutions perform that cannot be done in the relational world, albeit sometimes expensively and with difficulty. On the flip side, there are many more things possible in the relational world that cannot be done in NoSQL.
However, there are some NoSQL functions that would be very difficult to replicate in the relational world. Those are the functions of the graph database category. These functions are the “asterisk” on the statement that NoSQL is best suited for just big data. The nature of the relationship data stored in a graph database lends itself to a level of performance for that data that is attainable only with these specialized solutions.
Graph databases are poised to expand dramatically in the next few years as what is important and worth saving in an enterprise has expanded beyond alphanumeric data and into relationships. Network databases, the distant predecessor to graph databases, lost their luster when number crunching became the key part of most workloads. But in a highly connected world, where power has reverted to the individual in control of their relationships and where “group think” is involved in those relationships, it is imperative to understand them. The nodes in a network could all be people, in order to understand their relationships, or they could be a variety of objects whose relationships need to be navigated quickly, such as name, address and order.
Graph databases are the best technology for quickly determining who will do what next. My graph database client in retail uses the social graph for churn management and for developing promotion lists. The graph uses in finance that I know about include churn management as well, along with investigating corruption and fraud and how trades and other series of events are related.
Social networks are the obvious fit. When a telecommunications customer leaves a vendor, his or her peer group is highly likely to do the same. Intervention is required at this point. Similarly, receptivity to promotion tends to be shared across the social network. Configurations and recommendations can also have complex relationships that require a graph database to answer relationship queries with high performance. Queries are traversals of the graph.
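To make “queries are traversals” concrete, here is a minimal sketch in plain Python rather than in any particular graph database product. It assumes the social graph fits in an in-memory dictionary, and the customer names, the peers_within function and the two-hop cutoff are all hypothetical, chosen only for illustration. Starting from a customer who has just churned, it walks outward breadth-first and collects the peers who may warrant intervention.

```python
from collections import deque


def peers_within(graph, start, max_hops):
    """Breadth-first traversal: map each peer to its hop distance from start.

    graph is a dict of customer -> list of directly connected customers
    (an illustrative stand-in for the edges a graph database would store).
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # Do not expand past the hop limit.
        if distances[current] == max_hops:
            continue
        for neighbor in graph.get(current, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    del distances[start]  # the churned customer is not their own peer
    return distances


# Hypothetical social graph; every name here is made up.
social_graph = {
    "ann": ["bob", "cara"],
    "bob": ["ann", "dev"],
    "cara": ["ann"],
    "dev": ["bob", "eli"],
    "eli": ["dev"],
}

# "ann" has just left the vendor: who is within two hops of her?
at_risk = peers_within(social_graph, "ann", max_hops=2)
print(at_risk)  # {'bob': 1, 'cara': 1, 'dev': 2}
```

In a relational schema, each additional hop would typically be another self-join on a relationships table; in the sketch, as in a graph database, each hop is simply a matter of following the edges already attached to the current node.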
So when you are considering your NoSQL needs, you have to consider the asterisk after “big data.” Graph databases are out to change your relationship with NoSQL projects.
I look forward to teaching Introduction to Graph Databases at TDWI San Diego on September 22, 2015.