Tunable Consistency in Cassandra NoSQL
When it comes to data consistency, most relational databases give you one choice. An RDBMS enforces a strongly consistent model in real time, so every user sees the same, current data. John will never be looking at a version of a record that is older than the one Mary is looking at.
However, in a fast-paced, real-time world, options are useful. An occasional inconsistent read can be more than offset by the ability to keep up with the velocity of the ingest. The options in this area are one reason many turn to NoSQL databases for modern operational deployments.
Cassandra, commercially supported by DataStax, is a NoSQL column store, often described as a Bigtable clone. It was created by none other than Facebook and donated to Apache. Cassandra has some noteworthy adopters, including Netflix, eBay, and Twitter. It features high write performance and multi-data center support, and is optimized for solid-state storage.
Cassandra’s use of “TABLE” takes some getting used to if you’re coming from the relational world. It uses TABLE to mean a column family, which is a grouping of columns that have fairly homogeneous access patterns. Column families occupy their own parts of the cluster, and their definitions act as metadata that tells the cluster which column family each column belongs to when it arrives in an insert.
Although Cassandra offers parameters that are tuned at the column level, HBase, another column store, offers more, such as the number of versions to retain, how long to keep a value, compression, and in-memory options. The historical data options make HBase especially useful for time-series analysis.
Though all the major NoSQL offerings have tunable consistency options, Cassandra gives you the most, I believe. To understand the options, you must first understand the replication factor, which is the number of nodes that hold a copy of each piece of data. Although 3 is the common default, for localized performance and failover you can deploy a higher number of replica nodes in the cluster.
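The interplay between replication factor and consistency level comes down to simple arithmetic, which the following Python sketch illustrates. It does not talk to a cluster. The function names are hypothetical and chosen only for this example, it covers a single data center, and it handles just the ONE, TWO, THREE, QUORUM, and ALL levels. The rule it encodes is the usual one: a read is guaranteed to overlap the latest write when the replicas read plus the replicas written exceed the replication factor.

```python
# Illustrative only: helper names are hypothetical, not part of any Cassandra API.
# Single data center, and only a subset of Cassandra's consistency levels.

FIXED_LEVELS = {"ONE": 1, "TWO": 2, "THREE": 3}


def replicas_required(level: str, rf: int) -> int:
    """Return how many replicas must acknowledge an operation at this level."""
    level = level.upper()
    if level in FIXED_LEVELS:
        needed = FIXED_LEVELS[level]
    elif level == "QUORUM":
        needed = rf // 2 + 1  # a majority of the replicas
    elif level == "ALL":
        needed = rf
    else:
        raise ValueError(f"unsupported consistency level: {level}")
    if needed > rf:
        raise ValueError(f"{level} needs {needed} replicas but RF is {rf}")
    return needed


def read_sees_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when the read set and the write set must share at least one replica."""
    r = replicas_required(read_level, rf)
    w = replicas_required(write_level, rf)
    return r + w > rf


if __name__ == "__main__":
    rf = 3
    for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        overlap = read_sees_latest_write(read_level, write_level, rf)
        print(f"RF={rf} read={read_level} write={write_level} overlap={overlap}")
```

With a replication factor of 3, reading and writing at ONE touches too few replicas to guarantee overlap, so a read may return stale data in exchange for speed. QUORUM on both sides, or ONE paired with ALL, guarantees overlap at the cost of waiting on more nodes.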