NoSQL Hashing: How Couchbase Does It
In my last Upside article, I discussed a feature of NoSQL databases that sets them apart and offers options that relational databases cannot. That feature was tunable consistency, and I used Cassandra as the example.
Today, I will continue that theme by looking at how the NoSQL database Couchbase distributes data across its nodes, the mechanism that allows NoSQL databases to provide low-cost elastic scalability.
A Flexible Document Database
Couchbase is a NoSQL database in the document category, although it started life as a key-value database and can still be used that way. It features JSON storage — XML and other data types are also possible.
JSON can make every document (the NoSQL equivalent of a relational record) unique, and it fixes the sparse data problem common to relational databases, whereby missing or irrelevant fields are padded with nulls or dummy values. It also easily allows documents to be “interleaved” in a meaningful fashion, as when an order document is followed by all of its line-item documents.
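To make the sparse data point concrete, here is a minimal sketch in Python. The order and line-item documents and all of their field names are hypothetical, chosen only for illustration. It compares storing each document as-is with forcing the same data into one fixed set of columns, as a single relational table would.

```python
import json

# Hypothetical documents: one order followed by its line items.
# All field names here are illustrative assumptions.
order = {"type": "order", "order_id": 1001, "customer": "Acme",
         "gift_note": "Happy birthday"}
line_items = [
    {"type": "line_item", "order_id": 1001, "sku": "A-17", "qty": 2},
    {"type": "line_item", "order_id": 1001, "sku": "B-04", "qty": 1,
     "discount": 0.1},
]
documents = [order, *line_items]

# A single relational table needs one fixed column set, so every field
# a record lacks has to be padded with NULL (None in Python).
columns = sorted({key for doc in documents for key in doc})
rows = [[doc.get(col) for col in columns] for doc in documents]
null_count = sum(cell is None for row in rows for cell in row)

# As documents, each record carries only the fields it actually has.
for doc in documents:
    print(json.dumps(doc))

print(f"columns in the fixed layout: {len(columns)}")
print(f"padded NULLs in the fixed layout: {null_count}")
```

The order document sits directly ahead of its line items, which is the kind of interleaving described above.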
The primary purpose of a document database is to enable high-performance read and write operations without requiring an identifier or key to access the data, as key-value database queries do. Couchbase supports indexes in B-tree and other styles.
Many of Couchbase's founders came from the Memcached project. Memcached is itself a NoSQL database, but it is mostly known today as the in-RAM memory store in widespread use on the Internet. Couchbase inserts first go into Memcached and are later written to disk asynchronously in the background, completely decoupled from your action.
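The memory-first, disk-later write path can be sketched in a few lines of Python. This is a toy illustration of the general write-behind pattern, not Couchbase's implementation: the class name `WriteBehindStore`, its methods, and the dictionary standing in for the disk are all assumptions made for the example.

```python
import queue
import threading


class WriteBehindStore:
    """Toy write-behind store: writes land in memory, then reach 'disk' later."""

    def __init__(self):
        self.memory = {}               # in-RAM store, updated synchronously
        self.disk = {}                 # stand-in for persistent storage (assumption)
        self._pending = queue.Queue()  # writes waiting to be persisted
        worker = threading.Thread(target=self._flush_loop, daemon=True)
        worker.start()

    def set(self, key, value):
        # The caller returns as soon as memory is updated; persistence
        # happens on the background thread.
        self.memory[key] = value
        self._pending.put((key, value))

    def get(self, key):
        return self.memory.get(key)

    def _flush_loop(self):
        while True:
            key, value = self._pending.get()
            self.disk[key] = value     # a real system would write to disk here
            self._pending.task_done()

    def wait_until_persisted(self):
        # Blocks until every queued write has been flushed.
        self._pending.join()


store = WriteBehindStore()
store.set("order::1001", {"customer": "Acme"})
print(store.get("order::1001"))        # readable from memory immediately
store.wait_until_persisted()
print(store.disk["order::1001"])       # now present in the disk stand-in
```

The trade-off in this pattern is that a write acknowledged from memory may not yet be on disk if the process stops before the background flush runs.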
Clusters and Buckets
Couchbase provides a vast number of metrics on its cluster, which, of course, must first be populated with data. Like other NoSQL databases, Couchbase has an algorithm for randomizing the distribution of that data across the cluster.
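One common way to randomize distribution is to hash each document key into a fixed number of partitions and then look up which node owns that partition. The Python sketch below illustrates that general idea under stated assumptions: the partition count, the node names, the use of CRC32, and the round-robin partition map are all choices made for the example, and the text above does not confirm them as Couchbase's exact algorithm.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 1024                    # assumed partition count
NODES = ["node-a", "node-b", "node-c"]   # hypothetical node names

# Partition map: each partition is owned by one node. Here the partitions
# are simply dealt out round-robin; a real cluster manager maintains this map.
partition_map = {p: NODES[p % len(NODES)] for p in range(NUM_PARTITIONS)}


def partition_for(key: str) -> int:
    """Hash the key to a partition number (CRC32 is an assumption here)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_PARTITIONS


def node_for(key: str) -> str:
    """Find the node that owns the key's partition."""
    return partition_map[partition_for(key)]


# Spread some hypothetical document keys and count how many land on each node.
keys = [f"order::{n}" for n in range(10_000)]
per_node = Counter(node_for(k) for k in keys)
for node in NODES:
    print(node, per_node[node])
```

Because keys map to partitions rather than directly to nodes, adding a node in this scheme means reassigning some partitions in the map. The hash of each key stays the same, which is what makes scaling out comparatively cheap.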