NoSQL - The Great Escape from SQL and Normalization

For the last decade or so, data and data structures have been moving at the speed of the web. They change rapidly to keep pace with end-users and markets that are in constant flux. The data model is volatile and will continue to be as more and more unstructured data (images, videos, social media content, online purchase histories, and more) is generated. The solution of the pre-Internet era, meaning the RDBMS, can’t keep up. Enter the NoSQL database—a solution for handling data with highly variable formats in massive quantities at lightning speeds.

The Rich, New World of NoSQL

NoSQL databases or stores come in four flavors: key value, document, wide column, and graph. The first two are arguably the most popular. Among NoSQL document stores, for example, MongoDB was recently classed as the fifth most popular database in the world out of all databases, including the relational/SQL kind. Key value data stores look up each piece of data by a single key and treat the value stored under it as opaque. Better-known examples include Redis and Riak. Document data stores can work with multiple fields per record (for example, the title, author, and date of a document) and with nested data. Besides MongoDB, Couchbase is another example. Cassandra, which is often mentioned alongside them, is a wide column store. In general, the NoSQL classification refers to "Not-Only-SQL," rather than "not SQL."
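To make the difference between the first two flavors concrete, here is a minimal sketch in Python. It uses plain dictionaries as stand-ins for real stores, so nothing here is the API of Redis, Riak, MongoDB, or Couchbase. The sample records and the `find` helper are hypothetical and exist only for illustration.

```python
# Toy illustration only: plain Python dicts standing in for real stores.

# Key value store: one key maps to one value, and the store does not
# look inside that value. Here the value is just an opaque string.
kv_store = {}
kv_store["session:42"] = '{"user": "alice", "cart": ["book"]}'

# Document store: each record is a set of named fields, possibly nested,
# and the store can query on those fields.
documents = [
    {
        "title": "NoSQL Basics",
        "author": "A. Writer",
        "date": "2015-01-10",
        "publisher": {"name": "Example Press", "city": "Boston"},
    },
    {
        "title": "SQL Revisited",
        "author": "B. Author",
        "date": "2015-02-03",
        "publisher": {"name": "Sample House", "city": "Austin"},
    },
]


def find(docs, path, value):
    """Return the documents whose field at the dotted `path` equals `value`."""
    results = []
    for doc in docs:
        node = doc
        for part in path.split("."):
            # Walk one level deeper; stop if the field is missing.
            node = node.get(part) if isinstance(node, dict) else None
            if node is None:
                break
        if node == value:
            results.append(doc)
    return results


# The key value store can only answer "what is stored under this key?"
session_blob = kv_store["session:42"]

# The document store can answer questions about nested fields.
boston_titles = [d["title"] for d in find(documents, "publisher.city", "Boston")]
```

The key value lookup is as simple as it gets, which is a large part of why such stores are fast. The document query costs more but lets the application ask about the contents of a record without defining a rigid schema first.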

How and where are they used?

NoSQL databases are used when data models become too rich and complex for relational databases. They offer different compromises for different applications, although, unlike an RDBMS, they aren’t designed to offer the complete set of ACID properties. Performance can be blisteringly fast. Redis runs in (main) memory and can handle billions of session information records; however, durability is not guaranteed. For example, products may occasionally fall out of an online shopping cart. MongoDB and Couchbase offer better durability, but at some cost in performance.
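The trade-off between in-memory speed and guaranteed storage can be sketched with a toy model. This is not how Redis or any other product actually persists data. The `InMemoryStore` class and its method names are made up for this example, and the "disk" is just a second dictionary.

```python
import copy


class InMemoryStore:
    """Toy store: fast writes to memory, occasional snapshots to 'disk'."""

    def __init__(self):
        self.memory = {}    # working data, lost on a crash
        self.snapshot = {}  # stands in for the last copy saved to disk

    def put(self, key, value):
        # Writes touch memory only, which is what makes them fast.
        self.memory[key] = value

    def save_snapshot(self):
        # Persist the current state (assumed to happen only now and then).
        self.snapshot = copy.deepcopy(self.memory)

    def crash_and_restart(self):
        # After a crash, only the last snapshot can be recovered.
        self.memory = copy.deepcopy(self.snapshot)


cart_store = InMemoryStore()
cart_store.put("cart:alice", ["book"])
cart_store.save_snapshot()
cart_store.put("cart:alice", ["book", "lamp"])  # written after the snapshot
cart_store.crash_and_restart()

recovered_cart = cart_store.memory["cart:alice"]  # the lamp is gone
```

Saving a snapshot after every write would close the gap, but every write would then pay for the slower storage, which is the durability-for-performance exchange in miniature.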

Performance of MongoDB is still good enough, though, to be the back-end software for several high-profile enterprise sites like Craigslist and eBay. Riak is at the heart of the Medical Records Store of the Danish Health Authorities, serving the medical prescription histories of all citizens of Denmark. Redis is used as a data structure cache for Hulu, the online video company. It stores around 4 billion records and responds to about 7,000 queries per second.

If it’s that good, why isn’t everybody using it?

In a nutshell, if NoSQL databases do not offer ACID transactions, they cannot be guaranteed to reliably handle purchases, payments, reservations, or similar items. NoSQL systems are sometimes described as BASE systems. (Insert ACID-BASE chemistry joke here.) BASE stands for Basically Available, Soft state, Eventual consistency. BASE systems make sure they are always in a position to accept transactions and may make guesses or optimistic assumptions for the sake of performance. Unlike ACID systems, BASE systems may then have to revisit transactions later to clear up any lingering inconsistencies (hence the description of "eventual consistency").
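Eventual consistency is easier to see in a small simulation. The sketch below is a toy model with two replicas, not the replication protocol of any real NoSQL product. The class names, the `sync` step, and the seat reservation data are all hypothetical.

```python
class Replica:
    """One copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class EventuallyConsistentStore:
    """Toy BASE-style store: accept writes at once, propagate them later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(name) for name in replica_names]
        self.pending = []  # (replica_index, key, value) not yet propagated

    def write(self, key, value):
        # The first replica accepts the write immediately, so the store
        # stays available without waiting for the other copies.
        self.replicas[0].data[key] = value
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, value))

    def read(self, key, replica_index):
        # A read may hit a replica that has not caught up yet.
        return self.replicas[replica_index].data.get(key)

    def sync(self):
        # Revisit outstanding updates so that all replicas converge.
        for index, key, value in self.pending:
            self.replicas[index].data[key] = value
        self.pending.clear()


seats = EventuallyConsistentStore(["primary", "secondary"])
seats.write("seat:12A", "reserved")

stale_read = seats.read("seat:12A", 1)  # secondary has not seen the write
seats.sync()
fresh_read = seats.read("seat:12A", 1)  # replicas now agree
```

Between the write and the sync, a second customer reading from the lagging replica would still see seat 12A as free. That window is why reservations and payments are usually kept on systems with ACID guarantees.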

My next post will look at a solution designed to combine the benefits of an RDBMS and a NoSQL database: the next-generation SQL database.

If you haven't already, take a look at my previous blog in this series, "Why Won't the RDBMS Go Away?"

