I was reading about comparisons between SQL, NoSQL and NewSQL solutions, and found some interesting articles.
First, my research led me to this article on GigaOM.com about the massive number of database solutions in the marketplace today, and their categorization into Traditional SQL, NoSQL and NewSQL categories, among others. That led me to read more on the451group.com about their research and mapping of the new database solutions that compete with the traditional ACID-based RDBMS paradigm. Check out their recent Database Landscape Map:
As I spent more time getting a brief overview of some of the more recognizable systems (e.g. Cassandra, Clustrix, VoltDB, Nuo4j) and their pros/cons, I stumbled upon this presentation on SlideShare by Chris Richardson. Slide 21 visualizes a “polyglot database architecture”. In his example, Netflix uses a combination of a traditional RDBMS, SimpleDB, Cassandra and Hadoop/HBase. He goes on to point out that a polyglot architecture requires evaluating the specialized data management needs of each part of an application, which leads to an architecture that uses the right data management solution for the right job. So, how does one go about evaluating the specialized data management needs of application components, and how does one map those to the glut of new data solutions?
Go back to the 451 landscape map above, and one’s head can spin right off.
So, I found another SlideShare presentation by Akmal Chaudhri that lays out a set of considerations to follow when evaluating alternative data management solutions. His slide deck is a daunting 213 slides long, but it’s all good information. Right around slide 115, he starts presenting use cases for NoSQL that lay a foundation of considerations.
I dropped by the offices of my former employer (Neudesic) to catch up with a few folks, and ran into a friend there who’s working on a new product/platform in their research group. I wanted to compare notes with him on his data management architecture. Lo and behold, they are using a polyglot database architecture, leveraging a combination of 3-4 different database systems that fall into the RDBMS SQL and NoSQL categories. Each system was hand-selected to meet specific requirements for various components of their platform. Nice to see that I am on the right track, and it’s nice to compare notes with people who are making ready use of this design pattern.
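As a rough illustration of that kind of per-component selection, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the component names, the three yes/no needs, the store categories and the selection rules are all hypothetical, and a real evaluation would weigh many more criteria (consistency, query patterns, scale, operations cost).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Data management needs of one application component (hypothetical fields)."""
    needs_acid_transactions: bool
    write_heavy: bool
    batch_analytics: bool


def pick_store_category(needs: Needs) -> str:
    """Map a component's needs to a broad store category.

    These rules are illustrative assumptions, not a recommendation.
    """
    if needs.batch_analytics:
        return "batch/analytics store"
    if needs.needs_acid_transactions:
        return "relational (SQL or NewSQL)"
    if needs.write_heavy:
        return "distributed NoSQL store"
    # Default to the familiar option when nothing special is required.
    return "relational (SQL or NewSQL)"


# Hypothetical components of an application, each evaluated on its own.
components = {
    "billing": Needs(needs_acid_transactions=True, write_heavy=False, batch_analytics=False),
    "activity_feed": Needs(needs_acid_transactions=False, write_heavy=True, batch_analytics=False),
    "reporting": Needs(needs_acid_transactions=False, write_heavy=False, batch_analytics=True),
}

plan = {name: pick_store_category(needs) for name, needs in components.items()}

for name, category in plan.items():
    print(f"{name}: {category}")
```

The point of the sketch is only that the decision is made per component, so one application can end up with several different store categories in its plan.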
Conclusion: I will no longer be looking for the “One DB to Rule Them All” in the big data space. The polyglot strategy drives a certain level of conscious and deliberate design thinking, while the one-DB strategy sub-optimizes the solution. So, a deeper investigation into the categorization and taxonomy of NoSQL and NewSQL solutions is in order. Understanding the principal categories, features and differences between systems will lead to stronger evaluation criteria. The451Group seems to be a great place to start, and I intend to spend a lot more time on their web site learning.