Having recently posted a question asking "Why are there so many NoSQL databases?" over at Hacker News, I thought it would be useful to summarise the responses and draw out any common threads.

Background

There is a "standard set" of traditional databases if you are developing a web (or non-web, of course) application: MySQL, PostgreSQL, SQLite, and perhaps Oracle and SQL Server if you're in an enterprise environment.

However, with the NoSQL anti-database movement gaining momentum, people are starting to look towards the new schema-free, key-value and document store databases that are hitting the market.

The problem is the proliferation of NoSQL options, and trying to boil everything down to understand which option suits your needs most closely.

Why are there so many NoSQL options?

Without delving into too much detail, there have been many recent innovations ^1 in the field (Facebook created Cassandra, Google created BigTable and MapReduce, Amazon created SimpleDB and LinkedIn created Project Voldemort). These innovations came about to solve the relatively new challenge of scaling web applications ^2 for millions of users.

The general view is that things will settle down in the future, with a few clear front-runners emerging next year ^3. In the meantime, a useful and related analogy is that of SQL. There were many different ways of communicating with relational databases, and a common syntax was needed ^4; SQL was the result of compromise and common ground between all those different query languages.

Whilst I don't think that NoSQL projects will merge to create a common system, it does seem likely that some will lag behind in their development, and be superseded by the better-engineered solutions.

How to decide on a NoSQL option

In the meantime, all you can do is read, read, read. No one is going to tell you which path you should take; you need to research it yourself and check it against your requirements before committing.

Remember that there are three general camps for NoSQL systems:

  • Key-value stores
  • Column stores
  • Document stores

And there is now Redis, which straddles these camps.
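To make the three camps a little more concrete, here is a rough sketch of how the same user record might be shaped in each model. It uses plain Python dicts only; the record, the names (kv_store, column_store, document_store) and the helper functions are all made up for illustration and are not the API of any real database.

```python
import json

# Key-value store: the value is an opaque blob, fetched only by its key.
kv_store = {"user:42": json.dumps({"name": "Ada", "city": "London"})}

# Column store: rows are looked up by key, and each row holds its own
# sparse set of named columns (row 43 simply has no "city" column).
column_store = {
    "users": {
        "42": {"name": "Ada", "city": "London"},
        "43": {"name": "Linus"},
    }
}

# Document store: nested, self-describing documents that can be
# queried by the fields they contain.
document_store = {
    "users": [
        {"_id": 42, "name": "Ada", "address": {"city": "London"}, "tags": ["admin"]},
    ]
}


def kv_get(key):
    """Fetch by key; the application decodes the blob itself."""
    return json.loads(kv_store[key])


def column_get(row_key, column):
    """Read one column from one row; missing columns come back as None."""
    return column_store["users"][row_key].get(column)


def document_find(field, value):
    """Return every document whose top-level field equals value."""
    return [doc for doc in document_store["users"] if doc.get(field) == value]
```

The difference to notice is what each model lets you ask: the key-value sketch can only fetch by key, the column sketch can read individual columns of a row, and the document sketch can search on the contents of the record.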

My advice...

A good jumping-off point is NOSQL: scaling to size and scaling to complexity, which gives a good high-level overview of the concepts. After that, browse some of the posts over on MyNoSQL to see which projects are active and what new technologies are being added.

Good luck, and thanks to all the respondents on Hacker News.