NoSQL: Scaling beyond traditional relational database systems

The latest trend for highly trafficked websites is to move away from MySQL in favor of distributed storage systems, most commonly referred to as NoSQL databases. Google has been in the NoSQL camp for quite a while with Bigtable. Digg recently announced their plans to use Cassandra going forward. Amazon built a similar distributed key-value store called Dynamo for its own services, such as the shopping cart.

The problem with MySQL is that it’s difficult to scale, and at high traffic you end up relying on patches from Percona and Google. It’s also not the ideal solution for storing terabytes of data: scalability, availability and backups all become big challenges.

The challenge with NoSQL systems is that ad hoc query capability is generally left up to the imagination of the engineers, meaning it doesn’t really exist without building special systems on top of the data store. Google addresses this with MapReduce, its model for applying a set of functions to a huge dataset to produce a derived dataset.
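To make the MapReduce idea a bit more concrete, here is a minimal single-process sketch in Python. It is not Google's implementation, which runs distributed across many machines. It only shows the shape of the model: a map function emits key/value pairs, the pairs are grouped by key, and a reduce function collapses each group into the derived dataset. The names `map_reduce`, `count_hits` and `total`, and the log format, are made up for illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Toy in-memory MapReduce: map, group by key, then reduce."""
    groups = defaultdict(list)
    # Map phase: each record may emit any number of (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: collapse the values collected for each key.
    return {key: reducer(key, values) for key, values in groups.items()}


def count_hits(log_line):
    # Assumed (hypothetical) log format: "<path> <status>"
    path, _status = log_line.split()
    yield path, 1


def total(_key, values):
    return sum(values)


logs = ["/home 200", "/about 200", "/home 404"]
hits = map_reduce(logs, count_hits, total)
print(hits)  # {'/home': 2, '/about': 1}
```

The point is that a question like "how many hits per page?" would be a one-line GROUP BY in SQL, but against a plain key-value store you have to write and run something like this yourself.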

It’s a very interesting problem. As the internet grows and the quantity of information increases, we’re outgrowing our traditional relational database systems.
