Digg has gotten itself into an awkward place with its latest redesign, Digg v4. The users are mad, the content has taken a turn for the worse, and now they’ve decided to fire their VP of Engineering over the whole debacle.
To be honest, I never really understood why Digg had to move to Cassandra. There are plenty of really big players that use sharded MySQL to scale very effectively, Facebook being one of them. However, in reading this blog post it kind of makes sense. Joins are very expensive, and they’re working with some seriously huge datasets. But then again, the whole reason they started working with Cassandra is so they could add the green badge feature (green badge == one of your friends dugg the story). Is that really a necessary feature? This is reminiscent of Friendster’s failure:
The problem might have been solved if someone had reworked the software to ignore distant connections–for example, by calculating only connections between friends. But Friendster’s engineers were so preoccupied with day-to-day slowdowns that they neglected to step back and ask what was causing them.
When engineers are left unchecked, they will try to solve every problem that comes up with a technology solution. In this case the green badge feature probably is a necessary feature, but that doesn’t mean the entire system had to move to Cassandra at the very same time they were deploying a new version of their site. In hindsight it probably would have been a lot smarter to either change features or change database technology. Doing both at the same time has proven to be perilous.