Amazon has announced its next software-as-a-service play – SimpleDB – and the technical world is all a-flutter. You can see the breathless reporting all over the net. Is this the beautiful panacea of unleashed database power it’s being reported as?
A few months back, I examined S3 and EC2 (two of Amazon’s earlier web service offerings) and came away with the sense that Amazon is changing the rules of the game in a big way, but that there is still no good way to implement an online scalable database with these services. (I know that there are attempts to put a relational database on EC2, but they seem to be quite painful.)
In short, S3 is a great scalable online file system, and EC2 is a great scalable processor. Neither one of them allows us the sort of slicing, dicing, remixing, and re-serving of data that the world has come to expect from a database.
So when the news of SimpleDB hit the wires, I figured that Amazon was stepping up to answer my cry and providing an online scalable database. Well, they are, and they aren’t.
They are providing a way to store structured information online, but it’s hard to call it a database. In fact, it’s a bit disingenuous of them to do so.
Those who are concerned about the details can quickly find out for themselves that SimpleDB has nothing to do with the Relational Model that has been the basis for databases for the past 40 years. To call something that doesn’t even smell like the relational model a database is pure marketing.
But let’s leave the marketing aside – the service is in the field, it’s a totally new beast, and it’s called what it’s called. What does it look like? It looks like a place to store object instances: no classes, no schema, just instances that are exactly what they look like. Besides coming with a whole new metamodel, it comes without a lot of the sugar that mature database systems have led us to expect. There’s no full-text search, queries compare values lexicographically (so they don’t deal well with numbers or dates), the set of query operators is more limited than we’re used to, there’s no verification of the data, no triggers, etc.
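To see why lexicographic comparison hurts, here is a minimal sketch in plain Python. It does not call SimpleDB at all; it only compares strings the way a lexicographic query would. The `zero_pad` helper and its width are my own illustration of one common workaround (storing numbers as fixed-width, zero-padded strings), not something taken from Amazon's API.

```python
def zero_pad(n: int, width: int = 5) -> str:
    """Hypothetical helper: encode a non-negative integer as a fixed-width string."""
    if n < 0:
        raise ValueError("this simple padding scheme only handles non-negative integers")
    return str(n).zfill(width)


prices = [9, 10, 100, 25]

# Stored as plain strings, the values sort character by character.
naive = sorted(str(p) for p in prices)
print(naive)  # ['10', '100', '25', '9']

# A "price greater than 5" filter on those strings keeps only '9'.
matches = [s for s in naive if s > "5"]
print(matches)  # ['9']

# Zero-padded to a fixed width, string order agrees with numeric order.
padded = sorted(zero_pad(p) for p in prices)
print(padded)  # ['00009', '00010', '00025', '00100']
```

The same trick covers dates if they are stored in a fixed-width, most-significant-first format such as 2007-12-14. Either way, it is the application that has to remember to encode and decode.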
All this is fun, but the real kicker is this – reading data from SimpleDB immediately after a write may not reflect the latest updates. This is called eventual consistency. That’s what you tell your customers – it’ll get there eventually.
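What does covering that gap look like in practice? Here is a rough sketch, using a toy in-memory store that I made up to mimic the lag. `LaggyStore`, `lag_reads`, and `read_until` are hypothetical names for illustration only; none of this is the SimpleDB API, and a real replica would lag by time rather than by a fixed number of reads.

```python
import time


class LaggyStore:
    """Toy stand-in for an eventually consistent store (not the SimpleDB API)."""

    def __init__(self, lag_reads=2):
        self._visible = {}           # what readers can currently see
        self._pending = {}           # writes that have not "propagated" yet
        self._lag_reads = lag_reads  # number of stale reads after each write

    def put(self, key, value):
        self._pending[key] = (value, self._lag_reads)

    def get(self, key):
        if key in self._pending:
            value, remaining = self._pending[key]
            if remaining <= 0:
                # The write has finally become visible.
                self._visible[key] = value
                del self._pending[key]
            else:
                self._pending[key] = (value, remaining - 1)
        return self._visible.get(key)


def read_until(store, key, expected, attempts=5, delay=0.0):
    """Poll until the expected value shows up; return the attempt number, or None."""
    for attempt in range(1, attempts + 1):
        if store.get(key) == expected:
            return attempt
        time.sleep(delay)
    return None


store = LaggyStore(lag_reads=2)
store.put("order-17", "shipped")
print(read_until(store, "order-17", "shipped"))  # 3
```

Polling like this is only one option, and a crude one. The point is that somebody on your team now has to write and maintain code of this kind, where a traditional database would simply have returned the row you just wrote.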
What happens now? What happens now is that the developers start to relearn the way they handle data. No existing database applications – back-end, front-end, or middleware – can be easily ported to run on top of this new beast. You can’t tell your customers that the data will get there eventually, so you tell your developers to cover the gap. This service might save you a database administrator, but in the near term, you’ll need another developer to take his place.
In the long term, we’ll start seeing SimpleDB, S3, and EC2 aggregated under another layer – one that presents the tried and true relational model. SimpleDB will handle the tuples, S3 will handle the BLOBs, EC2 will grind the queries, and the application developer won’t have to worry about it. Whether Amazon delivers it or someone else does, it’s coming – the reliable online scalable database.
(Some of the sharper analyses: O’Reilly compares the pricing of SimpleDB to S3, Marcelo smells a familiar data model, Inside Looking Out lays out some of the technical hurdles, rc3 wonders how to tune it, and the comment from daveadams sings a love song to the relational model.)