February 28, 2009

Experimenting with SimpleDB (Flagthis.com)

A few years ago I wrote a simple online bookmarking tool called Flagthis. The tool lets you bookmark sites using a JavaScript bookmarklet from the bookmark tab. The problem it tries to solve is that most links people bookmark are never used again unless they are checked out within the next few days. The tool helps the user ignore bookmarks that have not been used in the last 30 days.

The initial version of this tool used a MySQL database. The original application architecture was very simple, and apart from the database it could have scaled horizontally. Over the weekend I played a little with SimpleDB and was able to convert my code to use it in a matter of hours.


Here are some things I observed during my experimentation:

  1. It’s not a relational database.
  2. You can’t do joins in the database. If a join is needed, it has to be done in the application, which can be very expensive.
  3. De-normalizing data is recommended.
  4. Schemaless: You can add new columns (which are actually just new row attributes) anytime you want.
  5. You have to create your own unique row identifiers. SimpleDB doesn’t have a concept of auto-increment.
  6. All attributes are auto-indexed. I think in Google App Engine you have to specify which columns need indexing. I wonder whether this increases the cost of using SimpleDB.
  7. Data is automatically replicated across Amazon’s huge SimpleDB cloud, but they only guarantee something called “eventual consistency”. This means that data which is “put” into the system is not guaranteed to be available in the next “get”.
  8. I couldn’t find a GUI-based tool to browse my SimpleDB data the way some S3 browsers do. I’m sure someone will come up with something soon. [Updated: Jeff blogged about some SimpleDB tools here]
  9. SimpleDB imposes limits on how much data you can store and on how queries may be written. Look at the tables below.
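
To make points 3, 5 and 7 a bit more concrete, here is a minimal Python sketch. It is not the actual Flagthis code and it does not call any real SimpleDB client: the attribute names, `make_bookmark_item`, and the `fetch` callable passed to `get_with_retry` are all hypothetical stand-ins for whatever library you use.

```python
import time
import uuid


def new_item_name() -> str:
    """Return a client-generated unique item name.

    SimpleDB has no auto-increment, so the application picks the key.
    A random UUID is one common choice (an assumption, not a requirement).
    """
    return uuid.uuid4().hex


def make_bookmark_item(user_email: str, url: str, title: str) -> dict:
    """Build a denormalized bookmark item.

    The user's e-mail is copied onto every bookmark so that listing
    bookmarks never needs a join. Attribute names are hypothetical.
    """
    return {
        "user_email": user_email,
        "url": url,
        "title": title,
        "created": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }


def get_with_retry(fetch, item_name, attempts=5, delay=0.1):
    """Read an item, retrying while it is not yet visible.

    `fetch` is a hypothetical callable wrapping the real "get" call.
    Under eventual consistency a read right after a write may return
    nothing, so we back off and try again a few times.
    """
    for attempt in range(attempts):
        item = fetch(item_name)
        if item:
            return item
        time.sleep(delay * (2 ** attempt))  # exponential backoff
    return None
```

Retrying only papers over the delay; if the application can tolerate a bookmark showing up a moment late, it may be simpler not to read back at all.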


Attribute                  Maximum
domains                    100 active domains
size of each domain        10 GB
attributes per domain      250,000,000
attributes per item        256 attributes
size per attribute         1024 characters
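
Since SimpleDB rejects data that breaks these limits, it can help to check an item in the application before sending it. The sketch below is only an illustration using the two per-item numbers from the table above; `validate_item` is a hypothetical helper, and treating each value of a multi-valued attribute as its own attribute is my assumption.

```python
MAX_ATTRIBUTES_PER_ITEM = 256   # from the table above
MAX_ATTRIBUTE_SIZE = 1024       # characters, from the table above


def validate_item(attributes: dict) -> list:
    """Return a list of human-readable problems; empty means the item looks OK.

    `attributes` maps an attribute name to a single value or to a
    list/tuple of values (assumption: each value counts separately).
    """
    pairs = []
    for name, value in attributes.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for single in values:
            pairs.append((name, str(single)))

    problems = []
    if len(pairs) > MAX_ATTRIBUTES_PER_ITEM:
        problems.append(
            "too many attributes: %d > %d" % (len(pairs), MAX_ATTRIBUTES_PER_ITEM)
        )
    for name, text in pairs:
        if len(text) > MAX_ATTRIBUTE_SIZE:
            problems.append(
                "attribute %r is too long: %d > %d characters"
                % (name, len(text), MAX_ATTRIBUTE_SIZE)
            )
    return problems
```

For a bookmarking tool the size limit is the one most likely to bite, since long URLs or page descriptions can pass 1024 characters.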


Attribute                              Maximum
items returned in a query response     250 items
seconds a query may run                5 seconds
attribute names per query predicate    1 attribute name
comparisons per predicate              10 operators
predicates per query expression        10 predicates
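
Because a single query response holds at most 250 items, listing all of a user's bookmarks means fetching page after page. Here is a rough Python sketch of that loop. `run_query` is a hypothetical callable, not a real library function: I am assuming it takes the continuation token from the previous page (or None for the first page) and returns the items plus the next token.

```python
def iter_all_items(run_query):
    """Yield every item of a query, following continuation tokens.

    `run_query(next_token)` is a hypothetical wrapper around the real
    query call; it must return a tuple (items, next_token), where
    next_token is falsy once the last page has been returned.
    """
    next_token = None
    while True:
        items, next_token = run_query(next_token)
        for item in items:
            yield item
        if not next_token:
            break
```

Writing it as a generator means the caller can stop early, for example after the first screenful of bookmarks, without paying for the remaining pages.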

Other related discussions (do check out CouchDB)

Has Techmeme run out of news?

A lot of us go to Techmeme for our hourly fix. But for the last few hours things haven’t been quite the same. Come to think of it, the quality of news on Techmeme could be an indicator of what’s yet to come for the tech industry.

The first couple of news items have nothing to do with technology in general, and the third item is already a few days old. The three items after that are the same old news in different wrapping.


Either the weekend is getting to me, or this is the lull before the storm.