Experimenting with SimpleDB (Flagthis.com)

A few years ago I wrote a simple online bookmarking tool called Flagthis. It let you bookmark sites using a JavaScript bookmarklet in the browser's bookmarks toolbar. The problem it tries to solve is that most links people bookmark are never visited again unless they are checked out within the next few days. The tool helps the user ignore bookmarks that have not been used in the last 30 days.

The initial version of this tool used a MySQL database. The original application architecture was very simple and, other than the database, it could have scaled horizontally. Over the weekend I played a little with SimpleDB and was able to convert my code to use it in a matter of hours.


Here are some things I observed during my experimentation:

  1. It's not a relational database.
  2. You can't do joins in the database. If a join is needed, it has to be done in the application, which can be very expensive.
  3. De-normalizing data is recommended.
  4. Schemaless: you can add new columns (which are actually just new row attributes) any time you want.
  5. You have to create your own unique row identifiers. SimpleDB doesn't have a concept of auto-increment.
  6. All attributes are auto-indexed. I think in Google App Engine you had to specify which columns need indexing. I'm wondering if this would increase the cost of using SimpleDB.
  7. Data is automatically replicated across Amazon's huge SimpleDB cloud. However, the only guarantee is “eventual consistency”, which means data that is “put” into the system is not guaranteed to be visible to the very next “get”.
  8. I couldn't find a GUI-based tool to browse my SimpleDB data the way some S3 browsers do. I'm sure someone will come up with something soon. [Updated: Jeff blogged about some SimpleDB tools here]
  9. SimpleDB imposes limits on the amount of data you can put in. Look at the tables below.
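Points 5 and 7 above can be sketched in a few lines of Python. This is only an illustration, not the code Flagthis uses: `new_item_name` and `get_with_retry` are hypothetical helpers, and `fetch` stands in for whatever function your SimpleDB client uses to read an item. The sketch assumes a read returns an empty result while a fresh write has not yet become visible.

```python
import time
import uuid


def new_item_name(prefix="bookmark"):
    """Build a unique item name on the client, since there is no auto-increment."""
    return f"{prefix}-{uuid.uuid4().hex}"


def get_with_retry(fetch, item_name, attempts=5, delay=0.2):
    """Call fetch(item_name) until it returns attributes or attempts run out.

    `fetch` is a hypothetical callable standing in for a SimpleDB read. It is
    assumed to return a dict of attributes, or None / {} when the item is not
    visible yet.
    """
    for attempt in range(attempts):
        attrs = fetch(item_name)
        if attrs:
            return attrs
        # Back off a little longer after each miss before reading again.
        time.sleep(delay * (2 ** attempt))
    return None
```

Retrying only hides the delay. If the application needs the value right after writing it, keeping the freshly written attributes in memory is probably cheaper than reading them back.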

 

Attribute                 Maximum
domains                   100 active domains
size of domains           10 GB
attributes per domain     250,000,000
attributes per item       256 attributes
size per attribute        1024 characters
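Since the service rejects data that exceeds these limits, it can help to check an item in the application before sending it. The sketch below is a hypothetical helper, not part of any SimpleDB library. It assumes an item is represented as a dict mapping each attribute name to a list of string values, that every name/value pair counts toward the 256-attribute limit, and that the 1024 limit applies to both names and values.

```python
MAX_ATTRIBUTES_PER_ITEM = 256   # from the table above
MAX_ATTRIBUTE_SIZE = 1024       # characters, from the table above


def check_item(attributes):
    """Return a list of limit violations for one item; empty means it fits.

    `attributes` maps attribute name -> list of string values (an assumed
    in-memory representation, chosen for this example).
    """
    problems = []
    # Assumption: each name/value pair counts as one attribute.
    pair_count = sum(len(values) for values in attributes.values())
    if pair_count > MAX_ATTRIBUTES_PER_ITEM:
        problems.append(
            f"{pair_count} attributes exceeds {MAX_ATTRIBUTES_PER_ITEM}"
        )
    for name, values in attributes.items():
        if len(name) > MAX_ATTRIBUTE_SIZE:
            problems.append(f"attribute name too long: {name[:20]}...")
        for value in values:
            if len(value) > MAX_ATTRIBUTE_SIZE:
                problems.append(
                    f"value of {name!r} is {len(value)} characters"
                )
    return problems
```

For a bookmarking tool the value limit is the one most likely to bite, for example with very long URLs or free-form notes.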

 

Attribute                                Maximum
items returned in a query response       250 items
seconds a query may run                  5 seconds
attribute names per query predicate      1 attribute name
comparisons per predicate                10 operators
predicates per query expression          10 predicates
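Because a single response holds at most 250 items, reading a larger result set means asking for it page by page. Here is a minimal sketch of that loop. `run_query` is a hypothetical wrapper around your SimpleDB client, assumed to take a query expression plus a continuation token and to return a page of items together with the token for the next page (or None when there are no more).

```python
def fetch_all(run_query, expression, max_pages=100):
    """Collect every item for `expression` by following continuation tokens.

    `run_query(expression, token)` is a hypothetical callable that returns
    (items, next_token); next_token is None on the last page.
    `max_pages` is a safety cap so a misbehaving backend cannot loop forever.
    """
    items = []
    token = None
    for _ in range(max_pages):
        page, token = run_query(expression, token)
        items.extend(page)
        if token is None:
            break
    return items
```

Each page is a separate request, so a user with thousands of bookmarks costs several round trips. That is one more reason to keep queries narrow.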

Other related discussions (do check out CouchDB)
