The initial version of this tool used a MySQL database. The original application architecture was very simple, and apart from the database it could have scaled horizontally. Over the weekend I played a little with SimpleDB and was able to convert my code to use it in a matter of hours.
Here are some things I observed while experimenting:
- It's not a relational database.
- You can't do joins in the database. If a join is needed, it has to be done in the application, which can be very expensive.
- De-normalizing data is recommended.
- Schemaless: You can add new columns (which are actually just new row attributes) anytime you want.
- You have to create your own unique row identifiers. SimpleDB doesn't have a concept of auto-increment.
- All attributes are automatically indexed. I think in Google App Engine you had to specify which columns need indexing. I wonder whether indexing everything increases the cost of using SimpleDB.
- Data is automatically replicated across Amazon's huge SimpleDB cloud, but the only guarantee is "eventual consistency": data "put" into the system is not guaranteed to be visible to the very next "get".
- I couldn't find a GUI-based tool to browse my SimpleDB data the way some S3 browsers do. I'm sure someone will come up with something soon. [Updated: Jeff blogged about some SimpleDB tools here]
- SimpleDB imposes limits on the amount of data you can put in. See the table below.
| Parameter | Limit |
|---|---|
| domains | 100 active domains |
| size of domains | 10 GB |
| attributes per domain | 250,000,000 |
| attributes per item | 256 attributes |
| size per attribute | 1024 characters |
| items returned in a query response | 250 items |
| seconds a query may run | 5 seconds |
| attribute names per query predicate | 1 attribute name |
| comparisons per predicate | 10 operators |
| predicates per query expression | 10 predicates |
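
To make a few of these points concrete, here is a rough Python sketch of what ends up in the application once the database stops doing it for you: generating your own item identifiers, checking an item against two of the limits above before sending it, and joining two result sets in memory. It does not call SimpleDB at all. The function names (`new_item_name`, `check_item_limits`, `join_in_application`) and the dict-based item layout are hypothetical, chosen only for illustration, and using a UUID as the identifier is just one possible choice.

```python
import uuid
from collections import defaultdict

# Limits taken from the table above.
MAX_ATTRIBUTES_PER_ITEM = 256
MAX_ATTRIBUTE_VALUE_CHARS = 1024


def new_item_name():
    """Return a unique item name, since there is no auto-increment."""
    return uuid.uuid4().hex


def check_item_limits(attributes):
    """Return a list of limit violations for one item.

    `attributes` is assumed to be a dict of attribute name -> string value.
    """
    problems = []
    if len(attributes) > MAX_ATTRIBUTES_PER_ITEM:
        problems.append(
            f"{len(attributes)} attributes exceeds {MAX_ATTRIBUTES_PER_ITEM}"
        )
    for name, value in attributes.items():
        if len(value) > MAX_ATTRIBUTE_VALUE_CHARS:
            problems.append(
                f"value of {name!r} exceeds {MAX_ATTRIBUTE_VALUE_CHARS} characters"
            )
    return problems


def join_in_application(left_items, right_items, key):
    """Hash-join two lists of item dicts on `key`, entirely in memory."""
    # Index the right-hand side once so each left item is a dict lookup.
    index = defaultdict(list)
    for item in right_items:
        index[item[key]].append(item)

    joined = []
    for item in left_items:
        for match in index.get(item[key], []):
            # Values from the left item win if both sides share a name.
            joined.append({**match, **item})
    return joined
```

The join needs both result sets in memory, and with at most 250 items per query response that can mean many round trips. That is where the "very expensive" part comes from, and why de-normalizing the data up front is usually the better option.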