Showing posts from February 17, 2010

Scalability updates for Feb 18, 2010

Some interesting links for today:
- A very good post about the need for an event-driven cloud API model for monitoring. I think it's a matter of time before this happens. Just as feed crawlers are embracing event-driven publication notification using protocols like Pubsubhubbub, we need something similar to SNMP traps for the cloud notification world.
- Translate SQL to MongoDB MapReduce
- Real-time web for web developers: An example of how the problem of polling a huge number of websites for updates was transformed by simply using an event-driven push model.
- Logging: unsexy, important and now usable
- Comparing Pig Latin and SQL for Constructing Data processing pipelines
- Cassandra backend for Lucene? This seems to solve the problem of building a reverse index on Cassandra, which I previously blogged about.
- Cloud MR: A Map/Reduce framework over Amazon's S3/SQS/EC2 services.
- Interesting NoSQL Categorization
- Writing twitter service on App engine
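To make the polling-versus-push point concrete, here is a minimal in-process sketch. It is not the Pubsubhubbub protocol itself (which delivers updates over HTTP callbacks); the `Hub` class, `poll_all` function, and the example.org feed URLs are all hypothetical names made up for illustration. It only compares how much work each model does when one feed out of a thousand gets a new entry.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Entry = Tuple[str, str]  # (feed URL, entry text)


def poll_all(feeds: Dict[str, List[str]], seen: Dict[str, int]) -> Tuple[int, List[Entry]]:
    """Polling model: check every feed each cycle, changed or not."""
    requests = 0
    new_entries: List[Entry] = []
    for url, entries in feeds.items():
        requests += 1  # one simulated fetch per feed
        start = seen.get(url, 0)
        new_entries.extend((url, entry) for entry in entries[start:])
        seen[url] = len(entries)
    return requests, new_entries


class Hub:
    """Toy hub (hypothetical): subscribers register a callback per topic."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str, str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[str, str], None]) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, entry: str) -> int:
        """Push model: notify only the subscribers of the topic that changed."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(topic, entry)
        return len(callbacks)  # number of deliveries made


# 1000 simulated feeds, of which exactly one has a new entry.
feeds: Dict[str, List[str]] = {f"http://example.org/feed/{i}": [] for i in range(1000)}
changed = "http://example.org/feed/7"
feeds[changed].append("new post")

poll_requests, polled = poll_all(feeds, seen={})

hub = Hub()
received: List[Entry] = []
hub.subscribe(changed, lambda topic, entry: received.append((topic, entry)))
push_deliveries = hub.publish(changed, "new post")

print(f"polling: {poll_requests} fetches, push: {push_deliveries} delivery")
```

Both approaches surface the same entry, but the polling loop touches every feed on every cycle, while the push model does work only when a publisher reports a change.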

More on Amazon S3 versioning (webinar)

If you missed the AWS S3 versioning webcast, I have a copy of the video here. Here are the highlights:
- You can enable and disable versioning at the bucket level.
- They don't think there is a performance penalty for turning versioning on (though it seemed kind of obvious that S3 would be doing slightly more work to figure out which is the latest version of any object you have).
- There isn't any additional cost for using versioning itself, but you do have to pay for the extra copy of each object.
- MFA (multi-factor authentication) to delete objects is not mandatory when versioning is turned on; it needs to be turned on explicitly. This was slightly confusing in the original email I got from AWS.
- If you are planning to use this, please watch the video. There is a part where they explain what happens if you disable versioning after using the feature. This is something you might like to know about.
- They use a GUID for versioning of each object.
- You can iterate over objects and figure out how many ver