Showing posts from June, 2009

Monitoring Cloud health

Both Amazon and Google (and probably others as well) provide web pages that monitor their service status. The one I go to when I need to compare availability or detect service problems is Cloudstatus by Hyperic. They try to monitor most of the individual services provided by Google (Engine, Datastore, Memcache, Fetch) and Amazon (EC2, S3, SQS, SDB, FPS). On top of online graphs, you can also subscribe to Twitter status updates, which can be really helpful during a real outage.

BSET SearchEngine relevance test results

A few days ago I started a tool called BSET – Blackbox Search engine Testing tool to evaluate how good Bing really is. If you watch the stats on the page, it’s clear which search engine is being consistently picked as the winner. The results were collected from 518 unique source IP addresses (some were just NATs from larger organizations). 251 users executed just 1 query each, 111 users executed 2 queries, and the rest executed more than that. A total of 808 results were submitted just for the “standard web search” category, and 44% of those submissions were in favor of Google. 32% of them were for Yahoo. Only about 28% of the results went for Microsoft’s new search engine “Bing”. Between Google and Yahoo, a user is 15% more likely to pick Google than Yahoo. Between Google and Bing, a user will pick Google 21% more frequently than Bing. The results may not be staggering for folks who have been following search engine trends over the last few weeks, but for me, to see the result

Velocity 2009 : Conference presentation slides

If you are like me, and not attending Velocity 2009, you should track this page for the presentation slides from this year's conference.
2 Years Later, Loving and Hating the Cloud
Death of a Web Server: Crisis in Caching
Fixing Twitter: Improving the Performance and Scalability of the World's Most Popular Micro-blogging Site
Hadoop Operations: Managing Big Data Clusters
Introduction to Managed Infrastructure with Puppet
Metrics that Matter - Approaches To Managing High Performing Websites
Scalable Internet Architectures
Surviving the 2008 Elections
The User and Business Impact of Server Delays, Additional Bytes, and HTTP Chunking in Web Search
Writing Efficient JavaScript

Building BlackboxSET on GAE/java

Last week I spent a few hours building a search engine testing tool called “BlackboxSET”. The purpose of the tool was to allow users to see search results from three different search providers and vote for the best set of results without knowing the source of the results. The hope was that the search engine which presents the best set of results at the top of the page would stand out. What we found was interesting. Though Google’s search scores aren’t significantly better than Yahoo’s or Bing’s, it is the current leader on BlackboxSET. But this post is about what it took me to build BlackboxSET on GAE, which as you can see is a relatively simple application. The entire app was built in a few hours of late night hacking, and I decided to use Google’s AppEngine infrastructure to learn a little more about GAE.
Primary goals
Ability to randomly show results from the three search engines
Persist data collected after the user votes
Report the results using a simple p
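The three goals above (random presentation, recording votes, reporting) can be illustrated with a small sketch. This is not the BlackboxSET code, which ran on GAE/Java; it is a minimal Python illustration, and the function names, the in-memory `Counter` standing in for the datastore, and the percentage report are all assumptions made for the example.

```python
import random
from collections import Counter

# Provider names taken from the post; everything else here is hypothetical.
ENGINES = ("google", "yahoo", "bing")


def blind_order(rng):
    """Return the engines in random order; position i is shown as column i."""
    order = list(ENGINES)
    rng.shuffle(order)
    return order


def record_vote(tally, order, chosen_column):
    """Map the column the user picked back to its engine and count the vote."""
    engine = order[chosen_column]
    tally[engine] += 1  # a real app would persist this instead
    return engine


def report(tally):
    """Return each engine's share of all votes as a percentage."""
    total = sum(tally.values())
    if total == 0:
        return {engine: 0.0 for engine in ENGINES}
    return {engine: 100.0 * tally[engine] / total for engine in ENGINES}
```

The user only ever sees column positions, so the mapping from column to engine has to be kept on the server side until the vote is recorded; otherwise the test stops being blind.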

BlackboxSET – Blackbox Search Engine Testing

The launch of Bing has shaken the Google Kingdom a little bit. I for one have been doubting my own support for Google’s search engine. And I know others who swear by Yahoo’s search engine, which is a trust I don’t share. To make such testing easier, I spent a few hours last night creating a tool which lets you run a search against the 3 top search engines and decide which one is the best. At the end of the exercise you should be able to find out if you are doing the right thing by sticking with your personal search engine. May the best search engine win.

Steps to migrate your webapp to AWS

Most web applications need at least the following services to be self-sufficient: computational power, storage, webserver/CDN, database, messaging, load balancer and monitoring. Here are the tried and tested steps as recommended by AWS folks:
Move static web content to S3 storage first. Images, CSS stylesheets, JavaScript files, HTML, etc. can all be moved to S3. It’s easier to move some static content than others, so there might be some work required to understand how to break up web content to move parts of it into the cloud. The content on S3 can be served by Amazon CloudFront, which is Amazon’s CDN (content delivery network) service. Once your data is on S3 and served through CloudFront, your users will get those objects from the edge location closest to them.
Move applications and webserver layer to the EC2 infrastructure. This step will require you to figure out how to automate deployments into cloud infrastructure.
Once your apps are in the cloud, you can start working on
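The first step above, separating static content so it can be moved to S3, can be sketched roughly as follows. This is only an illustration using the Python standard library: the extension list, the `static` key prefix and the function name `plan_static_upload` are assumptions, and the actual upload to S3 is deliberately left out.

```python
import mimetypes
from pathlib import Path, PurePosixPath

# Assumed set of extensions treated as static content; adjust per application.
STATIC_SUFFIXES = {".png", ".jpg", ".gif", ".css", ".js", ".html"}


def plan_static_upload(root, prefix="static"):
    """List (local path, object key, content type) for static files under root."""
    root = Path(root)
    plan = []
    for path in sorted(root.rglob("*")):
        # Skip directories and anything that is not a known static file type.
        if not path.is_file() or path.suffix.lower() not in STATIC_SUFFIXES:
            continue
        relative = path.relative_to(root).as_posix()
        key = str(PurePosixPath(prefix) / relative)
        guessed = mimetypes.guess_type(path.name)[0]
        plan.append((path, key, guessed or "application/octet-stream"))
    return plan
```

Building the plan first, before uploading anything, makes it easier to review which parts of the site would move and to confirm that each object would be stored with a sensible content type.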

Opera Unite: web server built in?

There seems to be a lot of talk about the “Opera Unite” launch, and everyone is so pumped up about the new feature, “webserver built into the web browser”. This is just like Twitter. I think it might be a great idea for a few, but for the masses it might turn out to be just overblown hype. Most of us who have used a recent OS have sharing features, and we have always been on the lookout for better firewalls to block them. Now here comes a browser which wants to do the same thing, and for some reason doesn’t expect firewalls to impact it? Have all the security concerns gone away all of a sudden? While the world is switching to a lighter OS and browser, Opera is trying to build a kitchen sink. That being said, I think it’s a bold step on Opera’s part, and I have to give credit for its “unique” idea, regardless of how useful I think it’s going to be.

Working with Google App Engine’s datastore

I heard a great set of Google App Engine datastore related talks at the Google I/O conference. I think this is one of the best talks I heard, and it is now on YouTube. You should watch it if you are working with or planning to work with Google App Engine in the near future. Click on this link if you can’t see the embedded video.