November 30, 2006

Design to fail

Last night I went to an SDForum talk by two eBay architects, Randy Shoup and Dan Pritchett, on how they built, scaled and run their operation. The talk didn't have anything substantially different from what I've heard before, but was still impressive because they were applying the common thinking to an operation that runs over 15,000 servers at any given time. [ Slides ]
Here are a few interesting phrases I took away from the talk.

  • Scale out, not up: Scaling up is not only expensive; it also becomes impossible beyond a certain technical limit. Scaling out, however, is cheaper and more practical.

  • Design to fail: Every QA team I know does a whole battery of tests to make sure all components work as they should. Rarely have I seen a team that also tests whether the service stays up when certain parts of the application fail.

  • If you can't split it, you can't scale it: eBay realized early on that anything which cannot be split into smaller components cannot be scaled. A good example of such an operation is a "join" across multiple tables in a database. Relying on the database to do joins across a large set of tables means that you can never partition those tables into different databases. And if you can't split it, you can't scale it.

  • Virtualize components: If you can virtualize a component and create an abstraction layer to manage these virtual components, then the rest of the application need not worry about the actual server names, database names, table names, etc. The operations team can move components around to suit scalability needs.
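The "if you can't split it" point can be sketched in a few lines. This is a toy illustration with made-up data: instead of a SQL JOIN forcing users and orders to live in one database, each lives in its own partition and the application does the join itself, one lookup per partition, so each side can be scaled independently.

```javascript
// Hypothetical partitions (names and data invented for illustration).
var usersDb = { "1": { id: 1, name: "alice" } };   // partition A: users
var ordersDb = [                                    // partition B: orders
  { userId: 1, total: 30 },
  { userId: 1, total: 12 }
];

function ordersForUser(userId) {
  // Application-level "join": one lookup against each partition,
  // instead of a cross-database JOIN the databases can't scale.
  var user = usersDb[String(userId)];
  var orders = ordersDb.filter(function (o) { return o.userId === userId; });
  return { name: user.name, orderCount: orders.length };
}
```

The trade-off is an extra round trip per partition, but the tables no longer have to share a server.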

November 27, 2006

The Java+linux OS

This will be an interesting trend to follow. This linux+perl distribution is made up of just the linux kernel and perl binaries. The rest of the tools are all written as perl scripts. Miguel de Icaza, the creator of Mono, is looking for folks to do the same with Mono.

I think it's a great experiment and will help validate Mono as a practical alternative to other frameworks/languages on linux. But what would be even cooler (for me at least) is if someone could create a true object-oriented shell experience like Microsoft's PowerShell/Monad. And in case you didn't know, PowerShell/Monad is the new shell from Microsoft built on the .NET framework. It will probably replace cmd sometime in the future.
That being said, it doesn't really have to be Mono. Java is a perfect candidate for it as well. There was a Java project for a Java-based shell which I don't think is active anymore... maybe someone can revive it.

Can it be done?

November 23, 2006

JSON: Breaking the same-server-policy Ajax barrier

The same-origin policy prevents a document or script loaded from one origin from getting or setting properties of a document from a different origin (via XMLHttpRequest, for example). The policy dates back to Netscape Navigator 2.0. It is a very important security restriction which stops rogue third-party JavaScript from getting information out of your authenticated banking session.

Unfortunately, this also almost completely shuts down any possibility of data sharing between multiple servers. Note the use of the word "almost", because JSON is the new saviour of the web 2.0 world. JSON, or JavaScript Object Notation, is nothing but a simple data-interchange format which can easily be used by JavaScript applications. What's different here is that unlike XMLHttpRequest, which can receive answers in any format the JavaScript application wants, the JSON approach requires the answers to be in JSON format, which is basically a subset of the JavaScript programming language, or to be more specific, standard ECMA-262.

For those who are curious how this works and don't have time to read the complete documentation, the difference is that a JavaScript application can still load scripts from third-party websites. So if you are running an application on one site and have some data on another, you can load that data into your application as long as you masquerade that information as a JavaScript file.
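Here is a rough sketch of the trick, with all names and data made up. In a real page the browser would fetch the third-party script via a `<script src="...">` tag; below, `eval` stands in for the browser executing the returned script.

```javascript
// The callback the remote script is expected to call.
var received = null;
function handleData(data) {
  received = data;
}

// What the third-party server sends back: not raw JSON, but a JavaScript
// statement that calls our callback with the JSON payload as its argument.
var serverResponse = 'handleData({"user": "alice", "items": [1, 2, 3]})';

// The browser would execute the returned script; eval simulates that here.
eval(serverResponse);
// received now holds the cross-domain data as a live JavaScript object.
```

The key point: the data crosses domains because script loading is exempt from the same-origin policy, while XMLHttpRequest is not.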

That's it; there is no rocket science here... but it does feel like it when you first come across the trick. It surely did for me.

While you are at it, watch out for JSONP (JSON with Padding) too. Google is one company that I know has been using such mechanisms for a long time, and it recently came out with more vocal support for this new open data-interchange standard.
Oh, and before you go hacking your code, one thing to watch out for: avoid opening up private/privileged information through the JSON mechanism, because it is open to XSS (cross-site scripting) attacks.

Ajax/Web debugging with Firebug

I've been using Firefox for a long time, and have always had the Web Developer plugin by my side for those miserable days. It is a tool that can save your ass when you really need to understand what the heck your Ajax code is up to.

A couple of days ago I came across another such tool called Firebug. All I have to say is that I was completely blown away by its intuitive debugging style. Cleaning up my messy Ajax-generated code would have been a lot more painful if this tool hadn't been around.
Here is a quick feature list:

* JavaScript debugger for stepping through code one line at a time
* Status bar icon shows you when there is an error in a web page
* A console that shows errors from JavaScript and CSS
* Log messages from JavaScript in your web page to the console (bye bye "alert debugging")
* A JavaScript command line (no more "javascript:" in the URL bar)
* Spy on XMLHttpRequest traffic
* Inspect HTML source, computed style, events, layout and the DOM
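The "bye bye alert debugging" item is the one that changed my habits. A tiny sketch (variable names invented) of the before and after:

```javascript
// Before: interrupt the page with a modal popup for every value.
//   alert("user = " + user.name);

// After: log to the console, which Firebug displays inline.
var user = { name: "bob", loggedIn: true };
var summary = "user: " + user.name + " (loggedIn=" + user.loggedIn + ")";
console.log(summary);  // plain message in Firebug's console
console.log(user);     // logs the live object, expandable in the console
```

Logging the object itself (rather than a string built from it) is the real win: Firebug lets you drill into its properties after the fact.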


Thanksgiving updates

November 19, 2006

Faking a Virtual Machine

One of the more popular trends in recent years is the move of malicious-code analysts toward virtual machines for testing and reverse-engineering malicious code. And, surprisingly, virus/worm writers have been adding mechanisms to their code to detect such environments.

I came across this particular piece of software called Themida which does exactly that. Lenny Zeltser reports on this at SANS. What's interesting is that this kind of detection is now part of commercial packers around the world.
The question I have is this: how long will it take for someone to come up with a VMware/virtual-machine simulator/faker which I can run on my perfectly non-virtual desktop/laptop/server to make malware believe it is running inside a virtual machine?

If that can kill even a small percentage of fresh 0-day worms/viruses, it would be worth the effort. Wouldn't it?

November 18, 2006

The RAJAX framework (Reverse AJAX)

The use of XMLHttpRequest to talk to the server without refreshing the browser is one of the more common ways of differentiating an Ajax application from a more traditional approach. But while the rest of the world was learning Ajax, some smart developers figured out the next step and created something called "Reverse AJAX", or as I call it, "RAJAX".

Traditional client-server applications (not over the web) which used standard TCP/IP and UDP protocols didn't have to worry about firewalls, NATs and PATs. Such client-server applications had the ability to initiate connections either way (from client to server, or from server to client). The HTTP protocol, built on top of TCP/IP, was designed specifically for web browsing, where it is always the client asking for information and the server replying.

By moving traditional client-server applications to web applications, users did solve a lot of firewall/NAT/PAT issues, but gave up a lot in usability and speed. AJAX to some extent solves the problem by reducing the amount of communication happening between the client and the server, but it still doesn't openly allow something which servers could do in the old client-server model: initiate a connection back to the client.

RAJAX is a framework where multiple AJAX calls between the client and server bridge this gap and give both the server and the client the ability to ask and answer requests. An excellent example of a RAJAX application is a webified chat client. Google Talk, for example, doesn't just open a connection when the user types a message... it also keeps a connection open to the server to receive messages in case one of the user's contacts wants to initiate a chat. Another example, provided by one of the reference links below, is that of allowing multiple AJAX-based document sharers to modify the same document.
So, in short, the client always keeps an active HTTP request open to the server, and the server responds to that request only when it has a message for the client which the client didn't ask for.
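The keep-a-request-pending loop can be sketched in a few lines. This is a self-contained simulation with made-up names: a Promise-returning function stands in for the server holding the request open, and a real client would issue an XMLHttpRequest to a server endpoint instead.

```javascript
// Stand-in for the server: resolves only when a message is ready,
// as if the HTTP request had been held open until then.
function fakeServerPoll(n) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve("message " + n); }, 10);
  });
}

var received = [];

// The client loop: one request is always pending; as soon as the server
// answers with a pushed message, the client re-opens the "connection".
function pollLoop(n, done) {
  if (n > 3) { done(); return; }   // stop after 3 messages for the demo
  fakeServerPoll(n).then(function (msg) {
    received.push(msg);            // server-initiated message reaches the app
    pollLoop(n + 1, done);         // immediately issue the next long poll
  });
}

// Usage: pollLoop(1, function () { console.log(received); });
```

The essential property is that at any instant there is exactly one outstanding request, which is what gives the server a channel to "call back" the client through plain HTTP.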


November 15, 2006

Sitemaps now supported by Microsoft and Yahoo.

Google started it, but sitemaps have since been adopted by most of the large search organizations out there. If you own a website with a lot of static content, you should probably be looking into creating and updating a sitemap on a regular basis.

A sitemap is basically an XML file which describes the contents and change frequency of a site. If you ever had pages hidden deep inside your website which were not getting indexed before, sitemaps are an excellent way of advertising those pages to the search engines.
To quote the protocol description: "Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site. Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site."
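In its simplest form, a sitemap entry looks like this (the URL and dates below are placeholders, not a real site):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- a page buried deep in the site that crawlers might otherwise miss -->
    <loc>http://www.example.com/hidden/deep-page.html</loc>
    <lastmod>2006-11-15</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

Only `<loc>` is required; `<lastmod>`, `<changefreq>` and `<priority>` are the optional hints the search engines use to schedule their crawls.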

Powershell/Monad Version 1.0 is finally out

More than two years ago I wrote about a neat little Microsoft project called Monad which caught my eye. The project boasted of doing something I'd never seen anyone else do before: an object-oriented shell interface.

One of the examples I use to explain it: unlike the Unix flavor of "ps", which lets you choose which fields to list using optional command-line parameters, in Monad you can take the output of "ps" (aka get-process) and manipulate the objects it returns to print any format you want by inspecting the object. All Unix admins know how to use "cut", "grep" and "awk" for different reasons, but in a true Monad shell environment, where every command you type is a commandlet, you won't have to use the traditional string-based tools anymore.

What's interesting is that unlike in Unix and other shells, you can pipe the output of the ps command in Monad and throw it onto an XLS sheet with a pie chart attached. Neat!
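To make the object pipeline concrete, here is a small sketch. Treat the exact command lines as illustrative rather than canonical, but the cmdlets and the `Process` properties used are standard PowerShell:

```powershell
# get-process emits Process objects, not text, so you filter and select
# by property name instead of cutting columns out of strings
Get-Process | Where-Object { $_.Handles -gt 500 } | Select-Object Name, Id, Handles

# no cut/grep/awk: sort on a live property, then take the top five
Get-Process | Sort-Object CPU -Descending | Select-Object -First 5 Name, CPU
```

Because each stage of the pipe receives full objects, the last stage can just as easily hand them to an exporter (Excel, CSV, XML) as print them.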

Microsoft has finally released the official 1.0 version of this product (just in time for the Vista release), and it is now called PowerShell. Even though I installed it on my XP box, it supports other flavors of Windows as well. Watch this blog for more on PowerShell, as I'm surely going to use it.


November 14, 2006

Comprehensive security report on Mac

I knew that the Mac is one of the most secure operating systems around, but what surprised me is that someone took the trouble of writing a comprehensive 29-page PDF report about it. "The research report looks at significant OS X threats including local, remote and kernel vulnerabilities and discusses overall system design weaknesses that contribute to insecurities on the Mac platform. The document also reviews the current state of malicious code, discussing the presence of several viruses and worms and the existence of three known rootkits for OS X."

November 12, 2006

Microsoft will probably start selling/distributing linux soon

Anyone can tell you an interesting story, but when it comes to Microsoft and Novell's recent deal, Linux enthusiasts around the world have more than a couple up their sleeves.

Microsoft has a long history of killing competition. They started with Novell's server market, tried to do it with Java, and today they are trying to do it to the anti-virus vendors. They succeeded against Netscape, gained significant ground against Sony's PlayStation, and killed a thousand other products that I can't name because I forgot about them after Microsoft obliterated them from the market. If any of you are Xbox lovers, I don't have to tell you that in the war over consoles Microsoft has been losing money on every Xbox it sells. Zune (the competition to the iPod) is said to follow a similar strategy. In short, Microsoft has a huge bank balance and can pump in a lot of money until the competition goes bankrupt.

As a result of this announcement, it's no surprise that the Linux world is almost up in arms against Novell for giving in for a few pieces of silver. I, on the other hand, have a different perspective on it.

  • Microsoft isn't interested in suing anyone (anytime soon, at least) because of its Vista launch schedule and the tricky negotiations going on in Europe.

  • SCO has already tried the same FUD which Microsoft is accused of trying. In fact, if you remember, Microsoft "licensed" SCO Unix in a similar deal, which indirectly funded SCO's battle against IBM/Linux.

  • Most of the other visible products Microsoft has gone after till now have been in markets where Microsoft didn't really have a foothold. Linux is one of the very few products which started as a competitor to Microsoft and has gradually increased in popularity over the years. [Firefox/Mozilla is the other one I admire.]

  • The other interesting point to note is that unlike most other commercial vendors who got nailed by Microsoft's pump-and-dump strategy, Linux is not a commercial entity which can go bankrupt. They can kill Novell, but it will be very hard for them to kill the whole Linux movement.

My personal analysis is that Microsoft is afraid.

  • It's so afraid of losing this battle that in its moment of desperation it is ready to do anything short of launching a Microsoft-branded Linux distribution.

  • The financial deal Microsoft and Novell signed has a few hints of where this might be heading.

  • To begin with, it's clear both of them want to integrate each other's OS using each other's technology to provide a better virtualization experience.

  • It's also clear that though Novell might use significant portions of proprietary Microsoft technology (for example, for authentication, authorization and accounting), Microsoft will mostly be using GNU code to which Novell doesn't have any rights anyway.

  • So why is Microsoft paying Novell?

  • And what's the deal with $240 million in Linux license subscription costs? What is Microsoft going to do with that many copies of a Linux distribution?

  • Oh wait, they could embed it into your Microsoft operating system. Have you ever thought about which distribution of Linux you would use if the Microsoft OS copy you already own came with a Linux distribution pre-bundled?

  • Novell also mentions that it will pay Microsoft a minimum amount of licensing fees, which can increase depending on its own sales. So maybe it will sell Windows as well... who knows. But it will sell something with at least some Microsoft code in it.

  • Finally, based on my personal opinion (with no understanding of the financial details), it almost looks like Microsoft has bought a share of Novell and wants a piece of the action every year.

  • Maybe Microsoft is going to announce something even more significant which will dramatically increase Novell's sales. Maybe Novell is an investment after all... not just a pump-and-dump target.

My thought process finally took me to the one place I didn't want to go: the thought that Microsoft will soon bundle SUSE Linux with one of its own products.

Coming back to the discussion of whether we should abandon SUSE or not, I personally think it doesn't matter as long as Microsoft is not trying to kill it. Stop acting like a five-year-old kid who doesn't like the big guys. If anything, you should be excited about more commercial support behind your favourite OS. And if they really do bundle SUSE with every desktop/server OS, that's exactly what I wanted when I joined the revolution: Linux on every desktop...
I have said this before, and I'll continue to say it: I'm not opposed to a Microsoft Linux as long as others can innovate and keep Microsoft on its toes.

Offline Storage in Ajax applications ?

I've been out of the blogging world working on an Ajax application which has been eating into what little free time I have.

I'd mentioned Laszlo some time back and explained how it is jumping into the Ajax world from being a pure Flash-based application server. The Ajax application I was working on, however, started in pure Ajax before it got involved with Dojo. Dojo is not the only JavaScript library out there, but it certainly is one of the better ones. I played around with a few others, including Yahoo's JavaScript library, Google Web Toolkit and Sajax, before I chose Dojo. No server-side code was one of the reasons, but its popularity was the main reason.

When I started, Dojo had its 0.3 version out, which already had a lot of important features like the back-button fix and keyboard event handlers, which I use heavily in my application. As of today, 0.4 has been released, which has, among other things, APIs to draw 2D graphics. But what really surprised me today was reading that one of the most important things which wasn't possible to do with JavaScript is now not only possible but also supported by Dojo.

Interestingly, offline storage in browsers has always existed in the form of the web cache. I also know there are some Flash-based applications which can persist data on the client's desktop... but until I saw the Dojo.Storage documentation it never occurred to me that an Ajax-based application could so easily use this feature to do something which should have been there to begin with.

Dojo not only has APIs to programmatically recall that cache and browse its content, but also to interact with and modify it. Here are some references to this interesting concept: