How to extract the biggest text block from an HTML page?
One of the interesting problems in handling HTML content is automatically detecting the biggest HTML block in the center of the page. This can be very useful for on-the-fly content analysis done in the browser. Here is an example of how it could be done by parsing the DOM after the page is rendered.

// Royans K Tharakan (2010 June)
// http://www.royans.net/
// You are free to use this in any form as long as you give credit where it's due.
// Would appreciate it if you submit your changes/improvements back to me or to some other public forum.
// Requires jQuery

var largestId = 0;
var largestDiv = null;
var largestSize = -1;

function getLargestDiv() {
    // Walk the DOM starting at <body>.
    var size = getSize(document.getElementsByTagName("body")[0], 0);
    // Special case: on Wikipedia, return the #bodyContent container directly.
    if (window.location.href.indexOf("wikipedia.org") > 0) {
        return "#bodyContent";
    }
    // Attribute selector for the element tagged with the largest id.
    return "[d_id='tmp_" + largestId + "']";
}

function getSize(currentElement, depth) {