December 22, 2004

Using Google to fix your 404 errors ( Part II )

A few weeks ago I wrote a small hack that uses Google to handle 404 errors. You can find that article here: Using Google to handle website 404 errors

Unfortunately, even though it works, it's not optimal. Here are the drawbacks I noticed:

  • I was using meta redirects. Some bots didn't understand them very well.

  • A meta redirect isn't a real HTTP redirect, so search engines treat it at best like a 302 (temporary move) instead of a 301 (permanent).

  • Some bots and browsers were refreshing the same page in an endless loop for some reason.
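To make the difference concrete, here is a rough sketch of the two kinds of response a CGI script can send. The helper names meta_refresh_response and permanent_redirect_response are made up for this illustration and are not part of the handler; the only thing taken for granted is the standard CGI "Status" and "Location" headers.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the old approach. No Status or Location header is
# sent, so the redirect only happens if the client parses the HTML and
# chooses to honour the meta tag.
sub meta_refresh_response {
    my ($url) = @_;
    return "Content-type: text/html\n\n"
         . "<html><head><meta http-equiv=\"refresh\" content=\"0;url=$url\"></head></html>\n";
}

# Hypothetical helper: the new approach. The CGI "Status" header asks the
# web server to answer with a real 301, and "Location" names the new address.
sub permanent_redirect_response {
    my ($url) = @_;
    return "Status: 301 Moved Permanently\n"
         . "Location: $url\n"
         . "Content-type: text/html\n\n";
}

print permanent_redirect_response("http://www.royans.net/");
```

With the second form a bot never has to look at the page body to find out where the content went.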


So in frustration I wrote another piece of code. This time I'm using the Google Web API to fetch the results internally, instead of sending the user to the Google website for the best first hit. Here is the code I'm using. Please remember to put your Google key in the right place before you try it out yourself.
#!/usr/bin/perl
use strict;
use SOAP::Lite;
my $request=$ENV{REQUEST_URI};
my $httphost=$ENV{HTTP_HOST};
my @found=();
my $foundtext="";
my $lookfor=fix($request);
my $site="www.royans.net";

if ($httphost =~/security/i) {$site="security.royans.net";}
if ($httphost =~/desijokes/i) {$site="desijokes.royans.net";}

getnewurl($lookfor,$site);
## Send a real permanent redirect to the best hit
print "Status: 301 Moved Permanently\n";
print "Location: $found[0]\n";
print "Content-type: text/html\n\n";

print "$foundtext";

## This removes some characters to help google do a better search based on content rather than the file name
sub fix
{
my ($lookfor)=@_;
$lookfor=~s/\// /g;
$lookfor=~s/\./ /g;
$lookfor=~s/\?/ /g;
$lookfor=~s/-/ /g;
$lookfor=~s/_/ /g;
return $lookfor;
}

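sub getnewurl
{
my ($lookingfor,$site)=@_;

my $google_key='Your Google Key here';
my $google_wdsl = "/home2/rkt/www/cgi-bin/GoogleSearch.wsdl";
my $query = "$lookingfor site:$site";
my $google_search = SOAP::Lite->service("file:$google_wdsl");

## Arguments: key, query, start, maxResults, filter, restrict, safeSearch, lr, ie, oe
my $results = $google_search -> doGoogleSearch( $google_key, $query, 0, 10, "false", "", "false", "", "latin1", "latin1");

## Nothing found: answer with a plain 404 instead of exiting without any output
my @elements = @{ (ref($results) && $results->{resultElements}) || [] };
unless (@elements) {
print "Status: 404 Not Found\n";
print "Content-type: text/html\n\n";
print "Sorry, that page could not be found.\n";
exit;
}
foreach my $result (@elements) {
push @found, $result->{URL};
$foundtext="$foundtext
<p>
<b>$result->{title}</b><br>
<a href=\"$result->{URL}\"> $result->{URL}</a><br>
$result->{snippet}
</p>
";
}
}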
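If you want to see what the handler would actually ask Google before wiring it into your 404 page, here is a small standalone sketch that only builds the query string and never calls the API. The helpers site_for_host and build_query are names I made up for this illustration, and the request URI is invented; fix does the same clean-up as in the handler, just written as a single character class.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same clean-up as fix() in the handler: turn / . ? - _ into spaces
# so Google searches on words rather than on the file name.
sub fix {
    my ($lookfor) = @_;
    $lookfor =~ s{[/.?_-]}{ }g;
    return $lookfor;
}

# Hypothetical helper: pick the site to search from the Host header,
# mirroring the two "if ($httphost =~ ...)" lines in the handler.
sub site_for_host {
    my ($httphost) = @_;
    return "desijokes.royans.net" if $httphost =~ /desijokes/i;
    return "security.royans.net"  if $httphost =~ /security/i;
    return "www.royans.net";
}

# Hypothetical helper: build the query that would go to doGoogleSearch.
sub build_query {
    my ($request, $httphost) = @_;
    return fix($request) . " site:" . site_for_host($httphost);
}

# The request URI below is made up for illustration.
print build_query("/old-articles/perl_tips.html?page=2", "security.royans.net"), "\n";
# prints " old articles perl tips html page=2 site:security.royans.net"
```

The "site:" restriction is what keeps the redirect on your own domain; without it the first hit could be anywhere on the web.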
