Biomart and error 500…

One great thing with Biomart is that you can export your query to PERL code. You set up your query once a time and then, you can launch your PERL script to update your databases, for example (after installing the Biomart API which it’s not an easy part, especially with the registry files). It sounds great, but, in practice, it doesn’t work at all very well for big queries: Bandwitch is very very very slow, and most of the time, we will get an error 500 read timeout… So, you re-launch your script, and again, and again… After a while, you will get upset about Biomart, trust me!

So, I searched in the biomart help, and I found the MartService. As the help is « missing », I tried the example in the website. To my mind, it is not particularly clear (what the POST example means ?) or working (the wget command returned me an error). So, I tried different things. I first picked up the XML file (XML button on top right). By the way, the « Unique Results only » option didn’t work in the XML file: no matter the option was selected or not, the XML file still had the following option: uniqueRows = ’0′ (don’t forget to change to ’1′, or your files will be very very very big). After hanging around the website for a while, I had copy/paste the content of the file WebExample.pl :

# an example script demonstrating the use of BioMart webservice
use strict;
use LWP::UserAgent;

open (FH,$ARGV[0]) || die ("\nUsage: perl webExample.pl Query.xml\n\n");

my $xml;
while (<FH>){
    $xml .= $_;
}
close(FH);

my $path="http://www.biomart.org/biomart/martservice?";
my $request = HTTP::Request->new("POST",$path,HTTP::Headers->new(),'query='.$xml."\n");
my $ua = LWP::UserAgent->new;

my $response;

$ua->request($request,
	     sub{
		 my($data, $response) = @_;
		 if ($response->is_success) {
		     print "$data";
		 }
		 else {
		     warn ("Problems with the web server: ".$response->status_line);
		 }
	     },1000);

Then, I used the following command (as indicated in the example):

perl WebExample.pl myfile.xml

… And… It worked: data were flowing in my terminal! wünderbar! Finally, don’t mess with the installation of Biomart API and configurations of registry files (not tricky at all) if you just want to automatically update your data: use the XML approach with the LWP script. It’s easier, faster and you won’t get error 500 read timeout.

Submit a Comment

Spam Protection by WP-SpamFree