Tag: system

PERL  MacOS X 10.6.x, FINK, Macport, DBD::mysql and PERL…

Recently, I updated my laptop to snow leopard. After installing a bunch of softwares, I made some test with PERL… And then, with a PERL program working with a SQL database :

dyld: lazy symbol binding failed: Symbol not found: _mysql_init
  Referenced from: /Library/Perl/5.10.0/darwin-thread-multi-2level/auto/DBD/mysql/mysql.bundle
  Expected in: flat namespace

dyld: Symbol not found: _mysql_init
  Referenced from: /Library/Perl/5.10.0/darwin-thread-multi-2level/auto/DBD/mysql/mysql.bundle
  Expected in: flat namespace

Trace/BPT trap

Argl ! According to this perlmonk discussion, there’s a problem with FINK, snow leopard and some dynamic libraries… Great ! So, I immediately removed fink from my computer and installed macport. Actually, I didn’t want to install macport, since this package manager always installed a lot of dependancies, without checking if these dependancies are already available in the system (i.e. PERL…). But, as FINK crash my PERL-SQL system (and as others package managers don’t have so many packages than macport), I had finally installed macport on my laptop. And, again, macport had downloaded and installed another PERL distribution on my computer (thanks, you’re welcome)… So, by default, your perl and cpan programs will be /opt/local/bin…

How to use the apple PERL distribution ? I simply change the following line in my .profile:

export PATH=/opt/local/bin:/opt/local/sbin:$PATH

by

export PATH=$PATH:/opt/local/bin:/opt/local/sbin

Then, my /usr/bin PATH is on top (where the apple perl program is) and the ‘which perl‘ command gives me /usr/bin/perl. Works for me…

Unix  How to mirror a FTP site?

For research purposes, I want to mirror the GEO directory of the NCBI FTP site. These data are huge: more than 1,000 annotation platforms files and more than 20,000 GSE files! Platforms files are updated each month and GSE files on a daily basis. As you may imagine, I quickly forgot the idea of developping some PERL programs to achieve this task.

I first try to use a software dedicated to the mirroring of scientific databases: BioMaj. With the annotation platforms files, BioMaj works fine: files were quickly downloaded and inserted in SQL databases with the management of post-process scripts included in BioMaj. But, with the GSE files, I think I reached the limit of BioMaj, because of the structure of the GSE datafiles. Indeed, the « seriesmatrix » directory is organized in many subdirectories (almost 20,000 in which there are at least one file) and BioMaj spent more than 6 hours only to list the directory content!!! Then, the downloading and the management of the files (as BioMaj handles files versioning) were quite long, too! More problematic is that BioMaj freezed / halted / stopped during the update process. After a while, I finally try another way to make a local mirror: using classic unix tools, as… lftp, which has a mirroring and a multi-threading features!

Using lftp is quite simple, I found some usefull shell scripts here, which I’ve modified:

#!/bin/bash
HOST="ftp.ncbi.nih.gov"
USER="anonymous"
PASS="anonymous"
LOG_dir="/data/GEO/logs"
LCD="/data/GEO/platforms"
RCD="/pub/geo/DATA/annotation/platforms"
DATE=`date '+%Y-%m-%d_%H%M'`
lftp -c "set ftp:list-options -a;
open ftp://$USER:$PASS@$HOST;
lcd $LCD;
cd $RCD;
mirror --verbose --exclude-glob old/ --parallel=10 --log="$LOG_dir/GPL_$DATE" "

This script works with the GSE datafiles and it’s relatively fast (about 45 min to get the directory listing and less than one night (!) to download the files.

Unix  XML::SAX conflict between Debian & CPAN

If you have made a previous install of XML::SAX via CPAN and then you try to install a debian package that required this module, you will probably get this error:

Fatal: cannot call XML::SAX->save_parsers_debian().
You probably have a locally installed XML::SAX module.
See /usr/share/doc/libxml-sax-perl/README.Debian.gz

for further details on this error, you can read this bug report page. The soluce is to remove the XML::SAX modules from CPAN. It can be performed by using this script:

#!/usr/bin/perl -w

use ExtUtils::Packlist;
use ExtUtils::Installed;

$ARGV[0] or die "Usage: $0 Module::Name\n";
my $mod = $ARGV[0];
my $inst = ExtUtils::Installed->new();

foreach my $item (sort($inst->files($mod)))
{
	print "removing $item\n";
	unlink $item;
}

my $packfile = $inst->packlist($mod)->packlist_file();
print "removing $packfile\n";
unlink $packfile;

Then, type the following commands (root):

chmod u+x rm_perl_mod.pl
./rm_perl_mod.pl XML::SAX
apt-get upgrade

It works for me…

Unix  The beauty of screen

How to launch analyses in background on a *nix system ? You can use the « & » at the end of your command, but the screen output of your analyses will be messy and you cannot close your session. You also can use the batch command but you will lose the screen output and there will be no way to interact with your software. To my mind, the best alternative is the use of the screen command, which is detailled in this article.

To use your favorite cpu-intensive software, use the following command:

screen

Then, you will start with a new fresh shell session and in this session, you can launch your analysis. When it’s done, type « Ctrl-A Ctrl-D » to detached your screen: you will go back to your old shell session that launched the screen session. Now, you can log out.

After a while, to reconnect to your « screen » session, use the command:

screen -list
There are screens on:
3199.pts-1.server (Detached)
3248.pts-1.server (Detached)
2 Sockets in /var/run/screen/S-user

So, to attach to the session on « 3199.pts-1.server », just type:

screen -r 3199.pts-1.server

and that’s it! When your job is done, you can leave your screen session by typing « Ctrl-D »… Don’t forget to read the man page for further details on this « magic » command.