Tag: benchmark

PERL  Don’t dereference if you don’t need to…

PERL syntax can sometimes be… obscure, especially when you are dealing with a hash of references (pointers) to arrays… Imagine you are handling the tree structure of a family, for instance. You will have a hash whose keys are the parents, and under each key you store a reference to an array that holds that parent’s children. Your data file then has this structure:

parent1 child1
parent1 child2
child2 child3

Then, while reading the file content, you will use the following code:

my @T = split(/\t/, $line);
my $parent = $T[0];
my $child = $T[1];
push (@{$Family{$parent}}, $child);
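
To make the resulting structure concrete, here is a minimal, self-contained sketch of the loading step. The @lines array is a hypothetical stand-in for the file content (the original reads a real file), and the chomp call is an assumption about how the newline is removed before splitting.

```perl
use strict;
use warnings;

my %Family;   # parent => reference to the array of its children

# Hypothetical stand-in for the lines of the tab-separated data file
my @lines = ("parent1\tchild1\n", "parent1\tchild2\n", "child2\tchild3\n");

foreach my $line (@lines) {
    chomp $line;                          # drop the newline so it does not stick to the child name
    my ($parent, $child) = split(/\t/, $line);
    push (@{$Family{$parent}}, $child);   # the array is created on first use
}

# Print each parent followed by its direct children
foreach my $parent (sort keys %Family) {
    print "$parent: @{$Family{$parent}}\n";
}
# prints:
# child2: child3
# parent1: child1 child2
```

Each key ends up pointing at one array, so a parent listed on several lines simply gets several entries in the same array.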

To find all the descendants of one parent (children, grandchildren and so on), you can then use a recursive function that receives the pedigree hash as a reference:

sub get_children {
	my ($root_parent, $parent, $_Family, $_List) = @_;
	# The pedigree is only reached through the reference: nothing is copied
	foreach my $child (@{$$_Family{$parent}}) {
		push (@{$$_List{$root_parent}}, $child);
		get_children ($root_parent, $child, $_Family, $_List);
	}
}

But don’t write this code (even if the syntax looks simpler):

sub get_children {
	my ($root_parent, $parent, $_Family, $_List) = @_;
	# Costly: the whole pedigree hash is copied at every call
	my %Temp =  %$_Family;
	foreach my $child (@{$Temp{$parent}}) {
		push (@{$$_List{$root_parent}}, $child);
		get_children ($root_parent, $child, $_Family, $_List);
	}
}

In this last version, the line my %Temp = %$_Family dereferences the pointer and copies its content into another hash. With a small pedigree, this won’t affect the performance of the script. But if you are handling (very) large families, performance will drop, since at each recursive step the whole pedigree hash is copied into a new hash (meaning memory transfers)…
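
The difference between going through the reference and copying the hash can be checked on a toy example. The sketch below uses a hypothetical three-individual pedigree; it also shows the arrow notation, which is an equivalent and arguably more readable spelling of the same access.

```perl
use strict;
use warnings;

# Hypothetical toy pedigree, just for illustration
my %Family  = (parent1 => ['child1', 'child2'], child2 => ['child3']);
my $_Family = \%Family;

# Two spellings of the same access through the reference;
# neither of them duplicates the pedigree hash.
my @a = @{$$_Family{parent1}};
my @b = @{ $_Family->{parent1} };

# This, in contrast, builds a second hash with one entry per parent.
my %Temp = %$_Family;
$Temp{parent2} = ['child4'];   # only the copy receives the new key

print exists $Family{parent2} ? "shared\n" : "independent copy\n";
# prints "independent copy"

# The copy is shallow: its values are still the same array references.
print $Temp{parent1} == $Family{parent1} ? "same inner array\n" : "different inner array\n";
# prints "same inner array"
```

So the copy does not duplicate the children arrays, but it still has to re-insert every key of the pedigree, and it does so once per recursive call.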

Let’s benchmark the two versions on a large pedigree of 33000 individuals (stored in %Family), reconstructing the list of all the descendants of a parent with an offspring of 600 individuals (the kind of families we have to manage in animal genetics):

Pointer version

Time taken was  0 wallclock secs ( 0.00 usr  0.00 sys +  0.00 cusr  0.00 csys =  0.00 CPU) seconds

Dereference version

Time taken was 27 wallclock secs (26.08 usr  0.09 sys +  0.00 cusr  0.00 csys = 26.17 CPU) seconds
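
A comparison of this kind can be reproduced with the core Benchmark module. The following is only a sketch under stated assumptions: the pedigree is synthetic and much smaller than the real one (about 5,000 unrelated families plus one family of 200 descendants), the function names get_children_ref and get_children_copy are hypothetical, and the timings will depend on the machine.

```perl
use strict;
use warnings;
use Benchmark;   # exports timediff and timestr by default

# Reference version: the pedigree is only reached through $_Family
sub get_children_ref {
    my ($root_parent, $parent, $_Family, $_List) = @_;
    foreach my $child (@{$$_Family{$parent}}) {
        push (@{$$_List{$root_parent}}, $child);
        get_children_ref ($root_parent, $child, $_Family, $_List);
    }
}

# Copy version: every call duplicates the whole pedigree hash first
sub get_children_copy {
    my ($root_parent, $parent, $_Family, $_List) = @_;
    my %Temp = %$_Family;
    foreach my $child (@{$Temp{$parent}}) {
        push (@{$$_List{$root_parent}}, $child);
        get_children_copy ($root_parent, $child, $_Family, $_List);
    }
}

# Synthetic pedigree (sizes are assumptions, not the real data)
my %Family;
for my $i (1 .. 200) {                  # ind0 gets 200 descendants, two children per parent
    my $p = int(($i - 1) / 2);
    push (@{$Family{"ind$p"}}, "ind$i");
}
for my $i (1 .. 5000) {                 # unrelated families that only enlarge the hash
    push (@{$Family{"other$i"}}, "leaf$i");
}

my (%List_ref, %List_copy);

my $t0 = Benchmark->new;
get_children_ref ('ind0', 'ind0', \%Family, \%List_ref);
my $t1 = Benchmark->new;
print "Pointer version\nTime taken was ", timestr(timediff($t1, $t0)), " seconds\n";

$t0 = Benchmark->new;
get_children_copy ('ind0', 'ind0', \%Family, \%List_copy);
$t1 = Benchmark->new;
print "Dereference version\nTime taken was ", timestr(timediff($t1, $t0)), " seconds\n";

print "Descendants found: ", scalar @{$List_ref{ind0}}, "\n";
```

Both versions return the same list; only the amount of copying differs, and it grows with the number of recursive calls multiplied by the number of keys in the pedigree hash.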

Need more comments?

PERL  And the winner is…

In bioinformatics analyses, I often perform basic statistical computations (mean, median, etc.) to describe the data, or other calculations (correlation, for instance) to analyse sets of gene expression data… For such analyses, I rely on a few PERL modules. With small datasets, the choice of module has no “real” impact on the analysis time, but selecting the right module becomes relevant when the datasets get bigger or when the analysis has to be performed many times…

Depending on the analysis to carry out, I currently use 4 different PERL modules:

  • PDL (see older post for more details)
  • Statistics::Descriptive
  • Statistics::OLS
  • Statistics::Basic

But all these modules can handle basic calculations, so let’s benchmark!

For benchmarking purposes, I will use a data file with 100 000 genomic distances and compute the mean and the root mean square (rms). For the correlation computations, I will use two vectors of expression profiles from 139 assays. Here are the benchmark results:

Benchmarking MEAN and STD computation:
  - des : using Statistics::Descriptive module
  - pdl : using PDL module
  - bas : using Statistics::Basic module

Benchmark: timing 1000 iterations of bas, des, pdl...
bas: 47 wallclock secs (45.31 usr +  0.60 sys = 45.91 CPU) @ 21.78/s (n=1000)
des: 176 wallclock secs (171.60 usr +  1.93 sys = 173.53 CPU) @  5.76/s (n=1000)
pdl: 42 wallclock secs (37.87 usr +  2.79 sys = 40.66 CPU) @ 24.59/s (n=1000)
      Rate  des  bas  pdl
des 5.76/s   -- -74% -77%
bas 21.8/s 278%   -- -11%
pdl 24.6/s 327%  13%   --

Benchmarking correlation computation:
  - ols : using Statistics::OLS
  - pdl : using PDL module
  - bas : using Statistics::Basic module

Benchmark: timing 10000 iterations of bas, ols, pdl...
bas:  3 wallclock secs ( 2.98 usr +  0.01 sys =  2.99 CPU) @ 3344.48/s (n=10000)
ols:  7 wallclock secs ( 6.80 usr +  0.02 sys =  6.82 CPU) @ 1466.28/s (n=10000)
pdl:  2 wallclock secs ( 2.54 usr +  0.00 sys =  2.54 CPU) @ 3937.01/s (n=10000)
      Rate  ols  bas  pdl
ols 1466/s   -- -56% -63%
bas 3344/s 128%   -- -15%
pdl 3937/s 169%  18%   --
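
The percentages in the comparison tables can be recomputed from the rates: each cell appears to be the row's rate divided by the column's rate, minus one, expressed as a percentage. The sketch below checks this against the mean/std figures above; the helper name relative_speed is hypothetical, and the formula is inferred from the numbers rather than taken from the Benchmark documentation.

```perl
use strict;
use warnings;

# Rates (iterations per second) copied from the mean/std benchmark above
my %rate = (des => 5.76, bas => 21.78, pdl => 24.59);

# Percentage by which $row is faster (positive) or slower (negative) than $col
sub relative_speed {
    my ($row, $col) = @_;
    return sprintf("%.0f", 100 * ($rate{$row} / $rate{$col} - 1));
}

foreach my $row (qw(des bas pdl)) {
    foreach my $col (qw(des bas pdl)) {
        next if $row eq $col;
        printf "%s vs %s: %s%%\n", $row, $col, relative_speed($row, $col);
    }
}
# prints:
# des vs bas: -74%
# des vs pdl: -77%
# bas vs des: 278%
# bas vs pdl: -11%
# pdl vs des: 327%
# pdl vs bas: 13%
```

Read this way, "327%" in the pdl row means that PDL completes a bit more than four times as many iterations per second as Statistics::Descriptive.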

As you may notice, for basic computations, Statistics::Descriptive is the worst choice… If we instead count the number of iterations completed in 5 CPU seconds, the script using PDL or Statistics::Basic performs about four times as many as the one using Statistics::Descriptive!

Benchmark: running bas, des, pdl for at least 5 CPU seconds...
bas:  6 wallclock secs ( 5.21 usr +  0.08 sys =  5.29 CPU) @ 21.74/s (n=115)
des:  5 wallclock secs ( 5.12 usr +  0.06 sys =  5.18 CPU) @  5.79/s (n=30)
pdl:  5 wallclock secs ( 4.83 usr +  0.36 sys =  5.19 CPU) @ 24.66/s (n=128)
