Learning to “map”

Before I start, the usual disclaimer: I’m trained as a biologist, so don’t be surprised that I didn’t know “map”.

Rob directed me today to Perl’s “map,” a little function that I didn’t know about, and that seems to have the potential to solve many of my problems.

Map is documented in Perl’s perlfunc manual (run perldoc -f map).

So, why do I need it?

Let’s say, for example, I have this list of protein pairs that are similar to each other: @sims = (1115.1, 1116.1, 1116.1, 1115.1, 1115.1, 1118.2, 1118.2, 1115.1, 1118.2, 1116.1, 1116.1, 1118.2). These are simply three homologs that hit each other reciprocally. To get the set of unique IDs of these homologs using Perl, my options are:

Option 1:

my %hash;
for my $k (@sims) {
    $hash{$k} = 1;    # each ID becomes a key, so duplicates collapse
}
print join "\t", keys %hash;

Option 2:

my %hash;
for (@sims) {
    $hash{$_} = 1;    # same loop, using the default variable $_
}
print join "\t", keys %hash;

Option 3: much shorter

my %hash;
map { $hash{$_} = 1 } @sims;
print join "\t", keys %hash;

They all print the same unique IDs, though the order can vary because Perl does not guarantee hash key order:

1118.2 1115.1 1116.1
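
Option 3 uses map only for its side effect on %hash. Since map also returns a list, the hash can be built straight from that return value. Here is a minimal sketch of that variant; the names %seen and @unique are my own, and I have quoted the IDs with qw and sorted the keys so the output order is predictable:

```perl
use strict;
use warnings;

# Same example data as above, quoted so the IDs are treated as strings.
my @sims = qw(1115.1 1116.1 1116.1 1115.1 1115.1 1118.2
              1118.2 1115.1 1118.2 1116.1 1116.1 1118.2);

# map returns a flat list of (key, value) pairs, which initializes the hash.
my %seen = map { $_ => 1 } @sims;

# Sort the keys so the output order does not depend on hash ordering.
my @unique = sort keys %seen;
print join("\t", @unique), "\n";
```

This should print the three IDs in sorted order, separated by tabs.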