perl - How can I plot p-values for SNPs that are spread across thousands of scaffolds on a single continuous axis?
I have p-values derived from association mapping for SNPs scattered across thousands of scaffolds in a non-model organism. I would like to plot the p-value of each SNP on a Manhattan-style plot. I do not care about the order of the scaffolds, but I do want to retain the relative order and spacing of SNP positions on their respective scaffolds. I want to visualize how many genomic regions are associated with the phenotype.
My data looks like this:
scaffold  position
1         8967
1         8986
1         9002
1         9025
1         9064
2         60995
2         61091
2         61642
2         61898
2         61921
2         62034
2         62133
2         62202
2         62219
2         62220
3         731894
3         731907
3         731962
3         731999
3         732000
3         732050
3         732076
3         732097
I would like to write Perl code that creates a third column which retains the distance between SNPs on the same scaffold, while arbitrarily spacing scaffolds by some fixed number (100 in the following example):
scaffold  position  continuous_axis
1         8967      8967
1         8986      8986
1         9002      9002
1         9025      9025
1         9064      9064
2         60995     9164
2         61091     9260
2         61642     9811
2         61898     10067
2         61921     10090
2         62034     10203
2         62133     10302
2         62202     10371
2         62219     10388
2         62220     10389
3         731894    10489
3         731907    10502
3         731962    10557
3         731999    10594
3         732000    10595
3         732050    10645
3         732076    10671
3         732097    10692
Thank you to anyone who might have a strategy.
Something like the following should work:
#!/usr/bin/env perl
use strict;
use warnings;

use constant SCAFFOLD_SPACING => 100;    # arbitrary gap between scaffolds

my ($last_scaffold, $last_position, $continuous_axis, $found_data);

my $input = './input';
open my $fh, '<', $input or die "Unable to open '$input' for reading: $!";

print join( "\t", qw( scaffold position continuous_axis ) ) . "\n";    # output header

while (<$fh>) {
    next unless m|\d|;                           # skip non-data lines (e.g. the header)
    my ($scaffold, $position) = split /\s+/;     # split on whitespace

    unless ($found_data++) {                     # initialize on the first data line
        $last_scaffold   = $scaffold;            # set first data value
        $last_position   = $position;            # set first data value
        $continuous_axis = $position;            # start continuous axis at first position
    }

    my $position_diff = $position - $last_position;
    my $scaffold_diff = $scaffold - $last_scaffold;    # assumes numeric scaffold IDs

    if ($scaffold_diff == 0) {
        $continuous_axis += $position_diff;      # same scaffold: keep the real spacing
    }
    else {
        $continuous_axis += SCAFFOLD_SPACING;    # new scaffold: add the fixed gap
    }

    print join( "\t", $scaffold, $position, $continuous_axis ) . "\n";

    # update state for the next line
    $last_scaffold = $scaffold;
    $last_position = $position;
}

close $fh;
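The script above detects a scaffold change by numeric subtraction, so it assumes scaffold IDs are plain numbers. If the IDs are strings, a variant that compares them with `eq` may be easier to adapt. The following is only a sketch of the same idea packaged as a subroutine. The name `continuous_axis`, the in-memory list of `[scaffold, position]` pairs, and the `scaffold_1`-style IDs are hypothetical and not part of the original answer. It assumes rows are already grouped by scaffold and sorted by position.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): given an array ref of
# [scaffold, position] pairs and a gap size, return one continuous-axis
# value per row. Rows must be grouped by scaffold and sorted by position.
sub continuous_axis {
    my ($rows, $spacing) = @_;
    my ($last_scaffold, $last_position, $axis);
    my @out;

    for my $row (@$rows) {
        my ($scaffold, $position) = @$row;

        if (!defined $last_scaffold) {
            $axis = $position;                      # start at the first position
        }
        elsif ($scaffold eq $last_scaffold) {
            $axis += $position - $last_position;    # same scaffold: real spacing
        }
        else {
            $axis += $spacing;                      # new scaffold: fixed gap
        }

        push @out, $axis;
        ($last_scaffold, $last_position) = ($scaffold, $position);
    }
    return @out;
}

# Made-up string IDs, positions taken from the sample data in the question.
my @rows = (
    [ 'scaffold_1', 9025 ],
    [ 'scaffold_1', 9064 ],
    [ 'scaffold_2', 60995 ],
    [ 'scaffold_2', 61091 ],
);
my @axis = continuous_axis( \@rows, 100 );

print join( "\t", @{ $rows[$_] }, $axis[$_] ), "\n" for 0 .. $#rows;
```

For these four rows the third column comes out as 9025, 9064, 9164 and 9260, which agrees with the corresponding rows of the desired output in the question. Since the axis starts at the first SNP's position rather than at zero, only differences between values on this axis are meaningful.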