cd /lustre/work1/sanger/io1/2009-07-08_chimp_repeats
perl -i -plne 's/\t/,/g; s/\s+/_/g; s/[\(\)]//g; s/,/\t/g' OID2130?/experimental_report.txt
Now set up the jobs to parse the data:
perl -E 'say "ng42m_parser.pl -o sample_$_.txt -- $_" for @ARGV' OID2130? > parse_data_commands.txt
bsub -o load_repeats.%J.out -J 'load_repeats[1-3]%3' -q basement -R "select[mem>=3000] rusage[mem=3000]" -M3000000 'submit_job_array parse_data_commands.txt'
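submit_job_array is a local script and its source is not shown here. A rough sketch of what such a wrapper presumably does: pick the line of the commands file that matches the array index LSF exports as LSB_JOBINDEX, and run it. The function name run_indexed_command is hypothetical, and the real script may behave differently:

```shell
# Hypothetical stand-in for a job-array wrapper (NOT the real submit_job_array).
# Usage: run_indexed_command COMMANDS_FILE [INDEX]
# INDEX defaults to $LSB_JOBINDEX, which LSF sets for each array element.
run_indexed_command() {
    commands_file=$1
    index=${2:-${LSB_JOBINDEX:-}}
    # Refuse to run without an index; an empty sed address would print every line.
    [ -n "$index" ] || { echo "no job index given" >&2; return 1; }
    # Extract exactly the line numbered $index.
    cmd=$(sed -n "${index}p" "$commands_file")
    [ -n "$cmd" ] || { echo "no command at line $index" >&2; return 1; }
    sh -c "$cmd"
}

# Example (assumption: run inside an array element, or with an explicit index):
#   run_indexed_command parse_data_commands.txt 2
```

The array range [1-3] presumably matches the number of lines in parse_data_commands.txt; wc -l parse_data_commands.txt is a quick check before submitting.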
For future reference, we might want to know the mapping from order ID to sample name:
cut -f 1,12 OID2130?/experimental_report.txt | sort | uniq | grep -v 'ORDER_ID' > order_id_sample_mapping.txt
The Processed_data_files for orders 2130[78] were not gzipped, so this had to be rectified before proceeding:
bsub -Ip 'gzip OID2130[78]/Processed_data_files/*.txt'