Dear all,

We noticed some inconsistencies in the read lengths of the GEM mappings (I didn't check the BWA mappings, but they are probably affected in the same way).
Some samples appear to have up to 76 bp matched, while other samples have only 75 bp. This might cause some (probably very small) differences between the samples and so affect their comparability.

For example:
NA20589.1.M_111124_3.bam has 75 bp
NA20812.2.M_111216_6.bam has 76 bp
NA20760.3.M_120202_5.bam has 75 bp
NA20783.4.M_120208_6.bam has 75 bp
NA20768.5.M_120131_1.bam has 75 bp
NA20798.6.M_120119_6.bam has 75 bp
NA20803.7.M_120219_1.bam has 75 bp

We checked some more files from institute 2, and they generally seem to have 1 bp more than those from the other institutes.
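
In case it helps to reproduce the check, here is a minimal Python sketch of how read lengths could be tallied per file. It assumes the alignments are available as SAM text (for example the output of `samtools view`); the function names `read_length_from_cigar` and `length_histogram` are purely illustrative. It counts only the CIGAR operations that consume read bases (M, I, S, =, X), as defined in the SAM specification.

```python
import re
from collections import Counter

# CIGAR operations that consume bases of the read (per the SAM specification).
QUERY_CONSUMING_OPS = set("MIS=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def read_length_from_cigar(cigar):
    """Return the read length implied by a CIGAR string such as '75M'."""
    return sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in QUERY_CONSUMING_OPS)


def length_histogram(sam_lines):
    """Count how often each read length occurs in an iterable of SAM lines."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blank and header lines
            continue
        cigar = line.split()[5]  # column 6 of a SAM record
        if cigar != "*":         # '*' means no CIGAR is available
            counts[read_length_from_cigar(cigar)] += 1
    return counts


if __name__ == "__main__":
    # CIGAR strings taken from the example records in this mail.
    for cigar in ("75M", "39M1D22M1I3M2I1M1I2M4S", "1M2I71M1S"):
        print(cigar, read_length_from_cigar(cigar))
```

All three example CIGAR strings give 75, so a file reporting 76 would stand out directly in the histogram.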


We're also a bit lost regarding GEM's quality score. We would like to filter reads for good mapping quality, but we're not sure how to accomplish this.

Below are two read pairs as an example.
The first pair has a mapping quality of 180, although its second read was aligned terribly.
The second pair has a quality of only 99, although its reads mapped much better (at least I would say so).
We're not sure what the reason for this is.

Curiously, both reads of a pair get the same mapping quality. This could be intentional (i.e. the value is always the quality of the pair rather than of the individual read). Alternatively, both reads of the pair might get the quality of the first mapped read, which would also explain the quality differences we see.
We're thankful for any suggestions on how to filter for 'good' mappings.

HWI-ST661:153:D0FTJACXX:8:2201:16410:93092      163     chr1    14704   180     75M     =       14770   -134    CCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGCTGCGGTGGCGGCAGAGGAGGGATGGAGGCTGACACGCGGG     CCCFFFFFHHHGHJJJIGIJIIIIIJIJHIIJJJB?BFHG7@F@AB:9=(6=(;>B29<@###############       RG:Z:0  NM:i:7  XT:A:U  md:Z:62T12      XA:Z:chr15,-102516387,75M,8;chr9,+14815,75M,10;chr16,+64389,19M1I1M2I52M,10;chr2,-114356235,75M,11;


HWI-ST661:153:D0FTJACXX:8:2201:16410:93092      83      chr1    14770   180     39M1D22M1I3M2I1M1I2M4S  =       14704   134     ACACGCGGGCAAAGGCTCCTCCGGGCCCCTCACCAGCCCAGGTCCTTTCCCAGAGATGCCTTGGCTCGTGGCTGT     5@9DDDDDDDDDC@BDDDDDDFHHIIIIJIHFJJJJIIJIJIJJIJJJJJJJJJJIIIJJJJHHHHHFFDDA11B     RG:Z:0  NM:i:7  XT:A:U  md:Z:(4)2>1-1>2-T2>1-22>1+39    XA:Z:chr15,+102516328,1S1M2I4M3I1M1I1M1I24M1D36M,8;chr9,-14881,39M1D22M1I3M2I1M1I2M4S,10;chr16,-64452,39M1D22M1I3M2I1M1I2M4S,10;chr2,+114356176,1S1M2I4M3I1M1I1M1I24M1D36M,11;

HWI-ST661:153:D0FTJACXX:8:2201:6270:52066       99      chr1    14582   99      1M2I71M1S       =       14665   158     CCCTGGTTCCGTCACCCCCTCCCAGGGAAGCAGGTCTGAGCAGCTTGTCCTGGCTGTGTCAATGTCAGAGCAACA     @11ADFFFHHHHHIIJJJJIJJIJJJJHHJJIJJHIJJJIIGFGIIJIIJF@@EGDAE>?)).7?BDFFDCCCB@       RG:Z:0  NM:i:8  XT:A:U  md:Z:1>2-21A5T29C13(1)  XA:Z:chrY,-59358087,75M,4;chrX,-155255081,75M,4;chr9,+14691,75M,7;chr2,-114356359,75M,7;chr16,+64266,75M,8;chr12,-90971,75M,8;

HWI-ST661:153:D0FTJACXX:8:2201:6270:52066       147     chr1    14665   99      75M     =       14582   -158    GGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTAGGATTCCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGTG     <B@A<9;>@CAAADDDDCCC>CC=?;>FHCGGEGIGJJIIGGIHHIIIIJIIJIJIGJJIJIHHFDDFFFFDB@;       RG:Z:0  NM:i:8  XT:A:U  md:Z:AG38G34    XA:Z:chrY,+59358002,75M,4;chrX,+155254996,75M,4;chr9,-14776,75M,7;chr2,+114356274,75M,7;chr16,-64350,58M1I1M2I11M1D2M,8;chr12,+90885,2M1D73M,8;
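
To make the filtering question more concrete, here is a rough Python sketch of the kind of filter we have in mind. It is only an illustration under several assumptions: we do not know how GEM defines its quality score, so the thresholds below are arbitrary placeholders rather than GEM recommendations, and the helper names `alignment_summary` and `passes_filter` are made up for this example. Because the quality value alone does not flag the badly aligned read, the sketch also counts indel events and soft-clipped bases in the CIGAR string.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_summary(sam_line):
    """Extract name, MAPQ, indel event count and soft-clipped bases from a SAM record."""
    fields = sam_line.split()  # assumes the first six columns contain no spaces
    ops = CIGAR_RE.findall(fields[5])
    return {
        "name": fields[0],
        "mapq": int(fields[4]),
        "indel_events": sum(1 for _, op in ops if op in "ID"),
        "soft_clipped": sum(int(n) for n, op in ops if op == "S"),
    }


def passes_filter(summary, min_mapq, max_indel_events, max_soft_clipped):
    """Keep a read only if it meets all three (hypothetical) thresholds."""
    return (summary["mapq"] >= min_mapq
            and summary["indel_events"] <= max_indel_events
            and summary["soft_clipped"] <= max_soft_clipped)


# First six columns of the four example records above.
EXAMPLE_RECORDS = [
    "HWI-ST661:153:D0FTJACXX:8:2201:16410:93092 163 chr1 14704 180 75M",
    "HWI-ST661:153:D0FTJACXX:8:2201:16410:93092 83 chr1 14770 180 39M1D22M1I3M2I1M1I2M4S",
    "HWI-ST661:153:D0FTJACXX:8:2201:6270:52066 99 chr1 14582 99 1M2I71M1S",
    "HWI-ST661:153:D0FTJACXX:8:2201:6270:52066 147 chr1 14665 99 75M",
]

if __name__ == "__main__":
    for record in EXAMPLE_RECORDS:
        summary = alignment_summary(record)
        # Placeholder thresholds, chosen only for this illustration.
        keep = passes_filter(summary, min_mapq=90, max_indel_events=1, max_soft_clipped=2)
        print(summary, "keep" if keep else "drop")
```

With these placeholder thresholds the second read of the first pair is dropped despite its quality of 180, while the other three reads are kept. Whether such a CIGAR-based criterion is sensible for GEM output is exactly what we are unsure about.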


best wishes,

Matthias & Daniela
-- 
Matthias Barann
Institute of Clinical Molecular Biology
Christian Albrechts University Kiel
Schittenhelmstr. 12
D-24105 Kiel, Germany

m.barann@ikmb.uni-kiel.de
+49 - (0)431 - 597 8681 (office)