Hi Matthias and Daniela,
yes, your guess is correct, with the additional provisos that:
Hello,
just to be sure about this, how do you calculate the edit distance?
Our guess is that you consider each of the following variations...
- mismatches
- each insertion/deletion independent of the length
- split-reads
with an edit distance of 1? That's actually how we would like to have it, though it's borderline for the split-reads.
Thanks,
Matthias & Daniela
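
For reference, the counting scheme asked about above (each mismatch, each insertion/deletion regardless of its length, and each split counted as 1) could be sketched roughly as follows. This is only an illustration, not the mapper's actual code: the function name `event_edit_distance` is hypothetical, and it assumes the number of mismatches is already known and that a split shows up as an `N` operation in the CIGAR string.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def event_edit_distance(cigar: str, mismatches: int) -> int:
    """Count every mismatch, indel event and split as one edit each.

    Hypothetical helper: `mismatches` must be supplied by the caller,
    and a split is assumed to be encoded as an 'N' CIGAR operation.
    """
    ops = CIGAR_OP.findall(cigar)
    # Each insertion or deletion counts once, independent of its length.
    indel_events = sum(1 for _, op in ops if op in "ID")
    # Each split (skipped reference region) also counts once.
    split_events = sum(1 for _, op in ops if op == "N")
    return mismatches + indel_events + split_events
```

With this counting, a read aligned as `30M2I20M1000N23M` with 3 mismatches would get a distance of 5: 3 mismatches, 1 insertion event and 1 split.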
On 05.07.2012 02:18, Tuuli Lappalainen wrote:
Hello,
In my opinion we need to have a filter for the maximum number of mismatches (as well as MAPQ>150) when we want to have well-mapped reliable reads - if a read has lots of mismatches, I wouldn't trust it even if the other matches were even worse. But you're right that 3 or 4 is too stringent; I was thinking of the 75 bp and not the total of 150.
I'd say that we keep reads with <=6 mismatches according to the NM tag. If no one objects by Thursday noon, I'll proceed with this. I'll provide a script for filtering, and upload a filtered set of BAM files to the FTP site - you can do whatever is easier for you.
best regards,
Tuuli
Tuuli Lappalainen, PhD
Department of Genetic Medicine and Development
University of Geneva Medical School
CMU / Rue Michel-Servet 1
1211 Geneva 4
Switzerland
Tel. +41-(0)22-3795550
tuuli.lappalainen@unige.ch
-- 
Matthias Barann
Institute of Clinical Molecular Biology
Christian Albrechts University Kiel
Schittenhelmstr. 12
D-24105 Kiel, Germany
m.barann@ikmb.uni-kiel.de
+49 - (0)431 - 597 8681 (office)
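
As a rough illustration of the filter proposed in the thread (keep reads with NM <= 6 and MAPQ > 150), here is a minimal sketch that works on plain SAM text lines. It is not the filtering script mentioned above. The function name `keep_read` and its parameters are hypothetical, and it assumes that the thresholds apply to each SAM record as written and that records without an NM field should be dropped.

```python
def keep_read(sam_line: str, max_nm: int = 6, min_mapq: int = 150) -> bool:
    """Return True if a SAM line passes the NM and MAPQ thresholds.

    Hypothetical helper; header lines (starting with '@') are always kept.
    """
    if sam_line.startswith("@"):
        return True
    fields = sam_line.rstrip("\n").split("\t")
    mapq = int(fields[4])  # MAPQ is the fifth mandatory SAM column
    nm = None
    for tag in fields[11:]:  # optional fields follow the 11 mandatory ones
        if tag.startswith("NM:i:"):
            nm = int(tag[5:])
            break
    if nm is None:
        # Assumption: without an NM value the read cannot be checked, so drop it.
        return False
    return mapq > min_mapq and nm <= max_nm
```

A typical use would be to read a SAM stream line by line (for example the output of `samtools view -h`) and write out only the lines for which `keep_read` returns True.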