Hi,

ArrayExpressHTS pipeline summary:

1) Mapping - bowtie, BWA or Tophat
2) References - transcriptome by default, genome as an option; combinations of the two are possible as well
3) Normalization - RPKM, DESeq, TMM (from edgeR)
4) Quantification - MMSEQ, Cufflinks or count over exons/transcripts

For each step new methods can be added easily if needed.
Most of the computations are done in R (we use the EBI R Cloud).
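
To make the RPKM option in step 3 concrete, here is a minimal sketch of the standard formula (reads per kilobase of feature per million mapped reads). It is only an illustration in Python, not the ArrayExpressHTS code, which is in R; the function name and the toy numbers are made up for the example.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads.

    Hypothetical helper for illustration; not part of ArrayExpressHTS.
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and library size must be positive")
    # count / (length in kb) / (library size in millions)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Toy numbers (assumed): 500 reads on a 2,000 bp transcript,
# in a library of 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```

DESeq and TMM work differently (they estimate per-sample scaling factors across the whole count table), so this sketch covers the RPKM case only.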

Results: HTML reports and R objects with the quantification results. All results can easily be converted into other formats as needed.

I will definitely attend the call.

Regards,
Natalja


On 13/12/11 10:27, Tuuli Lappalainen wrote:
Hi all,

We'll have the Geuvadis RNAseq analysis group TC today at 3pm; we should receive the details from Barcelona soon.

On the agenda:
- updates on sequencing and protocols (if needed)
- low-level data processing: whose pipeline to use

Micha and Natalja, we discussed that you would be interested in doing mapping and quantification of this dataset. I hope you can attend the call - and could you please send a summary of the methods that your pipeline uses so that we can discuss this on the call?

Here's what we could do in Geneva. We normally use bwa for mapping (which maps 10%+ more reads than Tophat), and for the Geuvadis data we would probably map first to the genome and then map the non-mapping reads to the transcriptome. This should yield exonic mapping for ~60% of the reads. As standard, we quantify at the exon level by simply counting all the reads that overlap an exon. We have some more complex alternatives as well, but for eQTL analysis we prefer these exon quantifications. Regarding normalization, we're in the process of developing our current methods further and would use the most up-to-date method we have, but I don't have the details yet. Another thing we'd need to take into account is reference allele mapping bias, which also affects exon quantifications slightly. bwa doesn't allow masking, and although we can deal with that in other ways, I'd also like to run the mapping with software that allows masking (Tophat, GEM??).
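
The exon-level counting described above can be sketched roughly as follows. This is only an illustration, not the Geneva pipeline's code: the function name, the tuple layout and the half-open coordinate convention are assumptions, and a read that overlaps two exons is counted once for each of them.

```python
def count_reads_per_exon(exons, reads):
    """Count, for each exon, the reads that overlap it by at least one base.

    exons: dict mapping exon_id -> (chrom, start, end)
    reads: iterable of (chrom, start, end) alignments
    Coordinates are assumed to be half-open: [start, end).
    """
    counts = {exon_id: 0 for exon_id in exons}
    for r_chrom, r_start, r_end in reads:
        for exon_id, (e_chrom, e_start, e_end) in exons.items():
            # Two half-open intervals overlap if each starts before the other ends.
            if r_chrom == e_chrom and r_start < e_end and e_start < r_end:
                counts[exon_id] += 1
    return counts


# Toy data (made up for the example).
toy_exons = {"e1": ("chr1", 100, 200), "e2": ("chr1", 300, 400)}
toy_reads = [
    ("chr1", 90, 140),   # overlaps e1
    ("chr1", 190, 310),  # spans the gap, overlaps e1 and e2
    ("chr1", 200, 300),  # falls between the exons, overlaps neither
    ("chr2", 100, 200),  # different chromosome
]
toy_counts = count_reads_per_exon(toy_exons, toy_reads)
```

The nested loop is quadratic and only meant to show the overlap rule; on real data one would use an interval index or an existing counting tool.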

Talk to you later,
Tuuli


Tuuli Lappalainen, PhD
Department of Genetic Medicine and Development
University of Geneva Medical School
CMU / Rue Michel-Servet 1
1211 Geneva 4
Switzerland
Tel. +41-(0)22-3795550
tuuli.lappalainen@unige.ch


-------- Original Message --------
Subject: Geuvadis RNAseq analysis group TC
Date: Wed, 07 Dec 2011 20:38:36 +0100
From: Tuuli Lappalainen <Tuuli.Lappalainen@unige.ch>
To: geuvadis_rna_analysis@lists.crg.es


Hi all,

Thanks for the quick replies to the doodle about the Geuvadis analysis group TCs. It seems that Thursday 2pm CET is the best time, so please keep that slot free for next spring. Let's have calls every 2 weeks by default, starting on January 12.

But we should have one call before Christmas, and we'll start with an exception since Thursday next week is impossible for me. So let's have a TC on Tuesday, December 13th at 3pm. As I said before, those who want to do mapping etc. on the raw data should send an email with a brief description of their pipeline before the call.

cheers,
Tuuli

PS. Marta, I'm sorry that I had to pick a time that wasn't good for you - the other times would have had even more people missing. I hope you can join at least sometimes or find someone else who could attend the call instead.





-- 
Natalja Kurbatova, PhD
Functional Genomics Group, EBI