[Geuvadis_rna_analysis] Geuvadis RNAseq analysis group TC minutes - geuvadis_rna_analysis@lists.crg.es

13 Dec 2011

Hello all,

A brief summary of the first analysis group TC. Action items in *bold*. 
Next call Thursday January 12, 2pm.

Attending: Tuuli, Micha, Natalja, Marc, Mathias, Tim, Olof, Marta

1) Kit situation
- Geneva and Barcelona have both mRNA and miRNA kits
- Berlin and Kiel have miRNA kits
- Munich, Uppsala and Leiden (?) don't have any kits
- *All the labs* should send Tuuli an update before Christmas indicating 
if they have received the kits or not. We'll wait until early January 
and see then if we need a plan B.
- the labs that have kits are on schedule with the sequencing

2) Low-level data processing (mapping, quantification etc.)
- *Tuuli* will upload the fastq files and bams (from bwa) from their 
mRNA seq by the end of the week.
- *Tuuli* will define a sandbox dataset of 24 and 5 samples from UNIGE 
for testing purposes.
- *Micha* will find out when GEM is likely to be published
- We shouldn't spend too much time figuring out how to map the reads - 
this has been done already. However, there are a couple of things that 
we should test:
- *Marc* will analyze the level of genomic contamination in the 24 samples
- Tuuli has concerns about reference allele mapping bias affecting 
quantifications. *Natalja* will run Tophat for 5 samples with the 
normal reference (hg19) and a reference masked for all common 1000g 
variants (Tuuli will provide this) to see if masking leads to a big loss 
in mapping. *Micha* will check whether it's possible with GEM. If 
masking is not feasible, we can consider other options for dealing with 
this bias...
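
To make the masking test concrete, here is a minimal sketch of the idea, not the actual procedure (the masked reference itself will be provided as described above). It replaces known variant positions in a sequence with N and computes the fraction of mapped reads lost between the two runs. The function names, the 1-based position convention and the read counts are assumptions for illustration only.

```python
def mask_reference(sequence, variant_positions):
    """Return the sequence with each listed 1-based position replaced by 'N'."""
    bases = list(sequence)
    for pos in variant_positions:
        # Skip positions outside the sequence rather than failing.
        if 1 <= pos <= len(bases):
            bases[pos - 1] = "N"
    return "".join(bases)


def mapping_loss(mapped_normal, mapped_masked):
    """Fraction of mapped reads lost when using the masked reference."""
    if mapped_normal <= 0:
        raise ValueError("mapped_normal must be positive")
    return (mapped_normal - mapped_masked) / mapped_normal


# Toy example: mask positions 2 and 5 of a short sequence.
masked = mask_reference("ACGTACGT", [2, 5])

# Hypothetical read counts from the two mapping runs of one sample.
loss = mapping_loss(1000, 950)
```

Comparing this loss fraction across the 5 samples would give a first impression of whether masking is affordable.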

The following was planned for the final dataset:
- Micha will run his whole pipeline (GEM for mapping, SNAPE for variant 
calling, Flux for deconvolution and normalization, AStalavista for 
alternative splicing analyses...).
- Natalja will run the EBI pipeline using bwa and/or tophat, and 
quantify exon counts and/or RPKMs. bwa+exon counts would enable direct 
comparison with earlier eQTL results from Manolis's lab
- regarding normalization, both PCA and specific covariate based 
approaches are possible - we'll have to see what the data looks like
- all analyses should be run with duplicates - Geneva has seen that in 
RNAseq data duplicates rarely seem to be PCR artefacts
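
For reference, the relation between exon counts and RPKMs mentioned above can be sketched as follows. This uses the standard RPKM definition (reads per kilobase of exon per million mapped reads); the function name and the example numbers are made up for illustration and do not come from either pipeline.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical exon: 500 reads, 2000 bp long, 10 million mapped reads in the sample.
value = rpkm(500, 2000, 10_000_000)
```

Raw exon counts stay comparable with the earlier bwa-based results, while RPKMs correct for exon length and sequencing depth.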

3) FTP instructions
- *Natalja* will send an email to the analysis group about the FTP 
instructions that are now on the ftp site

I hope I remembered at least the most important things! I'll let you 
know when the data is on the ftp site.

best regards,
Tuuli

-- 
Tuuli Lappalainen, PhD
Department of Genetic Medicine and Development
University of Geneva Medical School
CMU / Rue Michel-Servet 1
1211 Geneva 4
Switzerland
Tel. +41-(0)22-3795550
tuuli.lappalainen(a)unige.ch