Hello all,
A brief summary of the first analysis group TC. Action items in *bold*.
Next call Thursday January 12, 2pm.
Attending: Tuuli, Micha, Natalja, Marc, Mathias, Tim, Olof, Marta
1) Kit situation
- Geneva and Barcelona have both mRNA and miRNA kits
- Berlin and Kiel have miRNA kits
- Munich, Uppsala and Leiden (?) don't have any kits
- *All the labs* should send Tuuli an update before Christmas indicating
whether they have received the kits. We'll wait until early January
and then see if we need a plan B.
- the labs that have kits are on schedule with the sequencing
2) Low-level data processing (mapping, quantification etc.)
- *Tuuli* will upload the fastq files and bams (from bwa) from their
mRNA seq by the end of the week.
- *Tuuli* will define sandbox datasets of 24 and 5 samples from UNIGE
for testing purposes.
- *Micha* will find out when GEM is likely to be published
- We shouldn't spend too much time figuring out how to map the reads -
this has been done already. However, there are a couple of things that
we should test:
- *Marc* will analyze the level of genomic contamination in the 24 samples
- Tuuli has concerns about reference allele mapping bias affecting
quantifications. *Natalja* will run Tophat for 5 samples with the
normal reference (hg19) and a reference masked for all common 1000g
variants (Tuuli will provide this) to see if masking leads to a big loss
in mapping. *Micha* will check whether it's possible with GEM. If
masking is not feasible, we can consider other options for dealing with
this bias...
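
As a rough illustration of the masking test above, here is a minimal Python sketch. It assumes that masking means replacing each common variant position with N and that loss is measured as the relative drop in mapped reads; the function names and the toy numbers are hypothetical and not part of any pipeline discussed on the call.

```python
def mask_reference(sequence, variant_positions, mask_char="N"):
    """Return a copy of `sequence` with the given 1-based positions masked."""
    bases = list(sequence)
    for pos in variant_positions:
        if not 1 <= pos <= len(bases):
            raise ValueError(f"position {pos} is outside the sequence")
        bases[pos - 1] = mask_char  # convert 1-based position to 0-based index
    return "".join(bases)


def relative_mapping_loss(mapped_normal, mapped_masked):
    """Fraction of mapped reads lost when mapping to the masked reference."""
    if mapped_normal <= 0:
        raise ValueError("mapped_normal must be positive")
    return (mapped_normal - mapped_masked) / mapped_normal


# Toy example: mask two variant sites in a short sequence.
masked = mask_reference("ACGTACGT", [2, 5])

# Toy example: mapped-read counts for the same sample against both references.
loss = relative_mapping_loss(mapped_normal=1000, mapped_masked=950)
```

For real data the mapped-read counts would come from the aligner output for each of the 5 samples, once per reference.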
The following was planned for the final dataset:
- Micha will run his whole pipeline (GEM for mapping, SNAPE for variant
calling, Flux for deconvolution and normalization, AStalavista for
alternative splicing analyses, etc.).
- Natalja will run the EBI pipeline using bwa and/or tophat, and
quantify exon counts and/or RPKMs. bwa+exon counts would enable direct
comparison with earlier eQTL results from Manolis's lab
- regarding normalization, both PCA and specific covariate based
approaches are possible - we'll have to see what the data looks like
- all analyses should be run with duplicates - Geneva has seen that in
RNAseq data duplicates rarely seem to be PCR artefacts
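
For reference, the relation between exon counts and RPKMs mentioned above can be sketched as follows. This uses the standard RPKM definition (reads per kilobase of exon per million mapped reads); the function name and the example values are illustrative assumptions, not output from the EBI pipeline.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads per kilobase of exon per million mapped reads."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # count / (length in kb) / (library size in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Toy example: 500 reads on a 2 kb exon in a library of 10 million mapped reads.
value = rpkm(read_count=500, exon_length_bp=2000, total_mapped_reads=10_000_000)
```

Since duplicates are kept, read_count and total_mapped_reads here would both include duplicate reads.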
3) FTP instructions
- *Natalja* will send an email to the analysis group about the FTP
instructions that are now on the FTP site
I hope I remembered at least the most important things! I'll let you
know when the data is on the FTP site.
best regards,
Tuuli
--
Tuuli Lappalainen, PhD
Department of Genetic Medicine and Development
University of Geneva Medical School
CMU / Rue Michel-Servet 1
1211 Geneva 4
Switzerland
Tel. +41-(0)22-3795550
tuuli.lappalainen(a)unige.ch