Hello all,
The final versions of the exon, gene and transcript quantifications are now
on the ftp site in:
/upload/geuvadis/wp4_rnaseq/main_project/analysis_data/quantification/
These are in Gencode v12, with reads with >6 mismatches removed. See
the readme files and wiki for documentation: there are raw counts,
counts normalized by the number of exonic reads, and RPKMs for
transcripts. Note that the files include all 667 samples; later this
week I will send information on the couple of samples that we'll drop
from all analyses and provide files without them.
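For readers less familiar with the RPKM unit mentioned above, here is a minimal sketch of the standard formula (reads per kilobase of feature per million mapped reads). The function name and the numbers are hypothetical; the exact definition and library-size denominator used for the uploaded files are described in the readme, not here.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of feature per million mapped reads.

    Assumption: total_mapped_reads is whatever library size the pipeline
    chose as denominator; the readme is the authority on that choice.
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("feature length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


# Hypothetical numbers, for illustration only.
example_value = rpkm(read_count=500, feature_length_bp=2000,
                     total_mapped_reads=25_000_000)
print(example_value)
```

Dividing by feature length as well as library size is what makes RPKM values comparable between transcripts of different lengths, which raw counts are not.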
Splice junction and intron quantifications will follow later this week.
best regards,
Tuuli
--
Tuuli Lappalainen, PhD
Department of Genetic Medicine and Development
University of Geneva Medical School
CMU / Rue Michel-Servet 1
1211 Geneva 4
Switzerland
Tel. +41-(0)22-3795550
tuuli.lappalainen(a)unige.ch
FYI - Here is the information for the GEUVADIS Annual Meeting, during which the results of all project WPs will be presented.
Kind regards,
Gabrielle
From: Gabrielle Anne Bertier
Subject: Registration - GEUVADIS AM 2012: 29-31/10/2012 - Santiago de Compostela
Dear all,
Please find here the registration form<https://docs.google.com/spreadsheet/viewform?formkey=dEpnY0h6Y2Z3Y291dFhvRz…> for our next Annual Meeting; please fill it in before August 31st 2012.
Here is a draft programme for the meeting:
GEUVADIS Annual Meeting 2 - Santiago de Compostela - 29-31 October 2012
Sunday 28 October - 20:00 - Welcome dinner
Monday 29 October - 09:00 - 18:00 - GEUVADIS Annual Meeting, WP updates and feedback from SAB.
Tuesday 30 October - 09:00 - Wednesday 31 October - 13:00
Workshop: Exome sequencing data storage, sharing and analysis: from National to European standards.
If you want to be involved in the organization of this workshop, or suggest speakers please let me know.
Let me know if anything remains unclear,
Kind regards,
Gabrielle
PS: registration form full link: https://docs.google.com/spreadsheet/viewform?formkey=dEpnY0h6Y2Z3Y291dFhvRz…
Gabrielle Bertier
Scientific Project Manager
International Collaboration Office
CRG, Center for Genomic Regulation
Carrer Dr. Aiguader, 88
08003 Barcelona, España
Tel: +34933160374
Mobile: +34639960656
email: gabrielle.bertier(a)crg.es
web: www.geuvadis.eu
Dear all,
we noticed some inconsistencies in the read lengths of the GEM mappings
(I didn't check the BWA mappings, but it's probably the same).
Some samples appear to have up to 76 bp matched, while other samples
only have 75 bp. In regard to comparability this might cause some
(probably very little) differences between the samples.
For example:
NA20589.1.M_111124_3.bam has 75 bp
NA20812.2.M_111216_6.bam has *76 bp*
NA20760.3.M_120202_5.bam has 75 bp
NA20783.4.M_120208_6.bam has 75 bp
NA20768.5.M_120131_1.bam has 75 bp
NA20798.6.M_120119_6.bam has 75 bp
NA20803.7.M_120219_1.bam has 75 bp
We checked some more files for institute 2, and they seem to have
generally 1 bp more than the others.
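One way to tabulate read lengths per file is to sum the CIGAR operations that consume read bases. The sketch below is only an illustration: the function names are hypothetical, and it assumes the alignments are available as SAM text lines (for instance from `samtools view`).

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume bases of the read as stored in the SEQ field.
# Hard clips (H) are excluded because clipped bases are not in SEQ.
READ_CONSUMING_OPS = set("MIS=X")


def read_length_from_cigar(cigar):
    """Return the number of read bases described by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_ELEMENT.findall(cigar)
               if op in READ_CONSUMING_OPS)


def read_length_histogram(sam_lines):
    """Count alignments per read length, skipping headers and unmapped reads."""
    histogram = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11 or fields[5] == "*":
            continue  # malformed record or no CIGAR available
        histogram[read_length_from_cigar(fields[5])] += 1
    return histogram
```

Running `read_length_histogram` on the text output of each BAM file gives one small table per sample, which makes a systematic 75 bp versus 76 bp difference between institutes easy to spot.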
We're also a bit lost regarding GEM's quality score. We would like to
filter reads for good mapping quality, we're just not sure how to
accomplish this.
Below are two read pairs as an example.
The first pair has a mapping quality of 180. The second read, however, was
terribly aligned.
The second pair has only a quality of 99, while the reads mapped much
better (at least I would say so).
We're not sure what's the reason for this.
Curiously, both reads of a pair get the same mapping quality. This could
be intentional (i.e. it's always the pair quality rather than the read
quality), or both reads of the pair get the quality of the first mapped
read, which could also explain the quality differences for the reads.
We're thankful for any suggestions on how to filter for 'good' mappings.
> HWI-ST661:153:D0FTJACXX:8:2201:16410:93092 163 chr1 14704
> *180 75M *= 14770 -134
> CCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGCTGCGGTGGCGGCAGAGGAGGGATGGAGGCTGACACGCGGG
> CCCFFFFFHHHGHJJJIGIJIIIIIJIJHIIJJJB?BFHG7@F@AB:9=(6=(;>B29<@###############
> RG:Z:0 NM:i:7 XT:A:U md:Z:62T12
> XA:Z:chr15,-102516387,75M,8;chr9,+14815,75M,10;chr16,+64389,19M1I1M2I52M,10;chr2,-114356235,75M,11;
>
>
> HWI-ST661:153:D0FTJACXX:8:2201:16410:93092 83 chr1 14770
> *180 39M1D22M1I3M2I1M1I2M4S *= 14704 134
> ACACGCGGGCAAAGGCTCCTCCGGGCCCCTCACCAGCCCAGGTCCTTTCCCAGAGATGCCTTGGCTCGTGGCTGT
> 5@9DDDDDDDDDC@BDDDDDDFHHIIIIJIHFJJJJIIJIJIJJIJJJJJJJJJJIIIJJJJHHHHHFFDDA11B
> RG:Z:0 NM:i:7 XT:A:U md:Z:(4)2>1-1>2-T2>1-22>1+39
> XA:Z:chr15,+102516328,1S1M2I4M3I1M1I1M1I24M1D36M,8;chr9,-14881,39M1D22M1I3M2I1M1I2M4S,10;chr16,-64452,39M1D22M1I3M2I1M1I2M4S,10;chr2,+114356176,1S1M2I4M3I1M1I1M1I24M1D36M,11;
> HWI-ST661:153:D0FTJACXX:8:2201:6270:52066 99 chr1 14582
> *99 1M2I71M1S *= 14665 158
> CCCTGGTTCCGTCACCCCCTCCCAGGGAAGCAGGTCTGAGCAGCTTGTCCTGGCTGTGTCAATGTCAGAGCAACA
> @11ADFFFHHHHHIIJJJJIJJIJJJJHHJJIJJHIJJJIIGFGIIJIIJF@@EGDAE>?)).7?BDFFDCCCB@
> RG:Z:0 NM:i:8 XT:A:U md:Z:1>2-21A5T29C13(1)
> XA:Z:chrY,-59358087,75M,4;chrX,-155255081,75M,4;chr9,+14691,75M,7;chr2,-114356359,75M,7;chr16,+64266,75M,8;chr12,-90971,75M,8;
>
> HWI-ST661:153:D0FTJACXX:8:2201:6270:52066 147 chr1 14665
> *99 75M *= 14582 -158
> GGGTCTGGGGGGGAAGGTGTCATGGAGCCCCCTAGGATTCCCAGTCGTCCTCGTCCTCCTCTGCCTGTGGCTGTG
> <B@A<9;>@CAAADDDDCCC>CC=?;>FHCGGEGIGJJIIGGIHHIIIIJIIJIJIGJJIJIHHFDDFFFFDB@;
> RG:Z:0 NM:i:8 XT:A:U md:Z:AG38G34
> XA:Z:chrY,+59358002,75M,4;chrX,+155254996,75M,4;chr9,-14776,75M,7;chr2,+114356274,75M,7;chr16,-64350,58M1I1M2I11M1D2M,8;chr12,+90885,2M1D73M,8;
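Until the meaning of GEM's quality score is clarified, one possible stopgap is to filter on the MAPQ column together with the NM (edit distance) tag, so that a high score alone does not keep a poorly aligned read. This is only a sketch: the function names and both thresholds are hypothetical, it is not a statement of how GEM computes its score, and it assumes tab-separated SAM lines.

```python
def parse_mapq_and_tags(sam_line):
    """Return (MAPQ, {tag: value}) for one tab-separated SAM record."""
    fields = sam_line.rstrip("\n").split("\t")
    mapq = int(fields[4])  # fifth SAM column is MAPQ
    tags = {}
    for item in fields[11:]:  # optional fields look like NAME:TYPE:VALUE
        parts = item.split(":", 2)
        if len(parts) != 3:
            continue  # skip anything that is not a well-formed tag
        name, value_type, value = parts
        tags[name] = int(value) if value_type == "i" else value
    return mapq, tags


def keep_alignment(sam_line, min_mapq, max_edit_distance):
    """Keep a record only if it passes both thresholds.

    Both thresholds are assumptions chosen by the caller; records
    without an NM tag are dropped because they cannot be checked.
    """
    mapq, tags = parse_mapq_and_tags(sam_line)
    edit_distance = tags.get("NM")
    if edit_distance is None:
        return False
    return mapq >= min_mapq and edit_distance <= max_edit_distance
```

Requiring both conditions means that a pair like the first example above, with a high score but many edits, is not kept on the score alone. Whether the score is assigned per pair or per read would still need to be confirmed with the GEM developers.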
best wishes,
Matthias & Daniela
--
Matthias Barann
Institute of Clinical Molecular Biology
Christian Albrechts University Kiel
Schittenhelmstr. 12
D-24105 Kiel, Germany
m.barann(a)ikmb.uni-kiel.de
+49 - (0)431 - 597 8681 (office)
Hi,
as mentioned in the call last Thursday, we uploaded the (first round of)
mapping counts in non-exonic elements; these are currently limited to known
junctions and all-intronic regions that are transcribed but not retained in
any mature transcript, based on Gencode v12.
The location on the FTP server is below; Emilio (from CRG) carried out the
counting and the upload.
Cheers, micha
---------- Forwarded message ----------
From: Emilio <emiliopalumbo(a)gmail.com>
Date: Mon, Jul 2, 2012 at 5:59 PM
Subject: Geuvadis SJ and Intron mapping counts upload
To: Micha Sammeth <micha(a)sammeth.net>
Hi Micha,
I uploaded the data to the ftp. The folder is:
/upload/geuvadis/wp4_rnaseq/main_project/analysis_data/quantification/splice_junction-intron
Cheers,
Emilio
Hello,
We'll have a Geuvadis analysis group TC on Thursday 28th at 2pm CET.
On the agenda:
- analysis updates
- wiki
- Liliana's presentation: "Conjoined genes in human populations"
- Mar's presentation: "Ubiquity vs specificity of gene and isoform
expression across populations"
NOTE: Call details different from the usual:
Germany: 0049 692 573 804 41
Spain: 0034 931 816 661
UK: 0044 203 370 57 19
Sweden: 0046 840 309 949
The Netherlands: 0031 108 920 271
Access code: 611683
best regards,
Tuuli
Mar
---------- Forwarded message ----------
From: Mar Gonzàlez-Porta <mar(a)ebi.ac.uk>
Date: 2012/6/21
Subject: Re: [Geuvadis_rna_analysis] Analysis group TC June 21st
To: Natalja Kurbatova <natalja(a)ebi.ac.uk>
Cc: Liliana Greger <lgreger(a)ebi.ac.uk>
Please find attached the slides for today's meeting.
Cheers,
Mar
2012/6/20 Natalja Kurbatova <natalja(a)ebi.ac.uk>
>
>
> -------- Original Message --------
> Subject: [Geuvadis_rna_analysis] Analysis group TC June 21st
> Date: Wed, 20 Jun 2012 16:26:36 +0200
> From: Tuuli Lappalainen <Tuuli.Lappalainen(a)unige.ch>
> To: geuvadis_rna_analysis(a)lists.crg.es
>
> Hello all,
>
> We'll have a Geuvadis RNAseq analysis group TC on Thursday June 21st at
> 2pm CET.
>
> On the agenda:
> - program for the Barcelona meeting (see attached draft)
> - data and analysis updates
> - Jonas from Uppsala will present some QC results
> - Liliana from EBI will present "Conjoined genes in human populations"
> - Mar from EBI will present "Ubiquity vs specificity of gene and isoform
> expression across populations" (if there's time)
>
> Call details are:
> from outside spain; 0034917911859
> from spain; 900800678
> Access code; 3160100
>
> best regards,
> Tuuli
>
>
> --
> Tuuli Lappalainen, PhD
> Department of Genetic Medicine and Development
> University of Geneva Medical School
> CMU / Rue Michel-Servet 1
> 1211 Geneva 4
> Switzerland
> Tel. +41-(0)22-3795550
> tuuli.lappalainen(a)unige.ch
>
>
Hello All,
Please find attached the update slides which I will present on today's conference call.
Regards,
Liliana
Dr. Liliana Greger
EMBL Outstation - Hinxton,
European Bioinformatics Institute,
Wellcome Trust Genome Campus,
Hinxton,
Cambridge, CB10 1SD
Hello all,
We'll have a Geuvadis RNAseq analysis group TC on Thursday June 21st at
2pm CET.
On the agenda:
- program for the Barcelona meeting (see attached draft)
- data and analysis updates
- Jonas from Uppsala will present some QC results
- Liliana from EBI will present "Conjoined genes in human populations"
- Mar from EBI will present "Ubiquity vs specificity of gene and isoform
expression across populations" (if there's time)
Call details are:
from outside spain; 0034917911859
from spain; 900800678
Access code; 3160100
best regards,
Tuuli