Hello all,
The Geuvadis RNAseq genotypes from Phase1 and Phase2 samples are now
available in:
/upload/geuvadis/wp4_rnaseq/main_project/external_data/genetic_variation/genotype/phase1_phase2/
These are VCF files: one .genotypes.vcf file per chromosome, plus a
genome-wide .sites.vcf file that contains the first 8 columns of the
genotype files with all the variant information. See the header
lines for an explanation of the INFO tags. There's also a sample
information file. In a couple of weeks there will be an update on the
variant annotations.
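
As a rough illustration of how the .sites.vcf content relates to the
per-chromosome genotype files, here is a minimal Python sketch that keeps
only the first 8 tab-separated columns (CHROM to INFO) of a VCF line. The
function name to_sites_line is hypothetical, and this is not the script
that was used to produce the files.

```python
def to_sites_line(line: str) -> str:
    """Reduce one genotype-VCF line to its sites-only form.

    Assumes standard tab-separated VCF: meta lines start with '##',
    and the '#CHROM' header and data lines share the same column layout.
    """
    line = line.rstrip("\n")
    if line.startswith("##"):
        # Meta-information lines (INFO tag descriptions etc.) are kept unchanged.
        return line
    # '#CHROM' header and data lines: keep CHROM, POS, ID, REF, ALT,
    # QUAL, FILTER, INFO and drop FORMAT plus the per-sample columns.
    return "\t".join(line.split("\t")[:8])
```

Applying this to every line of a .genotypes.vcf file should give a
sites-only view for that chromosome.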
Phase1 genotypes come from the 1000g Phase1 v3 release, and Phase2
genotypes have been imputed from Omni2.5M - the wiki has some more
details on this from Natalja. The imputation results have been filtered
as follows; in the end we have 38,187,570 variants left.
- Imputed Phase2 genotypes with poor imputation quality (<0.5) have
been set to missing. Note that Phase1 genotypes for these variants are
perfectly OK to use in analyses even though the overall missingness rate
is 10%.
- Variants with multiple alleles and sites with multiple variants have
been excluded. I noticed that either imputation or the merge of the
Phase1 and Phase2 data messed up some of these sites, so I thought that
the safest thing to do was to remove them - it's not a huge loss and I'm
not sure how reliable these genotype calls are in Phase1 anyway.
- Even after these filters, there were a small number of variants with
very discordant frequencies between Phase1 and Phase2. Most of these
seemed to be variants on the Omni2.5M with an allele flip in the
original Phase2 SNP genotype data that was used in imputation. This is
an ad hoc filter, but I removed 2876 variants with a frequency
difference of >0.3 between Phase1 and Phase2.
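
The frequency-discordance filter in the last point can be sketched
roughly as below. This is only an illustration under stated assumptions,
not the code that was actually run: it assumes biallelic sites, bare GT
strings such as "0|1" or "./.", that "frequency" means ALT allele
frequency, and that the samples have already been split into Phase1 and
Phase2 groups. The function names are hypothetical.

```python
def alt_allele_freq(genotypes):
    """ALT allele frequency from biallelic GT strings, or None if all are missing."""
    alt = total = 0
    for gt in genotypes:
        # Treat phased ('|') and unphased ('/') separators the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele == ".":
                continue  # missing allele, e.g. a masked low-quality imputed call
            total += 1
            if allele == "1":
                alt += 1
    return alt / total if total else None


def is_discordant(phase1_gts, phase2_gts, max_diff=0.3):
    """True if the Phase1 and Phase2 ALT frequencies differ by more than max_diff."""
    f1 = alt_allele_freq(phase1_gts)
    f2 = alt_allele_freq(phase2_gts)
    if f1 is None or f2 is None:
        return False  # assumption: cannot compare, so do not flag the variant
    return abs(f1 - f2) > max_diff
```

A variant for which is_discordant returns True would be a candidate for
removal under the >0.3 threshold. Missing genotypes are skipped when
counting alleles, so masked Phase2 calls do not distort the comparison.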
I've done a number of checks on these files, but please let me know if
you observe anything that doesn't seem right. Thanks to Natalja and
Thomas W for all the work for getting these data together.
Have a nice weekend!
Tuuli
--
Tuuli Lappalainen, PhD
Department of Genetic Medicine and Development
University of Geneva Medical School
CMU / Rue Michel-Servet 1
1211 Geneva 4
Switzerland
Tel. +41-(0)22-3795550
tuuli.lappalainen(a)unige.ch