Omics-squared: human genomic, transcriptomic and phenotypic data for Genetic Analysis Workshop 19

18 October 2016

doi: 10.1186/s12919-016-0008-y


Abstract

Background: The Genetic Analysis Workshops (GAW) are a forum for development, testing, and comparison of statistical genetic methods and software. Each contribution to the workshop includes an application to a specified data set. Here we describe the data distributed for GAW19, which focused on analysis of human genomic and transcriptomic data.

Methods: GAW19 data were donated by the T2D-GENES Consortium and the San Antonio Family Heart Study and included whole genome and exome sequences for odd-numbered autosomes, measures of gene expression, systolic and diastolic blood pressures, and related covariates in two Mexican American samples. The two samples comprised 20 large families with whole genome sequence and transcriptomic data, and 1943 unrelated individuals with exome sequence. For each sample, simulated phenotypes were constructed based on the real sequence data. 'Functional' genes and variants for the simulations were chosen based on observed correlations between gene expression and blood pressure. The simulations focused primarily on additive genetic models but also included a genotype-by-medication interaction. A total of 245 genes were designated as 'functional' in the simulations, with a few genes of large effect and most genes each explaining <1% of the trait variation. An additional phenotype, Q1, was simulated to be correlated among related individuals, based on theoretical or empirical kinship matrices, but was not associated with any sequence variants. Two hundred replicates of the phenotypes were simulated. The GAW19 data are an expansion of the data used at GAW18, which included the family-based whole genome sequence, blood pressure, and simulated phenotypes, but not the gene expression data or the set of 1943 unrelated individuals with exome sequence.
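
As an illustration only, a null phenotype like Q1 (correlated among relatives but unrelated to any sequence variant) could be drawn from a multivariate normal distribution whose covariance is built from a kinship matrix. The sketch below is not the GAW19 simulation code. It assumes a hypothetical nuclear family of four (two unrelated parents and two full siblings), a hypothetical heritability of 0.5, unit total variance, and the standard variance-components form 2Φh² + I(1 − h²), where Φ is the theoretical kinship matrix. All function and variable names are invented for this example.

```python
import math
import random


def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix.

    The input must be symmetric and positive definite.
    """
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def trait_covariance(kinship, h2):
    """Covariance 2*Phi*h2 + I*(1 - h2) for a trait with total variance 1."""
    n = len(kinship)
    return [
        [2.0 * kinship[i][j] * h2 + ((1.0 - h2) if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def simulate_null_trait(kinship, h2, rng):
    """Draw one trait vector that depends on kinship only, not on genotypes."""
    lower = cholesky(trait_covariance(kinship, h2))
    z = [rng.gauss(0.0, 1.0) for _ in kinship]
    # Multiplying independent normals by L induces the target covariance.
    return [sum(lower[i][k] * z[k] for k in range(i + 1))
            for i in range(len(kinship))]


# Hypothetical theoretical kinship matrix: father, mother, two full siblings.
KINSHIP = [
    [0.50, 0.00, 0.25, 0.25],
    [0.00, 0.50, 0.25, 0.25],
    [0.25, 0.25, 0.50, 0.25],
    [0.25, 0.25, 0.25, 0.50],
]

rng = random.Random(19)  # fixed seed so the replicates are reproducible
# 200 replicates, matching the number reported in the abstract.
replicates = [simulate_null_trait(KINSHIP, 0.5, rng) for _ in range(200)]
```

Because no genotype enters the model, any association detected between such a trait and a sequence variant is a false positive, which is what makes a phenotype of this kind useful for checking type I error in family data. An empirical kinship matrix estimated from markers could be passed in place of `KINSHIP`, provided the resulting covariance is positive definite.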

