Network-guided interaction mining for the blood pressure phenotype of unrelated individuals in genetic analysis workshop 19

Abstract

Interactions between genes are an important part of the genetic architecture of complex diseases. In this paper, we start from individual genes that the literature associates with type 2 diabetes (referred to as “seed genes”) and build a larger list of genes that share implied or direct networks with these seed genes. The genes in this larger list are known to interact with each other, but whether they interact in ways that influence hypertension is an open question. Using Genetic Analysis Workshop data on individuals with diabetes, for which only case-control labels of hypertension are known, we take a first step toward identifying diabetes-related gene interactions that are associated with hypertension. We use the approach of Lo et al. (Proc Natl Acad Sci U S A 105: 12387-12392, 2008), which creates a score to identify significant pairwise gene associations. We find that the genes GCK and PAX4, previously known to share coexpression and pathway networks but without specific direct interactions, do, in fact, show significant joint interaction effects for hypertension.

Background

Hypertension is a well-studied genetic disease, particularly with respect to the identification of genes marginally associated with it. When using high-throughput data such as genome-wide association studies or sequencing data, we must also consider interactions between genes, which dramatically increases both the number of dimensions to evaluate and the chance of false positives. Dimensionality can be reduced at the outset by relying on literature-based confirmations of biological relations and possible interactions among genes, and by focusing on these sets of genes first. Laboratory work and data analysis have established biological and functional interactions between some of these identified genes. This paper seeks to further the network knowledge of genes that interact to affect hypertension.

Starting from a “seed” set of 15 genes found to be theoretically associated with type 2 diabetes in the literature, we expand the set with genes known to broadly interact with these seed genes (although specific information on how they interact to influence type 2 diabetes is unknown) to create our full gene list. We then systematically explore all significant pairwise associations in the full gene list (potentially adding edges to the network of these genes drawn from the literature). Because the Genetic Analysis Workshop (GAW) data are on individuals with type 2 diabetes but only the blood pressure phenotype is known, we use network information on genes interacting for type 2 diabetes to identify novel gene interactions for hypertension. We believe that, for these individuals, underlying diabetes mechanisms drive variations in hypertension status. We use this study to identify potential associations of blood pressure with diabetes genes in this data set.

Methods

Seed genes from literature

To build the original set of genes theoretically associated with type 2 diabetes, we turned to the Online Mendelian Inheritance in Man (OMIM) [1]. Using type 2 diabetes mellitus as the search term (#125853), we found a list of 15 genes known to be related to type 2 diabetes (Table 1). These were then used in GeneMANIA to retrieve genes connected to them.

Table 1 Seed gene list

Interaction network from literature

We supplied all 15 genes to the online portal of GeneMANIA [2], an online database of connections, including known biological pathways, between genes reported in the literature so far. The seed genes were used to retrieve genes that are connected to them. Each connected gene was scored based on the nature and strength of evidence of all the connection instances it had with all seed genes. We chose the 20 top-scored genes to expand the previous set of 15 seed genes, resulting in a final list of 22 seed and expanded genes. We denoted this list of genes as \( V=\left({v}_1,\dots,\;{v}_k\right),\;k\in \left\{1,\dots, 22\right\} \). We also retrieved the interaction network, denoted as \( E={\left[{e}_{ij}\right]}_{k\times k} \).

Hypertension phenotype

The data set under consideration was the GAW19 [3] data on unrelated individuals with type 2 diabetes. The phenotype available for analysis, however, was hypertension. Subjects were coded as cases or controls following the rules used in the GAW19 family data set, whereby cases were defined as individuals with systolic blood pressure (SBP) >140 mm Hg, diastolic blood pressure (DBP) >90 mm Hg, or who were on antihypertensive medication. Satisfying any one of these three criteria was sufficient to make an individual a case. Controls had SBP ≤140 mm Hg and DBP ≤90 mm Hg and were not on antihypertensive medication.
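The case definition above can be sketched as a small predicate. This is only an illustration of the stated rule; the function and argument names are hypothetical and are not taken from the GAW19 data files.

```python
def is_hypertension_case(sbp: float, dbp: float, on_medication: bool) -> bool:
    """Return True when any one of the three case criteria is met.

    sbp / dbp are in mm Hg; on_medication flags antihypertensive treatment.
    All names here are illustrative assumptions.
    """
    return bool(sbp > 140 or dbp > 90 or on_medication)


# A subject exactly at the thresholds and untreated is a control.
print(is_hypertension_case(140, 90, False))  # False
print(is_hypertension_case(150, 80, False))  # True
```

Because the thresholds are strict inequalities for cases, a subject at exactly 140/90 mm Hg who is not on medication falls into the control group.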

Pairwise network scoring

We use the approach in Lo et al. [4] to score the marginal and joint effects. Because we have 22 genes, we have \( \frac{22\times 21}{2}=231 \) gene pairs.
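The pair count can be checked by enumerating unordered pairs directly. In this sketch the genes are represented by placeholder indices 0 to 21 rather than by their names.

```python
from itertools import combinations

N_GENES = 22  # size of the full gene list

# Every unordered pair (i, j) with i < j, using placeholder gene indices.
gene_pairs = list(combinations(range(N_GENES), 2))

print(len(gene_pairs))  # 231
```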

To measure the joint effects of two single nucleotide polymorphisms (SNPs), we use the measure v:

$$ v={\displaystyle \sum_{s=1}^{3^k}}{\left(\frac{n_{D,s}}{n_D}-\frac{n_{U,s}}{n_U}\right)}^2 $$

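where \( {n}_{D,s} \) and \( {n}_{U,s} \) are counts of cases and controls in each genotype cell (element) \( s \), \( {n}_D \) and \( {n}_U \) are the total numbers of cases and controls in the study, and \( k \) is the number of SNPs evaluated jointly, so that there are \( 3^k \) genotype cells (\( k=1 \) for a marginal effect and \( k=2 \) for a pair of SNPs).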
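A minimal sketch of the \( v \) measure follows, assuming genotypes are coded 0/1/2 with no missing values and that both groups are non-empty. The function name, array layout, and the base-3 cell encoding are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def v_score(genotypes, is_case):
    """Sum over genotype cells of (case frequency - control frequency)^2.

    genotypes: integer array of shape (n_individuals,) or (n_individuals, k),
               coded 0/1/2 (assumed complete, no missing calls).
    is_case:   boolean array of length n_individuals.
    """
    genotypes = np.asarray(genotypes)
    if genotypes.ndim == 1:
        genotypes = genotypes[:, None]  # treat a single SNP as k = 1
    is_case = np.asarray(is_case, dtype=bool)

    k = genotypes.shape[1]
    # Encode each k-SNP genotype combination as one cell index in 0 .. 3^k - 1.
    cell = genotypes @ (3 ** np.arange(k))
    n_cells = 3 ** k

    case_counts = np.bincount(cell[is_case], minlength=n_cells)
    control_counts = np.bincount(cell[~is_case], minlength=n_cells)
    n_cases = is_case.sum()
    n_controls = (~is_case).sum()

    return float(np.sum((case_counts / n_cases - control_counts / n_controls) ** 2))
```

Passing a single column gives a marginal score for one SNP, and passing two columns gives the joint score for a SNP pair over nine genotype cells.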

To measure the amount of interaction between two genes, \( {g}_i \) and \( {g}_j \), we consider every pair of SNPs \( \left({i}_d,\;{j}_e\right) \) formed from one SNP in gene \( i \) and one SNP in gene \( j \), and define the SNP-wise interaction as the ratio of the incremental joint effect to the maximum of the two marginal effects:

$$ r\left({i}_d,\ {j}_e\right)=\frac{v_{i_d,{j}_e}-\left({v}_{i_d}\vee {v}_{j_e}\right)}{{v}_{i_d}\vee {v}_{j_e}} $$

where \( \vee \) represents the maximum of two values, and \( r\left({i}_d,\;{j}_e\right) \) is the relative amount of interactions of two SNPs with respect to their marginal effects. The amount of interactions between two genes \( i \) and \( j \) is defined as the average of all SNP-wise ratios possibly formed from these two genes and is denoted as:

$$ {R}_{ij}=\frac{{\displaystyle {\sum}_{d=1}^{m_i}}{\displaystyle {\sum}_{e=1}^{m_j}}r\left({i}_d,\ {j}_e\right)}{m_i{m}_j} $$

and called the “mean interaction ratio,” or “mean-ratio” or “R-statistic.”
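As a worked illustration with purely hypothetical values (not taken from the GAW19 data), suppose gene \( i \) has one SNP with marginal score \( {v}_{i_1}=0.02 \), and gene \( j \) has two SNPs with \( {v}_{j_1}=0.05 \) and \( {v}_{j_2}=0.01 \). If the joint scores are \( {v}_{i_1,{j}_1}=0.08 \) and \( {v}_{i_1,{j}_2}=0.03 \), then:

$$ r\left({i}_1,\ {j}_1\right)=\frac{0.08-\left(0.02\vee 0.05\right)}{0.02\vee 0.05}=\frac{0.03}{0.05}=0.6,\qquad r\left({i}_1,\ {j}_2\right)=\frac{0.03-\left(0.02\vee 0.01\right)}{0.02\vee 0.01}=\frac{0.01}{0.02}=0.5 $$

$$ {R}_{ij}=\frac{0.6+0.5}{1\times 2}=0.55 $$

A positive ratio indicates that the joint score exceeds the larger of the two marginal scores.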

For each gene pair, we also define the “average maximum marginal \( v \)” or “M-statistic” as:

$$ {M}_{ij}=\frac{{\displaystyle {\sum}_{d=1}^{m_i}}{\displaystyle {\sum}_{e=1}^{m_j}}\left({v}_{i_d}\vee {v}_{j_e}\right)}{m_i{m}_j} $$

From the above steps, we obtain a set of 231 total pairwise interactions, \( \left\{\left({M}_{ij},\;{R}_{ij}\right);1\le i<j\le 22\right\} \), corresponding to all possible gene pairs.
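The two gene-pair summaries can be sketched from precomputed SNP-level scores. The sketch assumes the marginal \( v \) values for each gene and the matrix of joint \( v \) values are already available and that every maximum marginal value is strictly positive; the function name and input layout are hypothetical.

```python
import numpy as np


def gene_pair_stats(v_i, v_j, v_joint):
    """Return (R_ij, M_ij) for one gene pair.

    v_i:     marginal v scores of the m_i SNPs in gene i.
    v_j:     marginal v scores of the m_j SNPs in gene j.
    v_joint: (m_i, m_j) array of joint v scores for every SNP pair.
    Assumes all maximum marginal values are > 0 (no zero-division guard).
    """
    v_i = np.asarray(v_i, dtype=float)
    v_j = np.asarray(v_j, dtype=float)
    v_joint = np.asarray(v_joint, dtype=float)

    # Element (d, e) holds max(v_i[d], v_j[e]).
    max_marginal = np.maximum.outer(v_i, v_j)

    ratios = (v_joint - max_marginal) / max_marginal  # SNP-wise r values
    return float(ratios.mean()), float(max_marginal.mean())


# Hypothetical scores: one SNP in gene i, two SNPs in gene j.
R, M = gene_pair_stats([0.02], [0.05, 0.01], [[0.08, 0.03]])
```

Taking plain means over the \( m_i \times m_j \) array matches the double sums divided by \( m_i m_j \) in the definitions of the R- and M-statistics.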

To establish significance, we applied 1000 permutations of the case-control outcomes to determine the null distributions of the ratio and maximum statistics, against which the observed gene-pair interactions were compared.
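A generic label-permutation scheme of this kind can be sketched as follows. The text does not specify how p-values were computed, so the add-one estimate used here is an assumption, as are the function name, the `statistic` callback, and the fixed seed.

```python
import numpy as np


def permutation_p_value(statistic, is_case, n_perm=1000, seed=0):
    """Permute case-control labels to build a null distribution.

    statistic: callable taking a boolean label array and returning a float
               (for example, a gene-pair R-statistic computed on those labels).
    is_case:   observed boolean case-control labels.
    Returns (observed value, null values, one-sided p-value).
    The add-one p-value estimate is an assumed convention.
    """
    is_case = np.asarray(is_case, dtype=bool)
    rng = np.random.default_rng(seed)

    observed = statistic(is_case)
    # Shuffling labels keeps group sizes fixed while breaking any
    # genotype-phenotype association.
    null = np.array([statistic(rng.permutation(is_case)) for _ in range(n_perm)])

    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, float(p_value)
```

With 1000 permutations the smallest attainable p-value under this convention is 1/1001, so the resolution of the test is bounded by the number of permutations.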

Results

Retrieved pairwise network scores

Scoring all pairwise SNP interactions yielded 41 SNP pairs with statistically significant joint effects (significant \( v \) scores). However, this is based on theoretical results (Table 2 lists the SNPs and their respective joint effect scores). Moreover, these are only SNP-level interactions; to determine whether one gene is significantly associated with another, we average across the SNP–SNP interactions between the two genes. Given the rare variant-heavy nature of the GAW19 data set, marginal and joint association scores were very low, which is not surprising given the rarity of the variants. Even with the rare variant problem, one joint gene interaction, between GCK and PAX4, was found to be significant at the 5% level (95% confidence) after 1000 permutations.

Table 2 Top returned ratio of pairwise and marginal effects

Discussion

The main results are 41 SNP pairs that demonstrate statistically significant joint effects with respect to \( v \) values. Averaging across SNPs within genes and comparing joint effects yields a statistically significant joint effect between two genes, GCK and PAX4, relative to the permuted null distribution. We can be confident that this result is not driven by overly large individual effects of GCK or PAX4, because the pairwise interaction ratio statistics are computed relative to the maximum of the marginal effects.

We note that the number of SNPs per gene varies, ranging from 1 to 124. On average there are roughly 3000 bp between two consecutive SNPs, which means the largest of our genes corresponds to more than 370,000 bp. We recognize the possibility of linkage disequilibrium between SNPs located close to one another. We take advantage of this dependence and integrate neighboring information by treating the gene, rather than each SNP, as the basic unit (which motivates our gene-based approach). Thus, when we discuss the effect of a certain gene pair, we mean the average of all pairwise interactions of SNP pairs formed from the two genes.

Conclusions

We find a significant interaction effect of the GCK and PAX4 genes on hypertension in the GAW19 data. While GCK and PAX4 have established coexpression and pathway linkages via other genes, no known interaction seems to have been previously established between the two genes themselves without the mediation of other genes. In addition, GCK and PAX4 are not known to specifically interact toward hypertension. As such we provide direct evidence of an interesting joint effect of these two genes in the context of hypertension.

References

  1. Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD). http://omim.org/. Accessed Aug 1, 2014

  2. Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, et al. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010;38(Web Server issue):W214–20.

  3. Blangero J, Teslovich TM, Sim X, Almeida MA, Jun G, Dyer TD, Johnson M, Peralta JM, Manning A, Wood AR, Fuchsberger C, Kent Jr JW, et al. Omics-squared: human genomic, transcriptomic and phenotypic data for genetic analysis workshop 19. BMC Proc. 2015;9 Suppl 8:S2.

  4. Lo S, Chernoff H, Cong L, Ding Y, Zheng T. Discovering interactions among BRCA1 and other candidate genes associated with sporadic breast cancer. Proc Natl Acad Sci U S A. 2008;105(34):12387–92.

Acknowledgements

The authors thank the Robert Wood Johnson Foundation for funding of this project. The hypertension for unrelated individuals data were provided by the GAW. Special thanks to Dr. Chien-Hsun Huang for analytical support.

Declarations

This article has been published as part of BMC Proceedings Volume 10 Supplement 7, 2016: Genetic Analysis Workshop 19: Sequence, Blood Pressure and Expression Data. Summary articles. The full contents of the supplement are available online at http://bmcproc.biomedcentral.com/articles/supplements/volume-10-supplement-7. Publication of the proceedings of Genetic Analysis Workshop 19 was supported by National Institutes of Health grant R01 GM031575.

Authors’ contributions

AL and TZ designed the overall study. AL conducted statistical analyses and drafted the manuscript. All authors provided feedback on and read the manuscript. AL and TZ contributed equally to this work. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Author information

Corresponding author

Correspondence to Adeline Lo.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Lo, A., Agne, M., Auerbach, J. et al. Network-guided interaction mining for the blood pressure phenotype of unrelated individuals in genetic analysis workshop 19. BMC Proc 10 (Suppl 7), 13 (2016). https://doi.org/10.1186/s12919-016-0052-7

  • DOI: https://doi.org/10.1186/s12919-016-0052-7