Volume 3 Supplement 7
Armitage's trend test for genome-wide association analysis: one-sided or two-sided?
© Fang et al; licensee BioMed Central Ltd. 2009
Published: 15 December 2009
The importance of considering confounding due to population stratification in genome-wide association analysis using case-control designs has been a source of debate. Armitage's trend test, together with some other methods developed from it, can correct for population stratification to some extent. However, it remains unclear whether the one-sided or the two-sided alternative hypothesis is appropriate or, to put it another way, whether examining both the one-sided and the two-sided alternative hypotheses can give more information. The dataset for Problem 1 of Genetic Analysis Workshop 16 provides us with a chance to address this question. Because it is a part of a combined sample from the North American Rheumatoid Arthritis Consortium (NARAC) and the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA), the results from the combined sample can be used as references. To this end, the last 10,000 single-nucleotide polymorphisms (SNPs) on chromosome 9, which contain the common genetic variant at the TRAF1-C5 locus, were examined by conducting Armitage's trend tests. Examining the two-sided alternative hypothesis shows that SNPs rs12380341 (p = 9.7 × 10^-11) and rs872863 (p = 1.7 × 10^-15), along with six SNPs across the TRAF1-C5 locus, rs1953126, rs10985073, rs881375, rs3761847, rs10760130, and rs2900180 (p ~ 1 × 10^-7), are significantly associated with anti-cyclic citrullinated peptide-positive rheumatoid arthritis. In contrast, examining the one-sided alternative hypothesis that the minor allele is positively associated with the disease shows that only those six SNPs across the TRAF1-C5 locus are significantly associated with the disease (p ~ 1 × 10^-8), which is consistent with the results from the combined sample of the NARAC and the EIRA.
The Genetic Analysis Workshop 16 (GAW16) rheumatoid arthritis (RA) dataset is the initial batch of genome-wide association study (GWAS) data for the North American Rheumatoid Arthritis Consortium (NARAC) cases (N1 = 868) and controls (N0 = 1194) after removing duplicated and contaminated samples [1]. The high-throughput genotyping technology (~550,000 single-nucleotide polymorphisms [SNPs]) used for the NARAC data makes interpreting this GWAS a challenge.
One disadvantage of case-control GWAS is that they are prone to a number of biases, including population stratification [2]. The importance of considering confounding due to population stratification in GWAS using case-control designs [3, 4] has been a source of debate. Armitage's trend test can correct for population stratification to some extent [5–7]; some other methods based on it have also been developed, such as the genomic control approach [8, 9]. However, it is still unclear whether the one-sided or the two-sided alternative hypothesis is appropriate or, to put it another way, whether examining both the one-sided and the two-sided alternative hypotheses can give more information. The dataset for Problem 1 of GAW16 provides us with a chance to address this question. Because it is a part of a combined sample from the NARAC and the Epidemiological Investigation of Rheumatoid Arthritis (EIRA), the results from the combined sample can be used as references.
To this end, the last 10,000 SNPs on chromosome 9, which contain the common genetic variant at the TRAF1-C5 locus, were examined by conducting Armitage's trend tests. Two alternative hypotheses were considered: the two-sided alternative hypothesis that the genotypes at a locus are associated with the disease, and the one-sided alternative hypothesis that the minor allele at a locus is positively associated with the disease. Three types of scores (co-dominant, dominant, and recessive) were chosen to construct the Armitage's trend tests.
Table 1. Contingency table at any SNP (M is the major allele and m is the minor allele)
Under the null hypothesis, the trend test statistic approximately follows a chi-square distribution with one degree of freedom. This test statistic is suitable for the two-sided alternative hypothesis that the genotypes at a SNP are associated with the disease of interest. As discussed in Armitage [5], the choice of scoring system does not affect the validity of the test, but it does affect its power. There are three common choices of scoring system: 1) co-dominant score: x0 = 0, x1 = 1, and x2 = 2; 2) dominant score: x0 = 0, x1 = 1, and x2 = 1; 3) recessive score: x0 = 0, x1 = 0, and x2 = 1. Here, the scoring systems are named with respect to the minor allele "m".
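The two-sided test with the three scoring systems can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual form of Armitage's trend statistic with variance divisor N, and the genotype counts are hypothetical values chosen only so that they sum to the NARAC sample sizes (868 cases, 1194 controls). The function and variable names are illustrative.

```python
import math

# Scores for genotypes (MM, Mm, mm), named with respect to the minor allele m.
SCORES = {
    "co-dominant": (0, 1, 2),
    "dominant": (0, 1, 1),
    "recessive": (0, 0, 1),
}


def trend_chi2(cases, controls, scores):
    """Armitage trend chi-square (1 df) from genotype counts (MM, Mm, mm)."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    mean_x = sum(x * n for x, n in zip(scores, totals)) / N
    var_x = sum(x * x * n for x, n in zip(scores, totals)) / N - mean_x ** 2
    # Score sum among cases minus its expectation under the null hypothesis.
    u = sum(x * r for x, r in zip(scores, cases)) - R * mean_x
    var_u = R * S / N * var_x
    return u * u / var_u


def chi2_1df_pvalue(t):
    """Upper-tail probability of a chi-square(1) variable: P(|Z| > sqrt(t))."""
    return math.erfc(math.sqrt(t / 2.0))


# Hypothetical genotype counts (MM, Mm, mm); not taken from the GAW16 data.
cases = (400, 380, 88)
controls = (640, 470, 84)

for name, scores in SCORES.items():
    t = trend_chi2(cases, controls, scores)
    print(f"{name:12s} chi2 = {t:.3f}  two-sided p = {chi2_1df_pvalue(t):.3g}")
```

With the dominant or recessive score the genotypes collapse into two groups, so the statistic reduces to the Pearson chi-square of the corresponding 2 × 2 table; the co-dominant score uses all three genotype classes.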
From the rationale of genetic association analysis (see, for example, Risch and Merikangas [10]), it is more informative to look at two one-sided alternative hypotheses: i) the alternative that the minor allele is positively associated with the disease and ii) the alternative that the major allele is positively associated with the disease. Furthermore, because the disease of interest is rare, it is more reasonable to concentrate on the first alternative, although in practice it is better to consider both alternatives if no prior information is available on which allele is positively associated with the disease. Another reason is that concentrating on a single one-sided alternative can reduce the false-positive rate.
Under the null hypothesis, the one-sided test statistic is approximately distributed as N(0,1). The same three scoring systems can also be used here. It is shown in Knapp [11] that if the co-dominant scoring system is chosen, then the test statistic equals Z/√(1 + F), where F is Wright's coefficient of inbreeding and Z is the test statistic that simply compares the frequencies of the minor allele "m" in the case and control groups. In this way, the value of F automatically corrects for population stratification to some extent.
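The relationship attributed to Knapp can be checked numerically. The sketch below assumes that the relationship takes the form (trend statistic) = Z/√(1 + F), that F is estimated from the pooled case-control sample as one minus the ratio of observed to Hardy-Weinberg-expected heterozygotes, and that the trend statistic uses variance divisor N; these are assumptions of this illustration rather than details stated in the text. The genotype counts are hypothetical.

```python
import math


def trend_z(cases, controls, scores=(0, 1, 2)):
    """Signed Armitage trend statistic; positive when m is enriched in cases."""
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    mean_x = sum(x * n for x, n in zip(scores, totals)) / N
    var_x = sum(x * x * n for x, n in zip(scores, totals)) / N - mean_x ** 2
    u = sum(x * r for x, r in zip(scores, cases)) - R * mean_x
    return u / math.sqrt(R * S / N * var_x)


def allele_z(cases, controls):
    """Z comparing minor-allele frequencies, treating the 2N alleles as independent."""
    R, S = sum(cases), sum(controls)
    a = cases[1] + 2 * cases[2]        # minor alleles among cases
    b = controls[1] + 2 * controls[2]  # minor alleles among controls
    p = (a + b) / (2 * (R + S))
    se = math.sqrt(p * (1 - p) * (1 / (2 * R) + 1 / (2 * S)))
    return (a / (2 * R) - b / (2 * S)) / se


def inbreeding_f(cases, controls):
    """Pooled-sample estimate of F: 1 - observed / expected heterozygotes."""
    N = sum(cases) + sum(controls)
    n1 = cases[1] + controls[1]
    n2 = cases[2] + controls[2]
    p = (n1 + 2 * n2) / (2 * N)
    return 1 - n1 / (2 * N * p * (1 - p))


def upper_tail_p(z):
    """One-sided p-value for the alternative that m is positively associated."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical genotype counts (MM, Mm, mm); not taken from the GAW16 data.
cases = (400, 380, 88)
controls = (640, 470, 84)

z_trend = trend_z(cases, controls)
z_allele = allele_z(cases, controls)
f_hat = inbreeding_f(cases, controls)
print(z_trend, z_allele / math.sqrt(1 + f_hat), f_hat, upper_tail_p(z_trend))
```

Under these assumptions the first two printed values agree up to floating-point error, and when F is close to zero the trend statistic and the allele-based Z are nearly identical.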
For simplicity of interpretation, we only consider the last 10,000 SNPs on chromosome 9, which contain the common genetic variant at the TRAF1-C5 locus. The same analysis can be extended to the whole genome of approximately 550,000 SNPs.
Table 2. LOD values for the two-sided alternative
In Table 2, the six SNPs marked with asterisks have small F (<0.03), which explains why their values in the third column, which correct for population stratification, are almost the same as Z² in the second column. Also, for these six SNPs, the statistic under the co-dominant score is a bit more significant than those under the dominant and recessive scores, and the latter two are close to each other, which suggests that these SNPs are very likely co-dominant. For the other seven SNPs, the statistic under the recessive score is a bit more significant than that under the co-dominant score, but the statistic under the dominant score is not significant at all. This suggests that these SNPs are very likely recessive.
Table 3. LOD values for the one-sided alternative of the minor allele
The question of whether the two-sided alternative or the one-sided alternatives should be considered is intractable, but this manuscript attempts to raise the question and address it to some extent. Table 3 shows that if we concentrate on the one-sided alternative that the minor allele is positively associated with the disease, we get exactly the same results as Plenge et al. [1]. For rare diseases, we have reason to believe that the alleles positively associated with them have low frequencies in the general population. Based on this belief (or alternative hypothesis), it seems that the SNPs without asterisks are false positives under the two-sided alternative.
However, if we are unwilling to assume that the minor allele is positively associated with the disease and do not want to miss any SNPs related to the disease, it is better to consider the two-sided alternative.
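The trade-off between the alternatives can be made concrete with a small sketch. It assumes a signed, approximately N(0,1) trend statistic that is positive when the minor allele is more frequent in cases; the z values used are hypothetical and are not results from this study.

```python
import math


def p_values(z):
    """One-sided and two-sided p-values for a signed N(0,1) test statistic."""
    minor_risk = 0.5 * math.erfc(z / math.sqrt(2))   # alternative i): minor allele raises risk
    major_risk = 0.5 * math.erfc(-z / math.sqrt(2))  # alternative ii): major allele raises risk
    two_sided = 2 * min(minor_risk, major_risk)
    return {"minor_risk": minor_risk, "major_risk": major_risk, "two_sided": two_sided}


# Hypothetical statistics: minor allele enriched in cases, then in controls.
for z in (5.5, -5.5):
    print(z, p_values(z))
```

A SNP whose minor allele is enriched in controls has the same two-sided p-value as its mirror image, but its p-value under alternative i) is close to 1. Such a SNP is reported under the two-sided alternative and not under the one-sided alternative that the minor allele is positively associated with the disease.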
More information can be gained from GWAS by using multiple scoring systems in Armitage's trend tests and by examining both the one-sided and the two-sided alternative hypotheses.
List of abbreviations used
EIRA: Epidemiological Investigation of Rheumatoid Arthritis
GAW16: Genetic Analysis Workshop 16
NARAC: North American Rheumatoid Arthritis Consortium
The Genetic Analysis Workshops are supported by NIH grant R01 GM031575 from the National Institute of General Medical Sciences.
This article has been published as part of BMC Proceedings Volume 3 Supplement 7, 2009: Genetic Analysis Workshop 16. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/3?issue=S7.
1. Plenge R, Seielstad M, Padyukov L, et al: TRAF1-C5 as a risk locus for rheumatoid arthritis-a genomewide study. N Engl J Med. 2007, 357: 1199-1209. 10.1056/NEJMoa073491.
2. Pearson TA, Manolio TA: How to interpret a genome-wide association study. JAMA. 2008, 299: 1335-1345. 10.1001/jama.299.11.1335.
3. Thomas DC, Witte JS: Point: population stratification: a problem for case-control studies of candidate-gene associations?. Cancer Epidemiol Biomarkers Prev. 2002, 11: 505-512.
4. Wacholder S, Rothman N, Caporaso N: Counterpoint: bias from population stratification is not a major threat to the validity of conclusions from epidemiological studies of common polymorphisms and cancer. Cancer Epidemiol Biomarkers Prev. 2002, 11: 513-520.
5. Armitage P: Tests for linear trends in proportions and frequencies. Biometrics. 1955, 11: 375-386. 10.2307/3001775.
6. Sasieni PD: From genotypes to genes: doubling the sample size. Biometrics. 1997, 53: 1253-1261. 10.2307/2533494.
7. Schaid DJ, Jacobsen SJ: Biased tests of association: comparisons of allele frequencies when departing from Hardy-Weinberg proportions. Am J Epidemiol. 1999, 149: 706-711.
8. Devlin B, Roeder K: Genomic control for association studies. Biometrics. 1999, 55: 997-1004. 10.1111/j.0006-341X.1999.00997.x.
9. Reich D, Goldstein D: Detecting association in a case-control study while correcting for population stratification. Genet Epidemiol. 2001, 20: 4-16. 10.1002/1098-2272(200101)20:1<4::AID-GEPI2>3.0.CO;2-T.
10. Risch N, Merikangas K: The future of genetic studies of complex human disease. Science. 1996, 273: 1516-1517. 10.1126/science.273.5281.1516.
11. Knapp M: Re: "Biased tests of association: comparisons of allele frequencies when departing from Hardy-Weinberg proportions". Am J Epidemiol. 2001, 154: 287-288. 10.1093/aje/154.3.287.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.