Rheumatoid arthritis is a complex disease with a moderately strong genetic component. The recurrence risk ratio for siblings is typically estimated at around 6 in Caucasians, but estimates span a broad range, primarily because the prevalence in the population is not well characterized. The prevalence also varies among populations, ranging from around 0.8% in Caucasians to 10% in some Native American groups. Females are at higher risk than males, with a female-to-male ratio of about 3 to 1. The mean age of disease onset is in the fifth decade, with considerable variability in age at presentation, including occasional presentation in the teenage years.
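The dependence of the sibling recurrence risk ratio on the assumed prevalence can be illustrated with a minimal sketch. It assumes the usual definition of the ratio as the risk to siblings of an affected individual divided by the population prevalence. The sibling risk of 4.8% and the alternative prevalence values are hypothetical numbers, chosen only so that a prevalence of 0.8% gives a ratio of 6; they are not estimates from any study.

```python
def sibling_recurrence_risk_ratio(sibling_risk, prevalence):
    """Return lambda_s = (risk to siblings of an affected person) / prevalence."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0 <= sibling_risk <= 1:
        raise ValueError("sibling_risk must be in [0, 1]")
    return sibling_risk / prevalence


# Hypothetical sibling risk, held fixed while the assumed prevalence varies.
SIBLING_RISK = 0.048

for prevalence in (0.005, 0.008, 0.010):
    ratio = sibling_recurrence_risk_ratio(SIBLING_RISK, prevalence)
    print(f"prevalence={prevalence:.3f}  lambda_s={ratio:.1f}")
```

With these illustrative inputs, moving the assumed prevalence between 0.5% and 1.0% moves the ratio between roughly 9.6 and 4.8, which is why an imprecise prevalence translates into a broad range of reported values.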
The HLA region on 6p21 has been implicated by numerous studies, and there is consistent evidence that DR alleles contribute to disease risk. The 'shared epitope' hypothesis was proposed by Gregersen et al. to explain the organization of risk for rheumatoid arthritis from DR alleles. According to this hypothesis, individuals who share a QK/RRAA motif in positions 70 to 74 of the DR molecule show an increased risk for disease. The alleles that confer increased risk for rheumatoid arthritis include DRB1*0101, 0102, 0104, 0105, 0401, 0404, 0405, 0408, 0409, 1001, 1402, and 1406, with highest risk alleles in bold. This model did not fully explain risk according to DR type, and newer models using data from positions 70 to 74 have been developed [4, 5]. DR alleles also have a complex effect on risk for rheumatoid arthritis, but the presence of two risk alleles generally increases risk substantially more than heterozygosity for a risk and a nonrisk allele. Aside from the main effects of DR, there is also evidence for interactions with other HLA loci or haplotypic effects, including the class I region and the central MHC. Certain DR alleles, notably DR3 [7, 8], can occur on a background of extended linkage disequilibrium in which the extended haplotype confers increased risk, even though DR3 alleles alone do not.
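The shared-epitope classification can be sketched as a simple motif test. The sketch below takes a literal reading of the QK/RRAA motif as written here: Q at position 70, K or R at position 71, and R, A, A at positions 72 to 74. This reading is an assumption, and published definitions may be broader. The function names and the five-residue input strings are hypothetical and are not tied to particular DRB1 alleles.

```python
import re

# Literal reading of "QK/RRAA": Q at 70, K or R at 71, then R, A, A at 72-74.
SHARED_EPITOPE = re.compile(r"Q[KR]RAA")


def has_shared_epitope(residues_70_to_74):
    """Return True if the residues at positions 70-74 are QKRAA or QRRAA."""
    return SHARED_EPITOPE.fullmatch(residues_70_to_74.upper()) is not None


def risk_allele_count(first_allele, second_allele):
    """Count how many of a person's two DR alleles carry the motif (0, 1, or 2)."""
    return int(has_shared_epitope(first_allele)) + int(has_shared_epitope(second_allele))


# Hypothetical genotypes, each given as the residues at positions 70-74.
for genotype in (("QKRAA", "QRRAA"), ("QKRAA", "DERAA"), ("DERAA", "DERAA")):
    print(genotype, "risk alleles:", risk_allele_count(*genotype))
```

Counting motif-carrying alleles per individual, rather than recording presence or absence alone, keeps the distinction between one and two risk alleles that the text describes.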
Two quantitative phenotypes used to identify individuals affected by rheumatoid arthritis are anti-cyclic citrullinated peptide (anti-CCP) and rheumatoid factor IgM autoantibodies. The heritability of these measures is difficult to estimate from the selected sib pairs we are studying. After proband correction, the heritability estimates are 11% and 30%, while before correction the heritabilities are 15% and 67%. Specific autoantibodies are noted to co-occur with rheumatoid arthritis. Rheumatoid factor IgM has been correlated with erosive arthritic disease. However, anti-CCP is more specific for the disease and is a better predictor of erosive outcome. Elevations of anti-CCP have been noted to predict increased risk for development of rheumatoid arthritis. The shared-epitope alleles are strongly associated with the presence of anti-CCP antibodies, and there is evidence that this effect is modulated by HLA-DR3.
Alleles at the PTPN22 locus have been shown to confer an increased risk for rheumatoid arthritis. At least two alleles of PTPN22 have been implicated in increased risk; the R620W allele at rs2476601 (hCV16021387) confers a 1.7- to 1.9-fold increased risk to heterozygotes and higher risks to homozygous carriers. These findings have been further confirmed by analysis of transmission of PTPN22 alleles to affected offspring in families. Increased risk has also been noted for either hCV8689108 or hCV25762283, with some indeterminacy because of linkage disequilibrium among these markers (and others in the region).
The CTLA4 locus on chromosome 2q33 has been associated with mildly increased risk for rheumatoid arthritis. In addition, alleles at loci in the TRAF1/C5 region are associated with rheumatoid arthritis risk. A targeted association study showed that alleles of STAT4 are associated with rheumatoid arthritis risk, but these associations are too weak to reach genome-wide significance in the data set that we have provided here. Similarly, a locus on chromosome 6q (TNFAIP3) that is associated with rheumatoid arthritis risk has relatively weak effects. Additional loci that have been implicated in Caucasian rheumatoid arthritis populations include CD40 (20q13), PRKCQ (10p15), and CCL21 (9p13), among others [17, 18].
Aside from identified genetic factors and sex, few environmental cofactors have been identified as affecting risk for rheumatoid arthritis. However, current smoking confers about a two-fold increased risk. Klareskog et al. showed that the risk from smoking for rheumatoid arthritis is particularly high among individuals who have a shared-epitope allele and who also have elevated levels of anti-CCP antibodies. This rather complex interaction appears to reflect increased citrullination of peptides among smokers and presentation of the citrullinated peptides by shared-epitope alleles.
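One common way to quantify whether two exposures combine to give more risk than expected is to compare the relative risk in the doubly exposed group with the relative risks for each exposure alone. The sketch below computes two standard summary measures, the relative excess risk due to interaction (additive scale) and the ratio of observed to expected joint relative risk (multiplicative scale). It is a simplified two-factor illustration for smoking and shared-epitope carriage and ignores anti-CCP status. The function name and the relative risks passed to it are hypothetical and are not results from Klareskog et al. or from the GAW16 data.

```python
def interaction_measures(rr_smoking_only, rr_allele_only, rr_both):
    """Summarize interaction between two binary exposures.

    All relative risks are taken relative to the unexposed group
    (nonsmokers without a shared-epitope allele).
    Returns (reri, multiplicative_ratio):
      reri  > 0 suggests more than additive joint effects,
      ratio > 1 suggests more than multiplicative joint effects.
    """
    if min(rr_smoking_only, rr_allele_only, rr_both) <= 0:
        raise ValueError("relative risks must be positive")
    reri = rr_both - rr_smoking_only - rr_allele_only + 1.0
    multiplicative_ratio = rr_both / (rr_smoking_only * rr_allele_only)
    return reri, multiplicative_ratio


# Hypothetical relative risks, for illustration only.
reri, ratio = interaction_measures(rr_smoking_only=2.0, rr_allele_only=3.0, rr_both=12.0)
print(f"RERI = {reri:.1f}, multiplicative ratio = {ratio:.1f}")
```

Reporting both scales is useful because a joint effect can exceed additivity without exceeding multiplicativity, so the choice of scale affects whether an interaction is declared.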
The data set submitted for the Genetic Analysis Workshop 16 (GAW16) was designed primarily to allow the identification, using association methods, of genetic factors that predispose to rheumatoid arthritis. Given the previously identified effects of smoking on rheumatoid arthritis risk and the differences in risk according to sex, there is also interest in identifying gene-environment and gene-gene combinations that confer particularly high risks for rheumatoid arthritis.