Table 1 Percentage of CVs and UNCVs that were in the top-ranked 10% of predictors (RF and ULR) or that were included in the final model (LR) in at least 5%, 10%, and 20% of the 200 simulated replicates

From: Performance of random forests and logic regression methods using mini-exome sequence data

| Data set | | Total number (a) | Random forest, % variants ranked in the top 10% of predictors | | | Univariate linear regression, % variants ranked in the most significant 10% | | | Logic regression, % variants included in final model | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | In ≥5 PoR | In ≥10 PoR | In ≥20 PoR | In ≥5 PoR | In ≥10 PoR | In ≥20 PoR | In ≥5 PoR | In ≥10 PoR | In ≥20 PoR |
| Uncollapsed | CVs | 36 | 100 | 72 | 33 | 94 | 72 | 39 | | | |
| | UNCVs | 12,485 | 76 | 46 | 8 | 90 | 25 | 3 | | | |
| Gene-collapsed | CVs | 15 | 100 | 100 | 50 | 87 | 73 | 53 | 91 (b) | 64 | 45 |
| | UNCVs | 6,642 | 98 | 81 | 24 | 90 | 26 | 3 | 63 | 32 | 9 |
| Pathway-collapsed | CVs | 167 | 99 | 95 | 72 | 93 | 50 | 26 | 1 (c) | 0 | 0 |
| | UNCVs | 2,249 | 91 | 68 | 38 | 88 | 32 | 8 | 0 | 0 | 0 |

  1. UNCVs are noncausal variants that were not used in the simulation to determine Q2, were not in one of the causal genes, and did not display correlation of at least 0.6 with any CVs.
  2. a Total number of nonmonomorphic variants in Asians. For LR we excluded 4 common CVs and 4,725 common NCVs.
  3. b Final LR model: 3 trees with 10 leaves.
  4. c Final LR model: 2 trees with 20 leaves.
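
As an illustration of how a table entry of this kind could be computed, the following minimal Python sketch flags the top-ranked 10% of variants in each replicate and then reports the percentage of variants flagged in at least a given percentage of replicates. It is not the authors' code: the function names `flag_top_percent` and `percent_recurrent`, the round-down rule for the size of the top 10%, and the handling of ties are all assumptions made for this example, and the scores are toy values.

```python
def flag_top_percent(scores, top_pct=10):
    """Flag the highest-scoring top_pct percent of variants in one replicate.

    Assumption: the cutoff count is rounded down (at least one variant is
    kept) and ties are broken by position; the source does not specify either.
    """
    n = len(scores)
    k = max(1, n * top_pct // 100)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:k])
    return [i in top for i in range(n)]


def percent_recurrent(flags_by_replicate, min_pct_of_replicates):
    """Percentage of variants flagged in at least the given % of replicates."""
    n_reps = len(flags_by_replicate)
    n_vars = len(flags_by_replicate[0])
    recurrent = 0
    for j in range(n_vars):
        hits = sum(1 for flags in flags_by_replicate if flags[j])
        # Integer comparison avoids floating-point error at the threshold.
        if 100 * hits >= min_pct_of_replicates * n_reps:
            recurrent += 1
    return 100.0 * recurrent / n_vars


# Toy importance scores: 4 replicates x 10 variants (hypothetical values).
toy_scores = [
    [0.9, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.0],
    [0.9, 0.2, 0.1, 0.3, 0.5, 0.4, 0.6, 0.8, 0.7, 0.0],
    [0.9, 0.3, 0.2, 0.1, 0.4, 0.6, 0.5, 0.7, 0.8, 0.0],
    [0.1, 0.2, 0.3, 0.9, 0.4, 0.5, 0.6, 0.7, 0.8, 0.0],
]
toy_flags = [flag_top_percent(rep) for rep in toy_scores]
for threshold in (5, 10, 20):
    print(threshold, percent_recurrent(toy_flags, threshold))
```

Applied separately to the CVs and the UNCVs of a data set, with 200 replicates and thresholds of 5, 10, and 20, a calculation of this shape would yield one row of the random forest or univariate linear regression columns.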