Table 2 Comparison of the prediction rule between the empirical Bayes and other classifiers

From: Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data

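| Feature | Empirical Bayes method: Genes | Empirical Bayes method: #SNP | Empirical Bayes method: MAF | Random forest classifier: Genes | Random forest classifier: #SNP | Random forest classifier: MAF | Logistic regression: Genes | Logistic regression: #SNP | Logistic regression: MAF |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Age | | | Age | | | Age | | |
| 2 | Smoke | | | Smoke | | | Smoke | | |
| 3 | ATP11A | 1 | 0.29 | FLT1 | 25 | <0.01 | SUSD2 | 36 | <0.01 |
| | | | | | 7 | 0.01–0.05 | | 6 | 0.01–0.05 |
| | | | | | 3 | ≥0.05 | | 3 | ≥0.05 |
| 4 | FLT1 | 25 | <0.01 | SUSD2 | 36 | <0.01 | ATP11A | 1 | 0.29 |
| | | 7 | 0.01–0.05 | | 6 | 0.01–0.05 | | | |
| | | 3 | ≥0.05 | | 3 | ≥0.05 | | | |
| 5 | SUSD2 | 36 | | SHD | 10 | <0.01 | BUD13 | 1 | 0.11 |
| | | 6 | | | 1 | 0.01–0.05 | | | |
| | | 3 | | | 2 | ≥0.05 | | | |
| 6 | BUD13 | 1 | 0.11 | RIPK3 | 17 | <0.01 | RIPK3 | 17 | <0.01 |
| | | | | | 2 | 0.01–0.05 | | 2 | 0.01–0.05 |
| | | | | | 2 | ≥0.05 | | 2 | ≥0.05 |
| 7 | RIPK3 | 17 | <0.01 | ADAMTS4 | 23 | <0.01 | FLT1 | 25 | <0.01 |
| | | 2 | 0.01–0.05 | | 4 | 0.01–0.05 | | 7 | 0.01–0.05 |
| | | 2 | ≥0.05 | | 3 | ≥0.05 | | 3 | ≥0.05 |
| 8 | C10ORF107 | 1 | 0.13 | CECR1 | 8 | <0.01 | MAP3K12 | 14 | <0.01 |
| | | | | | | 0.01–0.05 | | 3 | 0.01–0.05 |
| | | | | | 4 | ≥0.05 | | | ≥0.05 |
| 9 | ADAMTS4 | 33 | <0.01 | GOLGA1 | 1 | <0.01 | ADAMTS4 | 33 | <0.01 |
| | | 4 | 0.01–0.05 | | 1 | 0.01–0.05 | | 4 | 0.01–0.05 |
| | | 3 | ≥0.05 | | 1 | ≥0.05 | | 3 | ≥0.05 |
| 10 | MAP3K12 | 14 | <0.01 | C14orf108 | 16 | <0.01 | C10ORF107 | 1 | 0.13 |
| | | 3 | 0.01–0.05 | | 1 | 0.01–0.05 | | | |
| | | 2 | ≥0.05 | | | | | | |
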
  1. The top 10 most important features from the model incorporating genes and environmental variables, compared between our proposed method (empirical Bayes) and the other classifiers (random forest and logistic regression). #SNP, number of SNPs within a specific gene. MAF shows three intervals of minor allele frequency: MAF < 0.01, 0.01 ≤ MAF < 0.05, and MAF ≥ 0.05. The boldfaced genes are real causal features that are selected simultaneously by the three models; for example, FLT1 is selected by all three classifiers.
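
As a rough illustration of the overlap described in the note above, the following Python sketch intersects the gene lists of the three classifiers. The lists are transcribed from Table 2 (ranks 3 to 10; ranks 1 and 2 are Age and Smoke in every model), and the helper name `shared_genes` is hypothetical. The sketch only reports which genes appear in all three top-10 lists; it does not establish which of them are real causal features, which the original table marks in boldface.

```python
# Gene lists transcribed from Table 2, ranks 3-10 (ranks 1-2 are Age and Smoke).
empirical_bayes = ["ATP11A", "FLT1", "SUSD2", "BUD13", "RIPK3",
                   "C10ORF107", "ADAMTS4", "MAP3K12"]
random_forest = ["FLT1", "SUSD2", "SHD", "RIPK3", "ADAMTS4",
                 "CECR1", "GOLGA1", "C14orf108"]
logistic_regression = ["SUSD2", "ATP11A", "BUD13", "RIPK3", "FLT1",
                       "MAP3K12", "ADAMTS4", "C10ORF107"]


def shared_genes(*rankings):
    """Return the genes present in every ranking, in the order of the first one.

    Hypothetical helper for illustration; not part of the original analysis.
    """
    common = set(rankings[0]).intersection(*rankings[1:])
    return [gene for gene in rankings[0] if gene in common]


shared = shared_genes(empirical_bayes, random_forest, logistic_regression)
print(shared)  # ['FLT1', 'SUSD2', 'RIPK3', 'ADAMTS4']
```

FLT1 appears in the result, in line with the example given in the note; whether the other shared genes are causal has to be read from the boldface in the original table.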