
Table 1. Regression analysis summarizing the association between model parameters and model performance across 315 different situations

From: Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data

 

| Model parameter | LR \( \widehat{\beta}(SE) \) | Radial SVM \( \widehat{\beta}(SE) \) | Linear SVM \( \widehat{\beta}(SE) \) |
| --- | --- | --- | --- |
| Gene expression noise (k) | −3.5 × 10⁻² (1.2 × 10⁻²)** | −6.4 × 10⁻² (2.0 × 10⁻²)** | −3.3 × 10⁻² (1.7 × 10⁻²) |
| Number of collapsed phenotypes (m) | −4 × 10⁻⁵ (1.1 × 10⁻⁵)*** | −5.49 × 10⁻⁶ (1.7 × 10⁻⁵) | −4.4 × 10⁻⁵ (1.5 × 10⁻⁵)** |
| Number of causal genes | −7.5 × 10⁻⁴ (1.7 × 10⁻⁴)*** | −8.2 × 10⁻⁴ (2.7 × 10⁻⁴)** | −1.6 × 10⁻⁴ (2.3 × 10⁻⁴) |
| Number of random genes | −1.7 × 10⁻³ (3.6 × 10⁻⁵)*** | −9.8 × 10⁻⁴ (5.6 × 10⁻⁵)*** | −1.0 × 10⁻³ (4.8 × 10⁻⁵)*** |
| Model r² | 88.4 % | 50.7 % | 62.5 % |

  1. \( \widehat{\beta} \), the estimated coefficient in the regression model; SE, the estimated standard error of the coefficient
  2. A separate regression model was fit for each of the 3 methods, with AUC as the outcome and the 4 model parameters as predictors
  3. Statistical significance of the estimated regression coefficients is indicated by asterisks (***p < 0.001, **p < 0.01)
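
The asterisks can be related to the reported coefficients and standard errors with a short calculation. The sketch below is illustrative only: it assumes a two-sided Wald test with a normal approximation (z = β̂ / SE), which the table does not confirm as the test actually used. The names `TABLE`, `two_sided_p`, `stars`, and `significance_table` are hypothetical helpers, and the numbers are the rounded values from Table 1.

```python
import math

# (coefficient, standard error) pairs copied from Table 1 (rounded values).
TABLE = {
    "LR": {
        "Gene expression noise (k)": (-3.5e-2, 1.2e-2),
        "Number of collapsed phenotypes (m)": (-4e-5, 1.1e-5),
        "Number of causal genes": (-7.5e-4, 1.7e-4),
        "Number of random genes": (-1.7e-3, 3.6e-5),
    },
    "Radial SVM": {
        "Gene expression noise (k)": (-6.4e-2, 2.0e-2),
        "Number of collapsed phenotypes (m)": (-5.49e-6, 1.7e-5),
        "Number of causal genes": (-8.2e-4, 2.7e-4),
        "Number of random genes": (-9.8e-4, 5.6e-5),
    },
    "Linear SVM": {
        "Gene expression noise (k)": (-3.3e-2, 1.7e-2),
        "Number of collapsed phenotypes (m)": (-4.4e-5, 1.5e-5),
        "Number of causal genes": (-1.6e-4, 2.3e-4),
        "Number of random genes": (-1.0e-3, 4.8e-5),
    },
}


def two_sided_p(beta, se):
    """Two-sided p-value for H0: beta = 0.

    Assumption: normal approximation to the test statistic; the source
    does not state which test produced its p-values.
    """
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2))


def stars(p):
    """Map a p-value to the asterisk convention in footnote 3."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    return ""


def significance_table(table=TABLE):
    """Return {method: {parameter: asterisks}} for every table entry."""
    return {
        method: {param: stars(two_sided_p(b, se)) for param, (b, se) in rows.items()}
        for method, rows in table.items()
    }


if __name__ == "__main__":
    for method, rows in significance_table().items():
        for param, mark in rows.items():
            print(f"{method:10s} {param:36s} {mark or 'n.s.'}")
```

Because the inputs are rounded and the exact test is an assumption, a coefficient close to a threshold could in principle be classified differently from the published table; the output is best read as a consistency check rather than a reproduction of the original analysis.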