Table 2 Feature selection

From: Using LASSO regression to detect predictive aggregate effects in genetic studies

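| Model type | Model 1: Gene | Model 1: SNP | Model 1: Count^a | Model 1: MAF^b | Model 1: Causal^c | Model 3: Gene | Model 3: SNP | Model 3: Count^a | Model 3: MAF^b | Model 3: Causal^c |
|---|---|---|---|---|---|---|---|---|---|---|
| Gene only | FLT1 | C13S523 | 35 | 0.0667 | Y | FLT1 | C13S523 | 71 | 0.0667 | Y |
| | ADAMTS7 | C15S3360 | 22 | 0.0029 | N | SRPR | C11S6885 | 63 | 0.0014 | N |
| | TG | C8S4379 | 17 | 0.0050 | N | TG | C8S4379 | 61 | 0.0050 | N |
| | MDN1 | C6S4146 | 15 | 0.0050 | N | RPA3 | C7S297 | 58 | 0.0007 | N |
| | GOLGA1 | C9S4013 | 13 | 0.0308 | N | LAMB3 | C1S10178 | 54 | 0.0007 | N |
| | FLT1 | C13S522 | 12 | 0.0280 | Y | RPL27 | C17S2981 | 52 | 0.0007 | N |
| Gene restricted | FLT1 | C13S523 | 19 | 0.0667 | Y | FLT1 | C13S523 | 44 | 0.0667 | Y |
| | TEX14 | C17S3819 | 9 | 0.0043 | N | FLT1 | C13S522 | 24 | 0.0280 | Y |
| | FLT1 | C13S522 | 8 | 0.0280 | Y | CYP3A43 | C7S2324 | 21 | 0.0976 | N |
| | UBA3 | C3S2197 | 7 | 0.0108 | N | TG | C8S4379 | 18 | 0.0050 | N |
| | GOLGA1 | C9S4013 | 7 | 0.0308 | N | PRKCA | C17S4578 | 16 | 0.1664 | Y |
| | CYP3A43 | C7S2324 | 7 | 0.0976 | N | PIK3C2B | C1S9189 | 15 | 0.0065 | Y |
| Combined | Age | Age | 200 | NA | Y | Age | Age | 200 | NA | Y |
| | Smoke | Smoke | 163 | NA | Y | Smoke | Smoke | 185 | NA | Y |
| | FLT1 | C13S523 | 49 | 0.0667 | Y | FLT1 | C13S523 | 81 | 0.0667 | Y |
| | FLT1 | C13S522 | 16 | 0.0280 | Y | FLT1 | C13S522 | 34 | 0.0280 | Y |
| | PIK3C3 | C18S2492 | 7 | 0.0172 | Y | PIK3C3 | C18S2492 | 18 | 0.0172 | Y |
| | HFE | C6S853 | 3 | 0.0036 | N | PRKCA | C17S4578 | 8 | 0.1664 | Y |
| | ARNT | C1S6533 | 3 | 0.0115 | Y | ARNT | C1S6533 | 8 | 0.0115 | Y |
| | ACP1 | C2S1 | 2 | 0.0093 | N | UBA3 | C3S2197 | 7 | 0.0108 | N |
| Combined restricted | Age | Age | 200 | NA | Y | Age | Age | 200 | NA | Y |
| | Smoke | Smoke | 163 | NA | Y | Smoke | Smoke | 180 | NA | Y |
| | FLT1 | C13S523 | 49 | 0.0667 | Y | FLT1 | C13S523 | 75 | 0.0667 | Y |
| | FLT1 | C13S522 | 17 | 0.0280 | Y | FLT1 | C13S522 | 32 | 0.0280 | Y |
| | PIK3C3 | C18S2492 | 7 | 0.0172 | Y | PIK3C3 | C18S2492 | 17 | 0.0172 | Y |
| | ARNT | C1S6533 | 3 | 0.0115 | Y | UBA3 | C3S2197 | 6 | 0.0108 | N |
| | LARGE | C22S1540 | 3 | 0.0201 | N | ARNT | C1S6533 | 6 | 0.0115 | Y |
| | MMS19 | C10S4869 | 3 | 0.0050 | N | KDR | C4S1861 | 5 | 0.0022 | Y |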

^a Number of simulation data sets (out of 200) in which a given variable was observed in at least four out of five trained models.

^b Minor allele frequency.

^c Variables used to determine disease risk by the GAW17 simulators (Y = yes, N = no).

The most frequent variables are listed for models 1 and 3; each occurred in at least four out of five trained models. All models were run on the 200 simulation data sets.
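
The Count column can be read as a tally over data sets: within each simulation data set, a variable is credited once if at least four of the five trained models selected it. The sketch below illustrates that tally in Python under stated assumptions. The function name `count_stable_selections`, the nested-list input layout, and the toy selections are all hypothetical and are not taken from the article; the LASSO fitting step itself is not shown.

```python
from collections import Counter


def count_stable_selections(datasets, min_models=4):
    """Tally, per variable, the number of data sets in which it was
    selected by at least `min_models` of the trained models.

    `datasets` is a list with one entry per simulation data set; each
    entry is a list of sets, one set of selected variable names per
    trained model (hypothetical layout, assumed for illustration).
    """
    counts = Counter()
    for models in datasets:
        # How many of this data set's models selected each variable.
        per_dataset = Counter()
        for selected in models:
            per_dataset.update(set(selected))
        # Credit a variable once per data set if it clears the threshold.
        counts.update(v for v, n in per_dataset.items() if n >= min_models)
    return counts


# Toy input: two data sets, five trained models each (made-up selections).
toy = [
    [{"Age", "C13S523"}, {"Age", "C13S523"}, {"Age"},
     {"Age", "C13S523"}, {"Age", "C13S523", "C8S4379"}],
    [{"Age"}, {"Age", "C13S523"}, {"Age"},
     {"Age", "C8S4379"}, {"Age"}],
]

result = count_stable_selections(toy)
print(result.most_common())
```

In the toy input, `Age` clears the four-of-five threshold in both data sets, `C13S523` in only the first, and `C8S4379` in neither. With 200 data sets, the maximum possible count would be 200, as seen for Age in the table.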