Biological pathway analysis by ArrayUnlock and Ingenuity Pathway Analysis
- Ángeles Jiménez-Marín†1,
- Melania Collado-Romero†1,
- María Ramirez-Boo1,
- Cristina Arce1 and
- Juan J Garrido1
© Jiménez-Marín et al; licensee BioMed Central Ltd. 2009
Published: 16 July 2009
Once a list of differentially expressed genes has been identified from a microarray experiment, a subsequent post-analysis step is required to find the main biological processes associated with the experimental system. This paper describes the use of two pathway analysis tools, ArrayUnlock and Ingenuity Pathways Analysis (IPA), for the post-analysis of microarray data in the context of the EADGENE and SABRE post-analysis workshop. The dataset used in this study came from an experimental chicken infection performed to study host reactions after a homologous or heterologous secondary challenge with two species of Eimeria.
Analysis of the same microarray data source with both commercial pathway analysis tools in parallel identified several biological and/or molecular functions altered in the chicken Eimeria maxima infection model, including several immune system related pathways. Biological functions differentially altered between the homologous and heterologous second infections were identified. Similarly, the effect of timing in a homologous second infection was characterized by several biological functions.
Functional analysis with ArrayUnlock and IPA provided information on functional differences in the three comparisons of the chicken infection, and both tools led to similar conclusions. ArrayUnlock improved the annotation of the chicken genome by adding InterPro annotations to the data set file. IPA provides two powerful tools for understanding the pathway analysis results, the networks and the canonical pathways, which showed several pathways related to an adaptive immune response.
Microarrays provide expression levels for thousands of genes simultaneously. The differentially expressed genes can be studied with different pathway analysis tools that connect them with existing biological pathways using public sources. The integration of differentially expressed genes into known biological pathways is therefore a versatile approach for understanding the biological complexity of gene expression. The EADGENE and SABRE post-analysis workshop evaluated different methods and software for the post-analysis of microarray data. In this study the analysis tools employed were ArrayUnlock and IPA, and the data set came from microarray assays performed to characterize the gene expression profile after a homologous or heterologous challenge of broilers primed with Eimeria maxima, as summarised in .
The microarray employed in this study was the Arkgenomics chicken 20 K oligo microarray prepared from 20,460 oligonucleotides designed against the chicken ENSEMBL transcripts .
Two-week-old chickens infected with Eimeria maxima were challenged two weeks later with Eimeria maxima (MM), Eimeria acervulina (MA), or PBS (PM). Samples were collected at 8 hours (MM8, MA8, PM8) and 24 hours (MM24) after infection. The analysis allowed us to obtain information about: i) differences between a homologous second infection and a heterologous one with another species of Eimeria (MM8_MA8); ii) how the response changes over time after a second homologous immunization (MM8_MM24); and iii) the secondary immune response (MM8_PM8) .
The three lists of differentially expressed genes had previously been filtered by an adjusted p-value < 0.05. Three working files, one per dataset, were generated to perform both analyses. Each file must contain one column holding all the gene ID annotations that the two bioinformatics tools can identify. This column was built from the annotations provided in the annotation file: the original gene IDs (Unigene, HGNC) and the IDs mapped to human, mouse and rat homologs. Additional file 1 contains the 'working files' for the three comparisons.
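The preparation of a working file described above can be sketched in Python. This is only an illustration under stated assumptions: the column names (`adj_p`, `log_ratio`, `unigene`, `hgnc` and the homolog columns), the priority order among identifier types, and the toy rows are all hypothetical, since the layout of the real annotation file is not given here.

```python
import csv
import io

# Hypothetical column names and priority order; the real annotation file may differ.
ID_COLUMNS = ["unigene", "hgnc", "human_homolog", "mouse_homolog", "rat_homolog"]
ADJ_P_CUTOFF = 0.05


def build_working_rows(rows, cutoff=ADJ_P_CUTOFF):
    """Keep genes with adjusted p < cutoff and collapse their IDs into one column."""
    working = []
    for row in rows:
        if float(row["adj_p"]) >= cutoff:
            continue  # not differentially expressed at the chosen threshold
        # Take the first non-empty identifier, following the priority order above.
        candidates = ((row.get(col) or "").strip() for col in ID_COLUMNS)
        gene_id = next((value for value in candidates if value), None)
        if gene_id is None:
            continue  # no usable annotation, so the probe cannot be mapped
        working.append({"gene_id": gene_id, "log_ratio": float(row["log_ratio"])})
    return working


# Toy input standing in for one comparison (values invented for illustration).
EXAMPLE = """probe,adj_p,log_ratio,unigene,hgnc,human_homolog,mouse_homolog,rat_homolog
p1,0.01,1.2,Gga.1,,,,
p2,0.20,0.3,Gga.2,,,,
p3,0.03,-0.8,,IL2,,,
p4,0.04,0.5,,,,,
"""

rows = list(csv.DictReader(io.StringIO(EXAMPLE)))
working_rows = build_working_rows(rows)
for entry in working_rows:
    print(entry["gene_id"], entry["log_ratio"])
```

In this toy input, probe p2 is dropped by the p-value filter and probe p4 is dropped because it has no identifier, which mirrors how incomplete annotation reduces the number of genes entering the pathway analysis.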
ArrayUnlock (Integromics S.L., Spain) was used to explore the main biological processes associated with chicken infection, employing the 'Biological Enrichment' functionality. This functionality finds the biological annotations that are strongly associated with a list of differentially expressed genes. The selected annotations were GO Biological Process, GO Molecular Function, GO Cellular Component, KEGG pathways and INTERPRO motifs. Annotation associations were filtered by a p-value ≤ 0.01.
Ingenuity Pathways Analysis
The 'Core Analysis' function included in IPA (Ingenuity Systems Inc, USA) was used to interpret the chicken data in the context of biological processes, pathways and networks. All Identifier Types were selected, since more than one type of identifier exists in our dataset (working file). Both up- and down-regulated identifiers were defined as value parameters for the analysis. After the analysis, the generated networks are ordered by a score that reflects their significance. The significance of the biofunctions and the canonical pathways was tested by the Fisher Exact test p-value. Biofunctions were grouped into Disease and Disorders; Molecular and Cellular Functions; and Physiological System Development and Function. Similarly, canonical pathways were grouped into Metabolic Pathways and Signaling Pathways. Canonical pathways can also be ordered by the ratio value (the number of molecules in a given pathway that meet the cut-off criteria, divided by the total number of molecules that make up that pathway). In contrast to ArrayUnlock, this pathway analysis tool generates networks in which the differentially regulated genes are related according to previously known associations between genes or proteins, independently of established canonical pathways. Moreover, networks are associated with functions according to the molecules involved.
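The two quantities used to rank canonical pathways, the Fisher Exact test p-value and the ratio, can be illustrated with a minimal Python sketch. It assumes a one-sided (over-representation) test against a fixed reference set; how IPA defines its reference set and computes the test internally is not described here, and all counts in the example are invented.

```python
from math import comb


def fisher_right_tail(k, n, K, N):
    """One-sided Fisher exact p-value for over-representation.

    N: genes in the reference set, K: reference genes in the pathway,
    n: genes in the input list, k: input genes that fall in the pathway.
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def pathway_ratio(k, K):
    """Molecules meeting the cut-off criteria divided by all molecules in the pathway."""
    return k / K


# Hypothetical counts: pathway name -> (input genes in pathway, pathway size).
N_REFERENCE = 20000  # assumed size of the reference set
N_INPUT = 300        # assumed number of differentially expressed genes
pathways = {
    "T cell receptor signalling": (9, 100),
    "Cell cycle": (4, 120),
}

results = []
for name, (k, K) in pathways.items():
    p_value = fisher_right_tail(k, N_INPUT, K, N_REFERENCE)
    results.append((name, p_value, pathway_ratio(k, K)))

# Most significant pathway first.
results.sort(key=lambda item: item[1])
for name, p_value, ratio in results:
    print(f"{name}: p = {p_value:.3g}, ratio = {ratio:.3f}")
```

The p-value and the ratio answer different questions: the ratio only reports how much of a pathway is covered by the input list, while the p-value also accounts for the size of the input list and of the reference set, so the two orderings need not agree.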
Functional analysis and biological enrichment by ArrayUnlock
Top ten Biological Processes significantly altered in ArrayUnlock analysis. In brackets, number of genes from the input file implicated in each annotation. Significance at p < 0.05.
- Regulation of transcription, DNA-dependent (5)
- Signal transduction (34)
- Signal transduction (84)
- Metabolic process (4)
- Transcription (24)
- Regulation of transcription, DNA-dependent (71)
- Cell adhesion (4)
- Cell adhesion (19)
- Transcription (4)
- Multicellular organismal development (18)
- Multicellular organismal development (44)
- Actin cytoskeleton organization and biogenesis (4)
- Ion transport (15)
- Cell cycle (34)
- Cytoskeleton organization and biogenesis (2)
- Protein amino acid phosphorylation (13)
- Protein transport (33)
- Chromatin modification (2)
- Cell differentiation (12)
- Metabolic process (30)
- Small GTPase mediated signal transduction (2)
- Nervous system development (12)
- Apoptosis (29)
- Integrin-mediated signalling pathway (2)
- Cell cycle (12)
- Cell adhesion (28)
- Cation transport (2)
- Protein transport (11)
- Protein amino acid phosphorylation (28)
Top ten KEGG Pathways significantly altered in ArrayUnlock analysis. In brackets, number of genes from the input file implicated in each annotation. Significance at p < 0.05.
- GnRH signalling pathway (1)
- MAPK signalling pathway (13)
- Focal adhesion (22)
- Regulation of actin cytoskeleton (1)
- Neuroactive ligand-receptor interaction (10)
- MAPK signalling pathway (21)
- Long-term potentiation (1)
- Regulation of actin cytoskeleton (8)
- Jak-STAT signalling pathway (16)
- Leukocyte transendothelial migration (1)
- Focal adhesion (8)
- Cytokine-cytokine receptor interaction (16)
- Focal adhesion (1)
- GnRH signalling pathway (6)
- Cell cycle (14)
- Ubiquitin mediated proteolysis (1)
- Axon guidance (6)
- Fc epsilon RI signalling pathway (11)
- MAPK signalling pathway (1)
- Calcium signalling pathway (6)
- Natural killer cell mediated cytotoxicity (11)
- Glycan structures-degradation (1)
- Pancreatic cancer (4)
- Insulin signalling pathway (10)
- Glycan structures-biosynthesis 2 (1)
- Long-term potentiation (4)
- Apoptosis (10)
- Glycan structures-biosynthesis 1 (1)
- Leukocyte transendothelial migration (4)
- T cell receptor signalling pathway (9)
Functional analysis by IPA
The results of the analysis were highly dependent on having the most complete annotation available in the data set file. The creation of the 'working file' was therefore critical for taking maximum advantage of the analysis, which can be considered a drawback of both tools.
Each tool independently provided information for a global understanding of the underlying biological processes. First, homologous and heterologous second infections induce similar changes in gene expression, although some differences were found in several biological functions. Second, the response to a second homologous infection varied with time and differed significantly in a relatively high number of biological functions. Third, a core of biological functions and pathways associated with a secondary response was similar both when the timing of the second challenge varied and in the case of a heterologous secondary infection.
The two analytical tools provided overlapping as well as complementary information. The main differences were due to the databases used by each tool. ArrayUnlock results are based on Gene Ontology terms or KEGG annotations, which are widely known, used in other analytical tools and available in free-access databases. IPA, on the other hand, makes use of a non-public bibliographic database and its own terminology for function classification, which does not always correlate directly with GO terms. An advantage of IPA was that it classifies the genes implicated in each function into sub-functions and provides a direct link from each molecule to the bibliographic reference where that relationship is described. The results obtained with both tools for identifying altered established pathways (canonical pathways in IPA and KEGG pathways in ArrayUnlock) were similar; however, IPA integrates the information on the differentially expressed genes into the pathway figures, highlighting up- or down-regulation. In general, IPA provided a better presentation of the results and an easier identification of the molecules implicated in each function within the software interface. Moreover, IPA generates networks in which the differentially regulated genes are related according to previously known associations between genes or proteins, independently of established canonical pathways.
The authors are grateful to the anonymous reviewers and the editor for their comments and suggestions. We thank EADGENE and SABRE for financing the workshop, ASG-Lelystad for hosting the workshop meeting, Caroline Channing and Sandrine Ayuso for the organization, and Annemarie Rebel and colleagues for offering their data set for the analysis.
This article has been published as part of BMC Proceedings Volume 3 Supplement 4, 2009: EADGENE and SABRE Post-analyses Workshop. The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/3?issue=S4.
- Hedegaard J, Arce C, Bicciato S, Bonnet A, Ramirez-Boo M, Buitenhuis AJ, Collado-Romero M, Conley LN, SanCristobal M, Ferrari F, et al: Methods for interpreting lists of affected genes obtained in a DNA microarray experiment. BMC Proceedings. 2009, 3 (Suppl 4): S5.
- ArrayUnlock software web link. [http://www.integromics.com/ArrayUnlock.php]
- Ingenuity Pathways Analysis software web link. [http://www.ingenuity.com/]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.