- Open Access
Multi-comparative systems biology analysis reveals time-course biosignatures of in vivo bovine pathway responses to B. melitensis, S. enterica Typhimurium and M. avium paratuberculosis
© Adams et al; licensee BioMed Central Ltd. 2011
- Published: 3 June 2011
To decipher the complexity and improve the understanding of host-pathogen interactions, biologists must adopt new system level approaches in which the hierarchy of biological interactions and dynamics can be studied. This paper presents the application of systems biology for the cross-comparative analysis and interactome modeling of three different infectious agents, leading to the identification of novel, unique and common molecular host responses (biosignatures).
A computational systems biology method was utilized to create interactome models of the host responses to Brucella melitensis (BMEL), Salmonella enterica Typhimurium (STM) and Mycobacterium avium paratuberculosis (MAP). A bovine ligated ileal loop biological model was employed to capture the host gene expression response at four time points post infection. New methods based on Dynamic Bayesian Network (DBN) machine learning were employed to conduct a systematic comparative analysis of pathway and Gene Ontology category perturbations.
A cross-comparative assessment of 219 pathways and 1620 gene ontology (GO) categories was performed on each pathogen-host condition. Both unique and common pathway and GO perturbations indicated remarkable temporal differences in pathogen-host response profiles. Highly discriminatory pathways were selected from each pathogen condition to create a common system level interactome model comprised of 622 genes. This model was trained with data from each pathogen condition to capture unique and common gene expression features and relationships leading to the identification of candidate host-pathogen points of interactions and discriminatory biosignatures.
Our results provide deeper understanding of the overall complexity of host defensive and pathogen invasion processes as well as the identification of novel host-pathogen interactions. The application of advanced computational methods for developing interactome models based on DBN has proven to be instrumental in conducting multi-conditional cross-comparative analyses. Further, this approach generates a fully simulateable model with capabilities for predictive analysis as well as for diagnostic pattern recognition. The resulting biosignatures may represent future targets for identification of emerging pathogens as well as for development of antimicrobial drugs, immunotherapeutics, or vaccines for prevention and treatment of diseases caused by known, emerging/re-emerging infectious agents.
- Gene Ontology
- Dynamic Bayesian Network
- Ileal Loop
- Brucella Melitensis
- Pathogen Condition
The complexity of host-pathogen interactions requires a system level understanding of the entire hierarchy of biological interactions and dynamics. A systems biology approach can provide systematic insights into the dynamic/temporal difference in gene regulation, interaction, and function, and thereby deliver an improved understanding and more comprehensive hypotheses of the underlying mechanisms [1, 2]. Moreover, the ability to consolidate complex data and knowledge into plausible interactome models is essential to promote the effective discovery of key points of interaction. Accordingly, a systems biology approach to study molecular pathway gene expression profiles of host cellular responses to microbial pathogens holds great promise as a methodology to identify, model and predict the overall dynamics of the host-pathogen interactome.
Gene expression data acquisition
An established in vivo perinatal calf ligated ileal loop model, in conjunction with custom bovine microarrays, was used to study the early temporal changes in the host response to a previously optimized dosage of 1 x 109 colony forming units of STM, BMEL, or MAP at four common sampling time-points post infection (0.5, 1, 2, and 4 hours pi) conducted under approved protocols by the Texas A&M University Institutional Animal Use and Care Committee. Gene expression data was collected by Dr. Adams’ lab at Texas A&M College of Veterinary Medicine following a surgical and sample collection methodology described elsewhere [3–6]. For each pathogen condition, under approved BSL2/BSL3 conditions, there were four biological replicate of infected and control loops in the non-survival surgeries for each pathogen performed on 3-week old male pathogen-free Holstein calves. Host RNA (from each host-pathogen surgery) was collected and co-hybridized in quadruplicate against bovine reference RNA to 13K custom bovine arrays (fabricated by W. M. Keck Center, University of Illinois at Urbana-Champaign) to allow for cross-comparison between experimental conditions. These custom microarrays consist of 70-mer oligonucleotides representing 13,258 unique oligos with 12,220 cattle ORFs. A detailed description of the design and development of the microarray has been published elsewhere . Briefly, the microarray was comprised of unique 70-mers oligonucleotide sets representing cattle ORFs obtained from bovine non-immune related placenta and immune-related spleen cDNA libraries and based upon the earlier cDNA array platform GPL2864 and subtracted cDNA libraries created from embryonic and extra-embryonic tissues (NCBI libraries 15993, 15993 and 17188). Time matched RNA from non-infected ligated ileal loops were used as healthy state controls. Proteomics analysis of the STM-Host samples was provided by Pacific Northwest National Labs (PNNL) utilizing an approach referred to as the accurate mass and time (AMT) tag approach .
Systems biology analysis and modeling
Computational analysis and modeling was completed by Dr. Drake at Seralogix using an integrated platform enabling a systems biology computational pipeline for multi-conditional analysis and modeling, termed the BioSignature Discovery System (BioSignatureDS™). Its core tools are based on Dynamic Bayesian Networks (DBN) , an advanced form of machine learning and pattern recognition. The platform enables comprehensive cross-comparisons of the genomic/proteomic data to identify key pathway/GO perturbations and underlying mechanistic regulatory points. BioSignatureDS™ is used to identify groups and individual genes that capture the perturbation in a pathway or biological process over time. This technique is named Dynamic Bayesian Gene Group Activation (DBGGA). DBGGA employs Dynamic Bayesian network (DBN) models that are trained using gene expression data replicates from the condition considered as controls. The other experimental condition expression data replicates are then used as evidence to test the goodness-of-fit of these data against the trained DBN control model. Goodness-of-fit is determined by Bayesian likelihood ratio tests that are subsequently transformed to a z-score test statistic (Bayesian z-score) to permit comparison of scores across all pathways and biological processes. The DBGGA computational method scores and rank groups of interrelated genes within a given pathway or gene ontology group across all time points in lieu of just one gene in a single time point (such as used in traditional t-tests) and thus determines the differences and commonalities between experimental and control conditions. DBGGA can also determine which genes are the significant sources of the perturbation. Such genes are designated as “candidate mechanistic genes”, a term we coined to describe those genes within a pathway that individually contributed significantly to the overall pathway Bayesian z-scores and thus are considered mechanistic candidates that may play key roles in governing the host response. Only those genes which are associated with a given pathway or biological process (GO group) were examined using the DBGGA modeling approach.
BioSignatureDS™ utilizes the significantly perturbed pathways, GO groups and the “mechanistic genes” as building blocks to construct system level network model of the disease/condition. Encapsulating global time-course patterns and multi-conditional behaviors of a large group of genes/proteins, the systems level model has great discriminating power even when the effects of individual genes are small. Thus, the disease models can be used for more efficient comparative modeling, pattern recognition (diagnostics) and simulations (prognostics). Further, proteomic data can be integrated into the models as an overlay to individual pathway and system-level models to both confirm the presence of proteins for their encoding genes as well as to visualize the temporal patterns of protein abundance.
For each host/pathogen interaction condition, BioSignatureDS™ modeled and scored 219 known metabolic and signaling pathways and 1620 biological processes (gene groups associated with Gene Ontology (GO) terms) at four time points. DBGGA was employed to identify the perturbations between pathogen conditions for pathways, GO categories, and genes. The DBGGA method generates Bayesian log likelihood scores that are normalized and transformed to a z-score equivalent (hereafter Bayesian z-score) so that all pathways and GO groups across all host/pathogen conditions can be equivalently compared and assessed for significance.
DBGGA pathway analysis
Arc weight correlation
In Table 1, we show a list of 20 specific gene-to-gene relations associated with the MAPK pathway (Figure 3) having strong positive and/or negative arc weight correlations. We normalized the DBN arc weights to allow equivalent comparison to other pathway gene-gene relations. In this arc weight table, a few significant relationships are labeled (1 thru ) in the table and on the network (Figure 3) with an encircled number and arrow pointing to the corresponding arc. It is hypothesized that virulence factors from each pathogen can have different disruptive influences on the host’s MAPK Signaling Pathway and that such disruption can be used to identify pathogenic mechanisms unique to each pathogen. For example, the relation arc TRAF2->FLNA had a strong positive weight (correlated) for STM-Host (0.241) and for the BMEL-Host (0.204) while MAP-Host had a large negative weight (anti-correlated) of -0.17. The reversal of gene-to-gene arc weight of MAP-Host may indicate a disruption of either TRAF or FLNA gene by virulent factors of the pathogens, identifying potential novel points of interaction. TRAF2 encodes a protein that is a member of the TNF receptor associated factor (TRAF) protein family. This protein is required for TNF-alpha-mediated activation of MAPK8/JNK and NF-κB. It has a binding relationship with the filamin-A protein encoded by the FLNA gene. FLNA participates in the anchoring of membrane proteins to the actin cytoskeleton. This type of interaction analysis can be done for every pathway and used to identify novel differences between pathogen conditions. The visualization of mechanistic genes and arc weight enables an efficient identification of the differences between pathogenic influences.
DBGGA gene ontology analysis
Biological system model generation for comparative host response analysis
Pathways Utilized for System Model Generation
Regulation of actin cytoskeleton
Fc epsilon RI signaling
T cell receptor signaling
Jak-STAT signaling pathway
Toll-like receptor signaling
VEGF signaling pathway
Hedgehog signaling pathway
Wnt signaling pathway
Cell Growth and Death
mTOR signaling pathway
MAPK signaling pathway
Sensitivity and Specificity Model Assessment
False Pos Rate
False Neg Rate
Cross-species protein-protein interactome model
Ultimately, the goal is to develop interactome models between the pathogen and the host to make predictions of protein-protein interactions (PPIs). A preliminary STM-Host model (model not shown herein) was created using BioSignatureDS™ augmented with new computational methods to learn PPIs between the pathogen and the host. Model PPI structure learning utilized (host and STM) microarray gene expressions and mass spec protein levels (extracted and measured simultaneously from the bovine host [14, 15]) guided by a prior biological interactions and cross-species PPI predictions. The model identified a set of 34 known and novel STM-Host protein interactions. These predicted interactions will be employed in guiding future biological experiments for their confirmation as host-pathogen virulence factors and are candidates for points of intervention. PPI interactome models will be developed for BMEL-Host and MAP-Host and full results reported in future publications.
We analyzed the set of system model genes (622) looking for strong differential expressions (99% confidence) between the three conditions (STM-Host vs. MAP-Host, STM-Host vs. BMEL-Host, MAP-Host vs. BMEL-Host). In this manner we were able to identify sets of genes that were uniquely different from the other two pathogen responses. We found 69 unique genes in the BMEL-Host, 253 genes in the STM-Host, and 64 genes in the MAP-host. Listing of these genes is not provided herein, but will be made publically available in a forthcoming publication. The results of our analysis and modeling identified common and uniquely perturbed genes and biosignatures. A cross-species interactome model identified 34 PPIs that was learned from STM-Host co-expressed gene and protein data. More importantly, these models produce gene networks that can be used to identify novel targets for intervention, as well as the overall improved understanding of the underlying pathogenicity of each pathogen.
The animal study was supported by U.S. Department of Homeland Security – National Center of Excellence for Foreign Animal and Zoonotic Disease (FAZD) Defense grant ONR-N00014-04-1-0, Public Health Service grants 2U54AI057156-06, AI040124, AI044170, AI079173, and AI076246 and USDA Cooperative State Research, 432 Education and Extension Service- National Research Initiative-CAP JDIP program.
The computational analysis was supported in part by the National Institutes of Allergies and Infectious Diseases SBIR grants 2R44AI058362-02 and R43AI084223-01.
This article has been published as part of BMC Proceedings Volume 5 Supplement 4, 2011: Proceedings of the International Symposium on Animal Genomics for Animal Health (AGAH 2010). The full contents of the supplement are available online at http://www.biomedcentral.com/1753-6561/5?issue=S4.
- Musser JM, DeLeo FR: Toward a genome-wide systems biology analysis of host-pathogen interactions in group A Streptococcus. Am J Pathol. 2005, 167 (6): 1461-1472. 10.1016/S0002-9440(10)61232-1.PubMed CentralView ArticlePubMedGoogle Scholar
- Franke R, Müller M, Wundrack N, Gilles ED, Klamt S, Kähne T, Naumann M: Host-pathogen systems biology: Logical modelling of hepatocyte growth factor and Helicobacter pylori induced c-Met signal transduction. BMC Systems Biology. 2008, 2 (4):Google Scholar
- Khare S, Nunes JS, Figueiredo JF, SD SDL, Rossetti CA, Gull T, Chu CH, Tsang L, Bevins CL, Adams LG, et al: Early phase morphological lesions and transcriptional responses of bovine ileum infected with Mycobacterium avium subsp. paratuberculosis. Vet Pathol. 2009, 46 (4): 717-728. 10.1354/vp.08-VP-0187-G-FL.View ArticlePubMedGoogle Scholar
- Zhang S, Santos RL, Tsolis RM, Stender S, Hardt WD, Baumler AJ, Adams LG: The Salmonella enterica serotype typhimurium effector proteins SipA, SopA, SopB, SopD, and SopE2 act in concert to induce diarrhea in calves. Infect Immun. 2002, 70: 3843-3855. 10.1128/IAI.70.7.3843-3855.2002.PubMed CentralView ArticlePubMedGoogle Scholar
- Santos RL, Zhang S, Tsolis RM, Bäumler AJ, Adams LG: Morphologic and molecular characterization of S. typhimurium infection in neonatal calves. Veterinary Pathology. 2002, 200-215. 10.1354/vp.39-2-200. 39Google Scholar
- Santos RL, Raffatellu M, Bevins CL, Adams LG, Tükel Ç, Tsolis RM, Bäumler AJ: Life in the Inflammed Intestine, Salmonella Style. Trends in Microbiology. 2009, 498-506. 10.1016/j.tim.2009.08.008. 17Google Scholar
- Loor JJ, Everts RE, Bionaz M, Dann HM, Morin DE, Oliveira R, Rodriguez-Zas SL, Drackley JK, Lewin HA: Nutrition-induced ketosis alters metabolic and signaling gene networks in liver of periparturient dairy cows. Physiol Genomics. 2007, 32 (1): 105-116. 10.1152/physiolgenomics.00188.2007.View ArticlePubMedGoogle Scholar
- Pacific Northwest National Laboratory (Mass Spec). [http://www.pnl.gov/biology/programs/msd/measurements.stm]
- Murphy K: Dynamic Bayesian Networks: Representation, Inference and Learning. Ph.D. 2002, UC BerkeleyGoogle Scholar
- Hobbie S, Chen LM, Davis RJ, Galan JE: Involvement of mitogen-activated protein kinase pathways in the nuclear responses and cytokine production induced by Salmonella typhimurium in cultured intestinal epithelial cells. J Immunol. 1997, 159: 5550-5559.PubMedGoogle Scholar
- Ruckdeschel K, Harb S, Roggenkamp A, Hornef M, Zumbihl R, Kohler S, Heesemann J, Rouot B: Yersinia enterocolitica impairs activation of transcription factor NF-kB: involvement in the induction of programmed cell death and in the suppression of the macrophage tumor necrosis factor production. J Exp Med. 1998, 187: 1069-1079. 10.1084/jem.187.7.1069.PubMed CentralView ArticlePubMedGoogle Scholar
- Tang P, Sutherland CL, Gold MR, Finlay BB: Listeria monocytogenes invasion of epithelial cells requires the MEK-1/ERK-2 mitogen-activated protein kinase pathway. Infect Immun. 1998, 66: 1106-1112.PubMed CentralPubMedGoogle Scholar
- Schorey JS, Cooper AM: Macrophage signalling upon mycobacterial infection: the MAP kinases lead the way. Cell Microbiol. 2003, 5: 133-142. 10.1046/j.1462-5822.2003.00263.x.View ArticlePubMedGoogle Scholar
- Boyce JD, Cullen PA, Adler B: Genomic-scale Analysis of Bacterial Gene and Protein Expression in the Host CDC. Emerging Infectious Diseases. 2004, 10 (8):Google Scholar
- Guerfali FZ, Laouini D, Guizani-Tabbane L, Ottones F, Ben-Aissa K, Benkahla A, Manchon L, Piquemal D, Smandi S, Mghirbi O, et al: Simultaneous gene expression profiling in human macrophages infected with Leishmania major parasites using SAGE. BMC Genomics. 2008, 9 (238):Google Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.