Proceedings of the 16th Annual UT-KBRIN Bioinformatics Summit 2016: proceedings

Fig. 1 (abstract P1). RAST server annotations for A. baumanii clinical isolate. Genes associated with virulence are highly represented. P1 Proteogenomic characterization of a clinical isolate of Acinetobacter baumanii from a case of fulminant sepsis: What does the data mean clinically? L Leon Dent, Sammed N Mandape, Siddharth Pratap, Jianan Dong, Jamaine Davis, Jennifer A Gaddy, Kofi Amoah, Steve Damo, Dana R Marshall Department of Surgery, Meharry Medical College, Nashville, TN 37208, USA; Bioinformatics Core, Meharry Medical College, Nashville, TN 37208, USA; Department of Biochemistry and Cancer Biology, Meharry Medical College, Nashville, TN 37208, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA; Department of Life and Physical Sciences, Fisk University, Nashville, TN 37208, USA; Department of Pathology, Anatomy and Cell Biology, Meharry Medical College, Nashville, TN 37208, USA Correspondence: Dana R Marshall (dmarshall@mmc.edu) BMC Proceedings 2017, 11(Suppl 9):P1


Background
A middle-aged African American male with a history of HIV/AIDS was admitted to Nashville General Hospital at Meharry (NGHM), with severe diarrhea, nausea and vomiting. Upon admission, he was diagnosed with sepsis of unknown etiology. During his hospitalization of almost one month, he stabilized with supportive care but ultimately required ventilation for several days. Efforts to isolate microbes were unsuccessful until one week after mechanical ventilation, when multi-drug resistant Acinetobacter baumanii was isolated from sputum, BAL, blood and urine. Genomic and proteomic analyses were used to better understand the course of this infection with the goal of identifying potential diagnostic markers of virulence and therapy. Materials and methods Whole genome shotgun sequencing consisted of single-end 43 bp reads by Illumina GA-IIx and paired-end 250 bp reads (with 2 kb and 5 kb inserts) on the Roche 454 platform. For proteome analysis, the isolate was cultured in the presence and absence of antibiotics (levofloxacin, tobramycin, gentamycin, cefotaxime, and meropenem) and tandem liquid chromatography mass spectrometry (LC-MS/MS) was used to identify proteins expressed, and comparative protein expression, between the two culture conditions. Results A hybrid de novo assembling technique yielded a draft genome of 3,985,367 bp length, 148 contigs, and an N50 of 66,205 bp (GenBank accession -AZNQ00000000). The NCBI Prokaryotic Genome Automatic Annotation Pipeline predicted 4,064 genes; many of them without annotations or known functions. RAST server annotation reported 79 genes in the subsystem 'Virulence, Disease, and Defense' and 21 genes in the 'Phages, Prophages, Transposable elements, Plasmids' (Fig. 1). Proteome analysis identified greater than 400 proteins total. Comparative protein expression with and without antibiotics, showed 21 proteins differentially expressed (p ≤ 0.005) with the significantly elevanted presence of superoxide dismutase suggesting a strong resistance mechanism that is dangerously broad in scope.

Conclusions
MDR A. baumanii was previously identified as bacteria frequently isolated at NGHM [1]. The WHO lists it as a leading drug resistant pathogen in the hospital environment [2]. Data gathered previously [3], and in this study, clearly show a high representation of genes associated with virulence, disease and defense in this MDR clinical isolate. Proteins differentially expressed suggest an initial non-antibiotic specific mechanism of resistance. Diagnostics targeting these genes or proteins quite likely would have shortened the progression of infection in this patient. There is a critical need to identify mechanisms of virulence, persistence and antimicrobial resistance for A. baumanii and systems biology approaches substantially contribute to this knowledge database.

Background
The Center for Disease Control (CDC) [1] and Tennessee Suicide Prevention Network (TSPN) report consistent findings in an increase of suicide related deaths at the national and state level. Collaborative efforts with the TSPN were initiated by the authors to contribute actionable analysis for significant variables in respect to suicide. Materials and methods CDC mortality data from 2003 through 2013 was extracted, transformed, loaded and analyzed utilizing Python, R Scripting, RStudio and Tableau. During this comprehensive analysis the CDC mortality data was subsetted into a data frame of 17 variables from the original 75 variables that indicated the most statistical significance as a function of the respective suicide ICD 10 codes.

Results
Education attainment levels of 12th grade emerged as one of the most statistically significant variables that contributed to all manner of deaths; this observation is consistent with initial observations of the 2013 CDC mortality data, as it pertained to suicide, analyzed in previous studies [2]. Based on this unique finding of education emerging as a strong and consistent variable in the comprehensive analysis of CDC data over a 11 year period, the authors hypothesized education attainment level segments are the most significant demographic predictor variable of suicide and a systematic approach to targeting continuing educational opportunities to patients with low education attainment levels paired with other high risk segments including but not limited to age, race, ethnicity, marital status and gender [3]. The 2010 census data was used in a hypothesis test to validate the significance of education attainment levels as a significant variable in respect to all manners of death and suicide. A comparison of the distribution between the reported deceased education levels from the CDC data and the education attainment levels of the reported living population from the census data lead the authors to reject their null hypothesis that the distributions of education attainment levels would be consistent between the reported deceased and living populations. The alternative hypothesis was accepted that the distribution of education attainment levels are significantly different among the deceased and living populations. Additional research of the significance of the education attainment level is required to explore the relationship between education attainment levels and the manner of death of the deceased population reported in the CDC data.

Background
The human body is inhabited by trillions of microbes that constitute the microbiome of each individual [1]. The composition of these microbial communities differs from one individual to another depending on gender, diet, genetics, geographical location, etc [2]. Such interindividual diversity affects human health, disease, and even response to therapy [3,4], and microbiome imbalance has been linked to dysbiosis and disease [1]. Moreover, the human microbiota is a major determinant of the human innate immunity as resident microbes help preventing colonization by external pathogens. A typical example is how the nasal microbiota can protect the host against carriage of multi-resistant Staphylococcus aureus.
Recently, an inverse correlation was discovered between nasal colonization by S. epidermidis and S. aureus; therefore, we launched this pilot study to analyze the relative abundance of nasal staphylococci and to investigate the competition between these two staphylococcal species.

Methods and results
Duplicate nasal swabs were collected from 17 healthy volunteers living in Cairo Egypt, including 10 nurses. One swab was used for culturebased analysis while the other was subjected to DNA extraction. The V4 regions of the 16S rRNA gene of five nasal swabs were amplified and sequenced on an Illumina MiSeq platform. The biodiversity of bacterial taxa was analyzed by the QIIME software package, and statistical analysis was performed in QIIME and R. Three out of the five samples were quite close to each other (low UNIFRAC distance), while the other two were rather divergent. All samples were dominated by genera Staphylococcus, Corynebacterium, and Propionbacterium, while the most differentiating genera were Peptoniphilus and Anaerococcus (Samples 1,2,3); Dialister and Prevotella (Sample 4); Bacteroidetes and Fecalibacterium (Sample 5). To validate the data and sample divergence, we compared the microbiome composition of the selected samples with randomly selected publicly available human microbiome samples from various body sites (oral, skin, nasal, intestinal, and vaginal microbiomes: 60-80 samples per site). Interestingly, no geographical signal was detected as the nasal samples from Egyptian individuals clustered with HMP nasal samples, and were most similar to skin microbiomes and least similar to intestinal microbiomes. Because 16S rRNA gene sequencing is not sensitive enough to resolve specieslevel variations, we used comparative genomics (MetaRef [5]) to define signature genes for S. aureus and S. epidermidis, and we used PCR to screen all 17 samples for these two potentially competing organisms.

Conclusion
This pilot study confirmed the lack of geographical signal among nasal microbiome samples from Egypt. Two distinct types could be identified, one dominated by Staphylococcus sp. & Corynebacterium sp., while the other is surprisingly rich in Bacteroides. PCR analysis of commensal vs. pathogenic staphylococci showed mutual exclusion except one sample only. Ongoing work will focus on using our primers in the quantification of each species in samples. The completion of the Human Genome Project led to the recognition that many human phenotypes, including variation in disease susceptibility and drug response, cannot be merely explained by human genetic variations [1]. Thus, the Human Microbiome Project (HMP) was initiated to investigate the contribution of human-associated microbes to human phenotypic variations [2,3]. Meanwhile, rapid advances and cost reduction of sequencing technologies led to the rapid accumulation of genomic, metagenomic, and 16S rRNA data from different habitats, aiming at exploring the biodiversity of almost every samplable ecosystem on earth [4]. Although most of these data were published and interpreted separately, re-examining them in a holistic way may lead to new insights about microbial ecology.
Here, we reanalyzed publicly available 16S microbiome sequences from 1,689 samples, representing 17 ecosystems, five of which are human-associated (the five HMP sites), aiming to examine overlaps and correlations between different environments. Materials and methods 16S rRNA gene sequence data were obtained from public databases (e.g., HMP [5] and Sequence Read Archive) and re-analyzed with QIIME [6] for beta-diversity by the unweighted and weighted UNI-FRAC distance metric for taxa composition and relative abundance, respectively. Principal coordinate analysis (PCoA) was used for clustering different samples according to their UNIFRAC distances.

Results
We observed a growing microbiome continuum, in which no ecosystem (or human body site) is fully separated from others, but with samples rather spreading to overlap and span multiple clusters. Yet, extreme clusters can be still distinguished, notably human fecal samples, river sediments, and a few microbiomes from hypersaline environments. Among HMP samples, nasal microbiotas were the most diverse, ranging from a majority clustering with skin samples to a few with similarities to oral and gut samples. Overall, human-associated microbiomes were quite distinct (upper and left clusters, Fig. 2) from plant and environmental microbiomes (bottom and right clusters, Fig. 2). Interestingly, aerosol microbiomes (n = 46) clearly bridged both worlds (environmental and host-associated samples).

Conclusion
In conclusion, we observed a "microbiome continuum" with two large separate clusters (human associated and environmental) bridged by aerosol microbiomes. We expect that the accumulation of future samples may continue to fill the gaps in this continuum.

Background
The Human Microbiome Project (HMP) was launched to explore human-associated microbes and their impact on human health, disease, and immunity status. The human microbiota at different anatomic sites (mouth, nose, skin, colon, and vagina) is being extensively studied; yet, studies on the eye microbiome are still lagging behind, possibly because the human eye was once considered free of resident microbiota and because of the scarcity of cultured bacteria isolated from the eye [1][2][3]. The human eye, being in contact with the environment, is continuously exposed to microbes and viruses, but the balance between resident and transient microbes is not fully understood. Here, we use conjunctivitis as a model to understand the potential role of the eye microbiota in health and disease.

Materials and methods
We collected duplicate sets of 128 conjunctival swabs, from 54 patients and ten healthy controls. All patients had only one infected eye, and were free of visible viral or fungal infection. Importantly, to minimize inter-individual variations, we compared infected and uninfected eyes of each patient, and we used healthy controls to estimate the natural variability between two eyes. One swab, from each eye, was used for culture-based analysis and the other was processed for DNA extraction and 16S amplicon analysis by highthroughput sequencing (MiSeq, Illumina, USA).

Results
Sixty-four samples were culture positive. Most commonly isolated bacteria were Staphylococcus aureus, S. epidermidis and Acinetobacter. Antimicrobial susceptibility profiles were mostly similar in both eyes. The microbiome composition of 102 samples (from 51 subjects) was analyzed and compared. Many bacterial taxa were determined that were undetected by culture-based techniques, e.g., Moraxella, Micrococcaceae and Corynebacteria. When compared to different HMP samples (from HMP and SRA databases), eye microbiome samples clustered together, but were quite close to the skin and nose microbiomes. Beta diversity between subjects was higher than the diversity between eyes of the same subjects (Fig. 3), and no significant differences were seen between the two groups. A few bacterial taxa were significantly more abundant in the control group, e.g., Pseudomonadaceae, and Enterobacteriaceae. Yet, the biological significance of these statistical differences is not certain because of the small proportion of these taxa. Inter-eye variability was more or less the same among patient and control samples, and interinidividual differences remained the most prominent.

Conclusion
In conclusion, our findings suggest that bacteria conjunctivitis is most likely caused by resident taxa of the eye, with no outside pathogen(s) singled out as causative agent(s).

P6
Identification of reference genes for quantitative RT-PCR analysis of miRNAs in equine serum Pouya Dini 1,2 , Shavahn C Loux 1 , Kirsten E Scoggin 1 , Alejandro Esteller-Vico 1 , Edward L Squires 1 , Mats HT Troedsson 1 , Peter Daels 2 , Barry A Ball 1 Background MicroRNAs (miRNAs) expression patterns are commonly determined by RT-qPCR; however, the identification of tissue-specific and speciesspecific reference miRNA is a prerequisite for miRNA expression analysis. The aim of the current study was to identify reference genes for normalization of miRNA expression patterns in serum throughout equine pregnancy.

Materials and methods
A list of twenty potential reference genes was initially chosen based on a previously generated miRNA sequencing dataset by NormFinder software (v 0.953). Expression stability of these miRNAs was evaluated in the serum samples collected from different stages of equine pregnancy (4, 6 and 10 month) and post-partum, as well as in the serum of geldings and diestrous mares. Of these, we identified the most stable miR-NAs with geNorm (SLqPCR package, Version 1.0.0) and NormFinder. The correlation between the "stability value" generated by NormFinder and the "M value" generated by geNorm was calculated by Pearson correlation coefficient. Spearman's rank-order correlation coefficient was calculated to assess the correlation between miRNAs ranking with geNorm and NormFinder.

Background
Horse breeds have undergone many outcrossing events throughout time. In a recent study, a region of introgression event was identified which is estimated by an evolutionary clock to have occurred about 500,000 years ago. Now we are searching for other events of introgression which might explain adaptive introgression in modern horses.

Materials and methods
Our study consists of 6 animals including three horses as well as three non-caballine equids, a kiang, zebra, and a Somali ass. Because our search is for events that happened so long ago, the haplotype blocks for which we are searching are likely to be much smaller than those that have been reported in humans such as the Human x Denisovans or Human x Neanderthal necessitating a different search strategy. Here we present an algorithm and preliminary results of our search for introgression of non-caballines in modern horses.

P8
Effects of ethanol and lipopolysaccharide on the renal cortex proteome and transcriptome The kidney is known to be damaged in end-stage liver disease caused by ethanol (EtOH) consumption (i.e. hepato-renal syndrome) [1]. However, the direct effects of EtOH on the kidney are unclear. Some studies suggest that EtOH damages the kidney (i.e. oxidative stress) [2,3]. These studies have largely been limited to hypothesisdriven approaches based on established mechanisms in other target organs; a discovery-based approach has not yet been applied. Additionally, the effects of EtOH on the kidney's response to secondary hits are unknown. We hypothesized that moderate EtOH consumption alters renal responses to acute lipopolysaccharide (LPS) exposure. This was investigated using a discovery-based proteomic and transcriptomic approach.

Materials and methods
Mice were pair fed EtOH-containing diet for 6 weeks and/or injected i.p. with LPS 4 h prior to sacrifice. Kidney cortex sections were isolated and snap frozen. Comparative Omics studies used (a) two-dimensional liquid chromatography-mass spectrometric analysis with an Orbitrap Elite and TMT labeling reagents and (b) mirVana™ isolation of total RNA, library preparation using the TruSeq Stranded Total RNA LT Sample Prep Kit-Set A with Ribo-Zero Gold and data collection using Illumina NextSeq 500/550 with the RNASeq protocol. Proteomic data were filtered by Benjamini-Hochberg (BH) corrected ANOVA p-value <0.05 and fold change (FC) ≥ 1.2. Transcriptomic data were filtered by q-value <0.05 and FC ≥ 2. Ingenuity Pathways Analysis (IPA) was used to identify pathways changed by EtOH and/or LPS. Individual protein changes were validated with immunoblot.

Results
EtOH and/or LPS exposure significantly changed expression of 853 of 47,719 transcripts. EtOH and/or LPS also impacted the proteome, significantly changing the abundance of 170 of 1863 proteins. The majority of these effects were LPS-derived. EtOH decreased the abundance of peroxisomal proteins, but this was not caused by a decrease in transcription; EtOH did not decrease expression of peroxisomal transcripts. Interestingly, LPS attenuated the decrease in peroxisomal protein abundance caused by EtOH. IPA results show that EtOH exposure downregulated the Nrf2 pathway, and EtOH preexposure attenuated LPS effects on LXR/RXR and Nrf2-mediated pathways. Western blot and immunohistochemistry of Nrf2 target proteins validated proteomic findings.

Conclusions
This study revealed new changes caused by EtOH and/or LPS in renal proteins, transcripts, and pathways. These changes provide insight into mechanisms by which EtOH affects the kidney and alters response to a second pathologic stimulus.

Background
Diseases such as cancers depend on enhanced nutrient utilization and metabolic reprogramming, the deciphering for which metabolomics at first appears to be well-suited. But each metabolite is used in numerous roles, so that even with excellent metabolite analytical data, the INFOR-MATION signal-to-noise of each metabolite is less than unity due to participation in multiple pathways. What is needed is to track the provenance of metabolite substructures that reveals the coupled metabolic reactions in a network. Stable isotope resolved metabolomics (SIRM) does just that, to distinguish pathways and functions among otherwise chemically identical metabolites. Materials and methods SIRM achieves functional proteomics via in situ enzyme assays, revealing up/down regulation of pathways in metabolic reprogramming. This is accomplished by incubating cell cultures with e.g. [U-13 C]-glucose or [U-13 C, 15 N]-glutamine [1], infusing human lung cancer patients with [U-13 C]glucose and resecting their tumor and non-tumor tissues [2,3], and incubation experiments of their resected tissue slices with [U-13 C]-glucose or [U-13 C, 15 N]-glutamine [3]. Some experiments simultaneously incubate the 13 C and 15 N and/or 2 H. Analyses of polar and non-polar metabolites are by multidimensional, multinuclear NMR (e.g. HSQC) combined with ultra-high resolution mass spectrometry; the latter requiring resolving power >400,000 to analyze multiple elemental isotopologues [4].

Results
Glucose vs glutamine nutrient sources of carbon in cancers has revealed key aspects of C, N resource allocation, including the up regulation of anapleurosis via pyruvate carboxylase in relation to glucose vs glutamine utilization [1]. In fact, we initially elucidated this pathway in lung cancers resected from 13 C glucose infused human subjects [2], and further detailed the metabolic reprogramming in humans, tissue slices, and cell cultures [3]. This demonstrates the unique ability of SIRM for untargeted discovery of metabolic networks directly in humans for maximal relevance, comparing the identical experiments in animal and cell models for unprecedented congruence of results. Among the key findings from simultaneous 13 C, 2 H experimentation is the mapping of preferred pools of metabolites for e.g. nucleotide vs protein synthesis; these apparent "channeling" pathways are previously unknown, and significant enough to impede or misdirect metabolic modeling.

Discussion
One of the greatest challenges is the metabolic landscape revealed by SIRM: that of shared/isolated metabolic pools corresponding to previously unknown dynamic compartmentation. This crucial area is poorly addressed by all of the omics, in part due to lack of extremely detailed, dynamic metabolic phenotype information, which SIRM can help to remedy.