Comparative transcriptome analysis of tree Eucalyptus species using RNAseq technology: analysis of genes interfering in wood quality aspects
© Salazar et al; licensee BioMed Central Ltd. 2011
Published: 13 September 2011
Eucalyptus wood is one of the most important raw materials for the pulp and paper industry. Brazil is currently the leading producer of short-fiber pulp and ranks sixth in total cellulose production. To maintain this industrial competitiveness, investment in genomic research started in 2002 with the GENOLYPTUS project (Brazilian Network of Eucalyptus Genome Research). Recently, a new transcriptome library was generated by next-generation RNA sequencing with Illumina’s sequencing-by-synthesis technology.
Different species of Eucalyptus are recognized for their superior characteristics in terms of growth, wood quality and resistance to different types of stress (1). Such features are probably driven by the coordinated expression of numerous structural and regulatory genes involved in xylogenesis. Therefore, the main purpose of this study is to identify genes and key metabolic compounds directly involved in wood quality, as well as the transcription factors involved in these processes. The RNAseq database was mined extensively to identify sequences overexpressed in xylem and sequences differentially expressed between species.
Genolyptus Sanger-sequenced ESTs (167,271) and NCBI Eucalyptus ESTs (36,981) were assembled using CAP3 (2). All unigenes were automatically annotated using BLAST (3) (e-value cutoff of 1e-5) against protein databases, including the non-redundant (NR) database, UniRef (4), Pfam (5) and KEGG (6). In addition, a functional annotation was performed using the Blast2GO software (7). The RNA-Seq reads produced from three different xylem libraries (Eucalyptus globulus, E. grandis and E. urophylla) were aligned against the assembled unigenes using the SOAP2 aligner (8), configured to allow up to two mismatches, discard sequences containing “N”s and return all optimal alignments. To analyze differential expression between libraries, a normalization and statistical pipeline was applied using the DEGseq software (9), with a 99% confidence level (cutoff of 0.01). From this analysis we obtained xylem genes and transcription factors differentially expressed between the three species.
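The library comparison step can be illustrated with a minimal Python sketch. It is not the DEGseq method: it swaps in a plain two-proportion z-test on library-size-normalized read counts, and it treats the 0.01 cutoff as a two-sided p-value threshold, which is an assumption. The function names, the input layout (a dictionary of mapped read counts per unigene) and the 0.5 pseudocount are hypothetical choices made for illustration.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


def differential_expression(counts_a, counts_b, alpha=0.01):
    """Compare per-unigene read counts from two libraries.

    counts_a, counts_b: dict mapping unigene id -> mapped read count.
    Returns (unigene, log2 fold change, p-value) tuples for unigenes whose
    two-proportion z-test p-value is below alpha (assumed cutoff: 0.01).
    NOTE: illustrative stand-in only, not the DEGseq algorithm.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    hits = []
    for gene in sorted(set(counts_a) | set(counts_b)):
        a = counts_a.get(gene, 0)
        b = counts_b.get(gene, 0)
        if a + b == 0:
            continue
        # Normalize by library size: share of mapped reads on this unigene.
        share_a = a / total_a
        share_b = b / total_b
        pooled = (a + b) / (total_a + total_b)
        se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
        if se == 0:
            continue
        p_value = two_sided_p((share_a - share_b) / se)
        if p_value < alpha:
            # Hypothetical 0.5 pseudocount avoids log2(0) for zero counts.
            lfc = math.log2(((a + 0.5) / total_a) / ((b + 0.5) / total_b))
            hits.append((gene, lfc, p_value))
    return hits
```

A positive log2 fold change means the unigene has a larger share of reads in the first library. A real analysis would also need to handle reads that map to several unigenes, since the aligner was set to return all optimal alignments.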
Results and discussion
In the E. globulus vs. E. grandis comparison, most genes fall into the macromolecule metabolic process category, which includes genes for pectin, cellulose and hemicellulose metabolism as well as transcription factors involved in these pathways. Over 10% of these genes are overexpressed in E. globulus. In the metabolic cellular process category, over 30% of the genes are overexpressed in E. globulus. In the E. urophylla vs. E. grandis comparison, the metabolic cellular process category accounts for a considerable share of the total number of contigs; however, the number of genes overexpressed in E. urophylla is much lower. This may indicate that genes participating in these pathways contribute to the distinctive wood qualities of E. globulus.
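Per-category percentages of this kind can be tallied from an annotated gene list. The sketch below assumes a hypothetical record layout of (unigene, functional category, species with higher expression), where the last field is None for genes that are not differentially expressed. The function name and layout are illustrative only and are not taken from the study's pipeline.

```python
from collections import defaultdict


def overexpressed_share(records, species):
    """Fraction of genes per category that are overexpressed in `species`.

    records: iterable of (unigene, category, higher_in) tuples, where
    higher_in names the species with higher expression, or is None when
    the gene is not differentially expressed (hypothetical layout).
    """
    totals = defaultdict(int)
    over = defaultdict(int)
    for _gene, category, higher_in in records:
        totals[category] += 1
        if higher_in == species:
            over[category] += 1
    return {category: over[category] / totals[category] for category in totals}
```

Calling `overexpressed_share(records, "E. globulus")` returns one fraction per category, which can be multiplied by 100 to compare against percentages like those reported above.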
These results may contribute to the understanding of wood formation processes and possibly help guide the improvement of wood quality. Gains in wood quality and productivity have significant economic impact, especially in the pulp and paper industry.
- Grattapaglia D, Kirst M: Eucalyptus applied genomics: from gene sequences to breeding tools. New Phytologist. 2008, 179: 911-929. 10.1111/j.1469-8137.2008.02503.x.
- Huang X, Madan A: CAP3: A DNA Sequence Assembly Program. Genome Research. 1999, 9: 868-877. 10.1101/gr.9.9.868.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic Local Alignment Search Tool. Journal of Molecular Biology. 1990, 215: 403-410.
- Suzek BE, Huang H, McGarvey P, Mazumder R, Wu CH: UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics. 2007, 23 (10): 1282-1288. 10.1093/bioinformatics/btm098.
- Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L, Griffiths-Jones S, Howe KL, Marshall M, Sonnhammer ELL: The Pfam Protein Families Database. Nucleic Acids Research. 2002, 30 (1): 276-280. 10.1093/nar/30.1.276.
- Kanehisa M, Goto S: KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research. 2000, 28 (1): 27-30. 10.1093/nar/28.1.27.
- Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M: Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005, 21 (18): 3674-3676. 10.1093/bioinformatics/bti610.
- Li R, Yu C, Li Y, Lam T-W, Yiu S-M, Kristiansen K, Wang J: SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics. 2009, 25 (15): 1966-1967. 10.1093/bioinformatics/btp336.
- Wang L, Guo K, Li Y, Tu Y, Hu H, Wang B, Cui X, Peng L: Expression profiling and integrative analysis of the CESA/CSL superfamily in rice. BMC Plant Genomic. 2010, 10 (282): 1-16.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.