1 Introduction
Catharanthus roseus is a medicinal plant species with many pharmaceutical applications. It produces over 100 terpenoid indole alkaloids (TIAs), including the two important anticancer bisindole alkaloids vinblastine and vincristine [1,2]. Because it is able to grow in wild habitats, this species is considered an attractive target for characterizing the genetic makeup and detecting important genes that might be involved in tolerance to a wide range of biotic and abiotic stresses [3,4].
The universal stress proteins (USPs) were reported to play a role in tolerating prolonged exposure to heat, starvation, different biotic/abiotic stresses or DNA damage [5]. However, the exact function of USPs remains to be elucidated. The USP domain is a protein structure that is widespread in prokaryotes. The name originally refers to USPA (universal stress protein A) of Escherichia coli, owing to its prominence in the stationary phase. The structure of a USP (namely 1MJH or MJ0577) has also been resolved in Methanococcus jannaschii [6]. Biochemical analysis showed that MJ0577 binds ATP; hence, it mediates ATP-dependent functions such as molecular switching. Another type of USP, found in Haemophilus influenzae (1JMV), cannot bind ATP. Many studies indicated that the plant USPA-like domain descends from a 1MJH-like ancestor [5,7]. The USPA-like domain of different organisms is found either alone as a small protein or linked to another domain, likely a protein kinase, downstream of the N-terminus of the protein [7]. Multiple sequence alignment and a topographic cladogram of the USPs in plants indicated the presence of protein kinase domain types 1.3.1, 1.3.2 or 1.4.1. Annotation of these domains indicated similarity in function to the tomato Pto interactor protein, Pti1 [8]. However, very little is known about the function of the uspA gene family in plants.
The number of pathway databases (PDBs) constructed from annotated genomes of biological systems is increasing [9–13]. More recently, Van Moerkercke et al. [14] conducted an RNA-Seq analysis to construct the CathaCyc database from the transcriptomes of different organs and hairy root genotypes of C. roseus. This database was utilized in the present work for the identification of putative USPA proteins in C. roseus.
In the present study, we detected and characterized 24 putative USPs in C. roseus from the SRA database available at the NCBI in order to determine their topology and possible functions.
2 Materials and methods
Nineteen raw RNA-seq datasets of C. roseus were retrieved from the SRA database (http://www.ncbi.nlm.nih.gov/sra, experiment SRP005953). RSEM (v1.1.6) in conjunction with the Bowtie aligner (v0.12.1) was used to map the reads against the transcripts and to estimate the relative abundances and expected read counts for the transcripts. The data were then merged and de novo assembled using Trinity-RNAseq (r2013_08_14).
Putative USPs harboring a USPA-like domain in C. roseus were characterized on the basis of the conserved USPA-like domain of 1MJH from Methanococcus jannaschii. Conserved USPA-like domain sequences were examined using the CDD of the NCBI (http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml) in conjunction with BLAST searches. The USPA-like domains of the recovered putative USPs in C. roseus were then compared to the two respective bacterial domains as well as to 24 randomly selected USPA-like domains from Arabidopsis. Sequences of the resulting uspA-like transcripts and proteins were deposited in the NCBI, and USPA domains were identified with the Transcoder program (http://www.wowza.com/transcodingsoftware). For comparison, the Arabidopsis sequences were retrieved from the Institute for Genomic Research (TIGR).
Multiple sequence alignment and phylogenetic tree construction were carried out for the USPA-like domains of the different putative USPs from C. roseus after removal of all other upstream and downstream protein sequences. Multiple sequence alignments among the USPA-like domains in C. roseus, those known in Methanococcus jannaschii (protein MJ0577 or 1MJH) and Haemophilus influenzae (1JMV), and the randomly selected USPA-like domains in Arabidopsis were done using DNAman 5.2.2 (http://www.lynnon.com). The results were then used to construct a phylogenetic tree by the neighbor-joining (NJ) method in MEGA version 3.1 [15] with bootstrap support at critical nodes (1000 replicates). ClustalW [16] was used to make the alignment, and the profile was generated using the method of Gribskov and Veretnik [17]. Alignments were reshuffled by bootstrap resampling (1000×), and bootstrap support values of 40% or higher were used.
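The bootstrap resampling step described above can be illustrated with a minimal sketch. It is not the MEGA or DNAman implementation: the toy sequences, the function names and the fixed random seed are assumptions made for illustration only. Alignment columns are sampled with replacement, and a simple p-distance (the fraction of differing sites) is recomputed on each pseudo-replicate.

```python
import random

def p_distance(a, b):
    """Fraction of compared columns at which two aligned sequences differ.

    Columns where either sequence has a gap ('-') are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)

def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one pseudo-replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}

# Hypothetical toy alignment, NOT real USPA-like domain sequences.
toy = {
    "dom_A": "MKILVAVDGS",
    "dom_B": "MKVLVAVDGS",
    "dom_C": "MRILLPID-S",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(1000)]

# Support for "dom_A is closer to dom_B than to dom_C" across replicates.
support = sum(
    p_distance(r["dom_A"], r["dom_B"]) < p_distance(r["dom_A"], r["dom_C"])
    for r in replicates
) / len(replicates)
print(f"bootstrap support: {support:.0%}")
```

In a full analysis each pseudo-replicate would be passed to a tree-building method such as neighbor-joining, and the support of a node would be the fraction of replicate trees that contain it.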
3 Results and discussion
3.1 Putative USPA structure in C. roseus
Putative USPAs in C. roseus were recovered from the SRA database of the NCBI and their structures were compared to those of Methanococcus jannaschii MJ0577 (1MJH) and Haemophilus influenzae (1JMV). Accordingly, 24 putative USP proteins with the Pfam USPA-like domain (PF00582) were identified in C. roseus (Table 1). Twenty-one of these proteins harbor full-length USPA domain sequences, two sequences (AID16052 and AID16053) are truncated in the N terminus and one sequence (AID16048) is truncated in both the N and C termini. These putative USPAs of C. roseus were arranged in six architectures (Table 1 and Fig. 1). As expected, the USPA-like domain is present in all the putative USPs analyzed. The PK-like (PF00069), tyrPK-like (PF07714) and U-box (PF04564) domains are found downstream of the USPA-like domain in four, one and two C. roseus putative USPs, respectively (Table 1). These numbers do not reflect the real presence of these domains in the C. roseus putative USPAs, as a few of the sequences were truncated downstream of the USPA-like domain. Therefore, our subsequent comparisons were based only on the USPA-like domains of the different putative USPs under study. In addition, we used USPA-like domains from Arabidopsis that are well known for the presence of different protein kinase (PK) types (designated 1.3.1, 1.3.2, 1.4.1 or “1MJH-like, plant” in Kerk et al. [7]) downstream of the USPA domain. PK-like and/or protein tyrosine kinase-like (tyrPK) sequences in putative USPAs of C. roseus exist downstream of the USPA-like domain [18]. The U-Box domain is usually detectable at the extreme C terminus of the PK. PK/U-Box structures were reported as modified zinc fingers that might play an important role in the process of protein ubiquitination. The U-Box portion of the molecule adopts a coiled-coil conformation, which mediates protein-protein interactions [19,20].
Table 1. Description of the 24 C. roseus putative USPA proteins and uspA-like transcripts recovered from the SRA database (http://www.ncbi.nlm.nih.gov/sra, experiment SRP005953) and deposited in the NCBI.
No. | Nucleotide accession no. | Protein accession no. | Length (aa) | Description | Group (d) |
1 | KJ634212 | AID16030 (a) | 175 | USPA-like | I |
2 | KJ634213 | AID16031 | 910 | USPA-like, CDC37_N, PK-like, U-Box | VI |
3 | KJ634214 | AID16032 | 166 | USPA-like | I |
4 | KJ634215 | AID16033 | 200 | USPA-like | I |
5 | KJ634216 | AID16034 | 770 | USPA-like, tyrPK-like | III |
6 | KJ634217 | AID16035 | 215 | USPA-like | I |
7 | KJ634218 | AID16036 | 240 | USPA-like | I |
8 | KJ634219 | AID16037 (a) | 163 | USPA-like | I |
9 | KJ634220 | AID16038 | 694 | USPA-like, PK-like | II |
10 | KJ634221 | AID16039 (a) | 163 | USPA-like | I |
11 | KJ634222 | AID16040 (a) | 177 | USPA-like | I |
12 | KJ634223 | AID16041 (a) | 169 | USPA-like | I |
13 | KJ634224 | AID16042 (a) | 164 | USPA-like | I |
14 | KJ634225 | AID16043 (a) | 167 | USPA-like | I |
15 | KJ634226 | AID16044 (a) | 163 | USPA-like | I |
16 | KJ634228 | AID16045 (a) | 180 | USPA-like | I |
17 | KJ634229 | AID16046 (a) | 277 | USPA-like | I |
18 | KJ634230 | AID16047 (a) | 277 | USPA-like | I |
19 | KJ634231 | AID16048 (a, b, c) | 103 | USPA-like | I |
20 | KJ634232 | AID16049 | 478 | TPR, USPA-like | IV |
21 | KJ634233 | AID16050 | 799 | USPA-like, ApoLp-III, PK-like, U-Box | V |
22 | KJ634234 | AID16051 | 497 | USPA-like, PK-like | II |
23 | KJ634235 | AID16052 (b) | 177 | USPA-like | I |
24 | KJ634236 | AID16053 (b) | 148 | USPA-like | I |
(a) Putative USPA proteins with 1MJH-like USPA domains.
(b) USPA-like domain missing amino acids at the C terminus.
(c) USPA-like domain missing amino acids at the N terminus.
(d) See Fig. 1.
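The domain counts quoted in the text can be checked against Table 1 with a short tally. The sketch below is only an illustration: the architectures and group labels are transcribed from the table, while the variable names and the data layout are assumptions.

```python
from collections import Counter

# Multi-domain proteins transcribed from Table 1: accession -> (domains, group).
# "TPR" follows the tetratricopeptide repeat (PF00515) named in the text.
MULTI_DOMAIN = {
    "AID16031": (["USPA-like", "CDC37_N", "PK-like", "U-Box"], "VI"),
    "AID16034": (["USPA-like", "tyrPK-like"], "III"),
    "AID16038": (["USPA-like", "PK-like"], "II"),
    "AID16049": (["TPR", "USPA-like"], "IV"),
    "AID16050": (["USPA-like", "ApoLp-III", "PK-like", "U-Box"], "V"),
    "AID16051": (["USPA-like", "PK-like"], "II"),
}

# Protein accessions run from AID16030 to AID16053; every accession not
# listed above carries the USPA-like domain only (group I).
table = {}
for n in range(16030, 16054):
    acc = f"AID{n}"
    table[acc] = MULTI_DOMAIN.get(acc, (["USPA-like"], "I"))

domain_counts = Counter(d for domains, _ in table.values() for d in domains)
group_counts = Counter(group for _, group in table.values())

print(domain_counts["PK-like"], domain_counts["tyrPK-like"], domain_counts["U-Box"])
print(dict(sorted(group_counts.items())))
```

The tally gives four PK-like, one tyrPK-like and two U-Box domains across six architecture groups, matching the counts stated above.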
Interestingly, three other domains were also shown to coexist with the USPA-like domain in three C. roseus putative USP sequences. These domains are tetratricopeptide repeat (TPR, PF00515), apolipophorin III (apoLp-III, PF07464) and Hsp90 co-chaperone Cdc37 (PF08565). TPR is a structural motif found in tandem arrays of 3–16 motifs [21], which fold together to produce a single, linear solenoid domain that serves as a protein-protein interaction module. Proteins with such domains can include protein kinase inhibitors, and there is evidence that they function in chaperone, cell cycle, transcription, and protein transport complexes. The TPR motif was also reported to represent an ancient protein-protein interaction module used by other proteins to trigger specific functions. Apolipophorin III proteins are a family of exchangeable apolipoproteins that are functionally important in lipid transport and storage and in lipoprotein metabolism in eukaryotes, including insects [22]. The Cdc37 domain belongs to an Hsp90 co-chaperone protein that was reported to participate in the cell cycle and signal transduction. It has been shown to form a complex with Hsp90 and a protein kinase to direct Hsp90 to its target kinase [23,24].
3.2 Multiple sequence alignment and phylogenetic classification of USPA-like domains
The USPA-like domains of the 24 C. roseus putative USPs were aligned together with those of Methanococcus jannaschii, Haemophilus influenzae and Arabidopsis (Fig. 2). The alignment was annotated based on the known features of the 1MJH secondary structure, with five β strands alternating with four α helices [6]. The general structure of the C. roseus USPA-like domains involved the eight common residues required for ATP binding in 1MJH (D13, V41, G127, G130, G140, S141, V142 and T143). These preliminary data indicate the possibility that most of the plant USP sequences might have arisen from a 1MJH-like ancestor. Following the classification of Kerk et al. [7], the USPA-like domains of each of the three types 1MJH-like, 1.3.2 and 1.4.1 aligned as a separate group of USPAs, whereas those of types 1.3.1 and S (sequences not present in the PlantsP plant phosphorylation database [25]) merged into one group. The USPA-like domains of C. roseus aligned close to those of the USPA-like domain types indicated above, except for AID16032 and AID16035, which did not align close to any of these types. We speculate that the latter two accessions might not be USPs. In general, 1MJH was more closely related to the USPA-like domains of C. roseus than was 1JMV (Fig. 2).
The eight residues required for ATP binding in the USPA structure of 1MJH were also examined in the USPA-like domains of C. roseus (Fig. 2). Accession AID16035 has none of the eight ATP-binding residues. Accession AID16032 has only two of them, D13 and G140. The first residue, D13, coordinates a Mn2+ ion, which binds the phosphate groups. This residue is conserved in the different groups of USPA-like domains in C. roseus, except for AID16030 of the 1MJH-like 3 type, AID16051 of the 1.3.2 type, and AID16053, AID16031 and AID16050 of the 1.4.1 type. Position V41 of 1MJH, which binds adenine, is not conserved in AID16037 and AID16042 of the 1MJH-like 2 type, AID16030 of the 1MJH-like 3 type, AID16049 of the 1.3.1 or S type and AID16050 of the 1.4.1 type. The G127 position, which binds ribose, is conserved in all the sequences except AID16032 and AID16035. Position G130 of 1MJH, which binds the beta phosphate, is conserved in the accessions of the three 1MJH-like groups, except for AID16041 and AID16044 of 1MJH-like 1. It is also conserved in AID16051 of the 1.3.2 type and AID16038 of the 1.4.1 type. The G140 position is conserved only in accessions of the three 1MJH-like groups and in accession AID16032 of the 1.3.1 or S type. This residue does not bind ATP, but it is conserved in all the 1MJH-like bacterial and plant sequences. Position S141 of 1MJH, which binds the gamma phosphate, is conserved only in the 1MJH-like sequences, except for AID16030 of 1MJH-like 3. The V142 position of 1MJH is conserved in all C. roseus accessions of the 1MJH-like type and in accession AID16051 of the 1.3.2 type. The T143 position, which binds the alpha phosphate, is conserved only in accessions AID16042 of the 1MJH-like 2 type and AID16038 of the 1.4.1 type. Otherwise, this residue is replaced by S in accessions of the 1MJH-like type or by A/S in accessions of the other types. In summary, most accessions of the 1MJH-like type contain all ATP-binding residues.
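A per-residue comparison of this kind can be sketched as follows. This is an assumption-laden illustration, not the procedure used in the study: the function names are hypothetical and the short reference and query strings are made up, so the site numbers in the toy example do not correspond to 1MJH. Reference positions are counted on the ungapped reference sequence and mapped to alignment columns before the query residue is inspected.

```python
def column_of(ref_aligned, position):
    """Return the alignment column holding a 1-based ungapped reference position."""
    seen = 0
    for col, residue in enumerate(ref_aligned):
        if residue != "-":
            seen += 1
            if seen == position:
                return col
    raise ValueError(f"reference has fewer than {position} residues")

def conserved_sites(ref_aligned, query_aligned, sites):
    """Map each site label (e.g. 'D13') to True if the query has the same residue."""
    result = {}
    for position, residue in sites.items():
        col = column_of(ref_aligned, position)
        if ref_aligned[col] != residue:
            raise ValueError(
                f"reference has {ref_aligned[col]} at {position}, not {residue}"
            )
        result[f"{residue}{position}"] = query_aligned[col] == residue
    return result

# Hypothetical toy data: a short made-up reference, NOT the 1MJH sequence.
# With the real alignment, `sites` would be
# {13: "D", 41: "V", 127: "G", 130: "G", 140: "G", 141: "S", 142: "V", 143: "T"}.
ref = "MD-VKGSGSVT"
query = "MDAVRGAGSVS"
sites = {2: "D", 3: "V", 5: "G", 7: "G", 8: "S", 9: "V", 10: "T"}

report = conserved_sites(ref, query, sites)
print(report)
```

In the toy query the final threonine is replaced by serine, which mirrors the T-to-S substitution reported for position T143.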
Based on the nine known features of the 1MJH secondary structure (five β strands and four α helices), the numbers of residues common to the USPA-like domains of 1MJH and plants were 2 ((I/V) - - - D) in β1, 3 ((A/S) - - A(L/V)) in α1, 6 (VILLH(V/I)) in β2, 0 in α2, 1 (I/V/L) in β3, 2 ((I/L) - - - - - - - - (V/I/L)) in α3, 6 ((V/I/L)(I/L)(I/V)(M/L/V)GS) in β4, 2 (S - (T/S/A)) in α4 and 3 (V - (V/A)(V/I)) in β5. A few residues are common to the USPA-like domains of 1JMV and plants, in structures β1 (V - (V/I)), α1 (A(V/L/I)) and α3 ((D/E) - - - - - - - - (I/V)). In addition, a few residues are plant-specific, e.g., K/R in β1, W - - - (N/H) in α1, I/V in α3, L/V/I between α3 and β4, S/A in α4 and C - - - - - (R/K) in β5 (Fig. 2).
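The motif notation used above, in which alternatives are given in parentheses and each dash marks one unconstrained position, maps directly onto regular expressions. The sketch below shows one way to do the conversion; the function name and the toy sequence are hypothetical and are not taken from the study.

```python
import re

def motif_to_regex(motif):
    """Translate notation such as '(I/V) - - - D' into a regular expression."""
    pattern = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "(":
            # '(I/V)' becomes the character class '[IV]'.
            end = motif.index(")", i)
            options = motif[i + 1:end].split("/")
            pattern.append("[" + "".join(options) + "]")
            i = end + 1
        elif ch in "-\u2013":
            # A hyphen or en dash marks one unconstrained position.
            pattern.append(".")
            i += 1
        elif ch.isspace():
            i += 1
        else:
            # A plain letter is a fixed residue.
            pattern.append(ch)
            i += 1
    return "".join(pattern)

beta1 = motif_to_regex("(I/V) - - - D")
beta4 = motif_to_regex("(V/I/L)(I/L)(I/V)(M/L/V)GS")
print(beta1, beta4)

# Hypothetical toy sequence, NOT a real USPA-like domain.
match = re.search(beta1, "MKKIVVAVDGS")
print(match.group(), match.start())
```

Scanning each aligned domain with the resulting patterns gives a quick check of which secondary-structure motifs a sequence retains.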
The phylogenetic tree was constructed from the multiple sequence alignment (Fig. 3). The data indicated several distinct groups of USPA-like domains, confirming the high level of sequence conservation between the plant and bacterial USPA-like sequences. The results indicated a complete distinction among the different types of USPA-like domain, except for 1.3.1 and S, whose domains were located in the same cluster. Kerk et al. [7] indicated that the sequences of types 1.4.1, 1.3.1 and 1.3.2 have diverged considerably from those of the small (S) Arabidopsis proteins. In the present study, we did not find this divergence, as the USPA-like sequences of type 1.3.1 were in the same cluster as those of type S in both C. roseus and Arabidopsis. The 1MJH-like type of USPA-like domain was subdivided into three groups, namely 1MJH-like 1, 2 and 3. Two C. roseus accessions, AID16032 and AID16035, did not fit within any of the known types of USPA-like domains. Thirteen C. roseus accessions were more closely related to the bacterial 1MJH. The rest were closely related to USPA-like types 1.3.1, 1.3.2, 1.4.1 and/or S. This indicates that the latter USPA-like types might not have originated from the bacterial 1MJH. Instead, the domain types 1.3.1, 1.3.2 and S were more closely related to the bacterial 1JMV. The radial phylogenetic tree confirmed the close relationship between plant USPA-like domains and 1JMV, where accession AID16035, followed by the USPA-like type 1.3.1 or S, was the closest to the bacterial 1JMV. However, the overall results indicate that accessions AID16032 and AID16035 might not harbor a USPA domain.
We speculate that functional analysis of the C. roseus putative USPAs should involve the accessions related to 1MJH, as they were shown to harbor the basic elements for ATP binding, which is the main feature of USPs in conferring tolerance against biotic and abiotic stresses. The biochemical function(s) of USPs have not been resolved to date. There is evidence for a possible role of the small Arabidopsis putative USPs in stress responses [7]. Hohnjec et al. [26] identified a possible derivative of 1MJH, namely the VfENOD18 protein, in the broad bean (Vicia faba) that possesses an ATP-dependent function. Kerk et al. [7] indicated the possible functions of AT3G53990 and AT3G17020 in Arabidopsis, based on their sequence homology with ATE1 and ATE3, respectively. The first was reported to be involved in the post-translational conjugation of arginine to the N-terminal aspartic or glutamic acid of a protein and to be required for degradation of the protein via the ubiquitin pathway (http://www.uniprot.org/uniprot/Q9ZT48). Sawa et al. [27] indicated that the ATE genes are generally responsible for repression of trans-differentiation into xylem cells in Arabidopsis (http://www.arabidopsis.org/servlets/TairObject?name=ATE3&type=locus). In the present study, AT3G53990 was closely related to AID16042 and AT3G17020 was closely related to AID16037 (Fig. 3). Therefore, we expect that these two C. roseus putative USPAs might have the same functions. Zegzouti et al. [28], in their study of ethylene-induced gene expression in tomato (Lycopersicon esculentum), identified a transcript, namely ER6, that was up-regulated under ethylene treatment. Kerk et al. [7] indicated that accession AT1G09740 is a homolog of the ER6 protein. The present results indicated that this Arabidopsis accession is closely related to C. roseus accession AID16041 (Fig. 3). Therefore, we speculate that the latter might possess the same function.
There are other Arabidopsis protein accessions with suggested functions (Table S1). For example, AT3G62550 (closely related to C. roseus AID16044) is ER6-like, AT3G11930 (closely related to C. roseus AID16045) is ethylene-responsive and AT2G21620 (closely related to C. roseus AID16040) might respond to severe desiccation and auxin treatment [7,29]. In conclusion, structural and phylogenetic analyses indicated that many USPA-like domains in C. roseus originated from a 1MJH-like ancestor. The functions of many of them have been inferred from the known functions of their analogs in Arabidopsis.
Disclosure of interest
The authors declare that they have no conflicts of interest concerning this article.
Acknowledgments
This project was funded by the Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, under Grant No. (15-3-1432/HiCi). The authors, therefore, acknowledge with thanks DSR technical and financial support.