Amino Acids Sequence Based In Silico Analysis of RuBisCO (Ribulose-1,5 Bisphosphate Carboxylase Oxygenase) Proteins in Some Carthamus L. ssp.

RuBisCO is an important enzyme that enables plants to photosynthesize and helps balance carbon dioxide in the atmosphere. This study aimed to perform sequence, physicochemical, phylogenetic and three-dimensional (3D) comparative analyses of RuBisCO proteins in Carthamus ssp. using various bioinformatics tools. The sequence lengths of the RuBisCO proteins were between 166 and 477 amino acids. Their molecular weights (Mw) ranged from 18,711.47 to 52,843.09 Da; the most acidic and most basic protein sequences were detected in C. tinctorius (pI = 5.99) and C. tenuis (pI = 6.92), respectively. The extinction coefficients of the RuBisCO proteins at 280 nm ranged from 17,670 to 69,830 M⁻¹ cm⁻¹, the instability index (II) values ranged from 33.31 to 39.39, and the GRAVY values ranged from -0.313 to -0.250. The most abundant amino acid in the RuBisCO proteins was Gly (9.7%), while the least abundant was Trp (1.6%). The putative phosphorylation sites of the RuBisCO proteins were determined with NetPhos 2.0. Phylogenetic analysis revealed that the RuBisCO proteins formed two main clades. RAMPAGE analysis revealed that 96.3%-97.6% of residues were located in the favoured region. The three-dimensional (3D) structures of the RuBisCO proteins were predicted by homology modelling and visualized with PyMOL. The results of the current study provide insights into the fundamental characteristics of RuBisCO proteins in Carthamus ssp.


Introduction
Photosynthesis is arguably the most important energy conversion process on Earth because the chemical energy it yields is the base of the food chains that sustain the overwhelming majority of other life forms. Plants utilize atmospheric CO2 to liberate oxygen and synthesize carbohydrates during photosynthesis. In this process, the radiant energy of sunlight is used to convert carbon dioxide into photosynthetic products (Naeem et al., 2013). Ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO, EC 4.1.1.39) catalyses the first reaction of carbon dioxide fixation in the dark reactions of photosynthesis, converting carbon dioxide and ribulose-1,5-bisphosphate (RuBP) into two molecules of 3-phosphoglyceric acid. In addition, RuBisCO catalyses the reaction of oxygen with RuBP to yield phosphoglyceric acid and phosphoglycolic acid, which is the first reaction of photorespiration. RuBisCO is therefore the key enzyme determining photosynthetic efficiency, because it regulates both photosynthesis and photorespiration.
Based on differences in their primary and quaternary structures, RuBisCOs can be divided into three forms. Form I occurs in higher plants and most prokaryotes and consists of eight large subunits (50-60 kDa) and eight small subunits (12-18 kDa) in a structure with square symmetry (L8S8) (Andersson et al., 1989). Form II was discovered in purple non-sulfur photosynthetic bacteria and is composed of only two large subunits (L2). Form III was identified more recently in Thermococcus kodakaraensis (Kitano et al., 2001); it likewise consists only of large subunits, with no small subunit, and has an (L2)5 structure (Zhang et al., 2011). The large subunits of RuBisCO (rbcL) are encoded in the chloroplast, whereas the small subunits (rbcS) are encoded in the nucleus. Although both subunits are important for the functionality of the protein, the transcript level of rbcS has been considered a factor determining the level of RuBisCO in plants (Mukherjee et al., 2015). The rbcS promoter has therefore been a popular genetic element in molecular studies (Avci and Tezcan, 2016).
The Asteraceae family is considered to be the largest known plant family in the world (Nylinder and Anderberg, 2015) and comprises approximately 23,000 species in 1,535 genera (Öztürk and Çetin, 2013). The family contains food plants, sources of raw materials, medicinal plants, tender and succulent plants, wild weeds and poisonous plants (Süslü et al., 2010). Edible products such as honey and cooking oil are obtained from members of this family, which are also used in fields such as the pharmaceutical industry. In addition, many species of Asteraceae are cultivated as ornamental plants (Paksoy et al., 2016).
Carthamus L. is a genus belonging to the tribe Cynareae of the Asteraceae family. The eastern part of the Mediterranean region is regarded as the centre of origin of this genus. The genus includes about 25 species, distributed from Spain and North Africa across the Middle East to northern India (Ashri, 1960; Yue et al., 2013). Among Carthamus species, the oil of Carthamus tinctorius (safflower) is suitable for biodiesel production and is used as an industrial good. Carthamin obtained from safflower is important as a raw material for natural dye (Nagaraj et al., 2001). In addition, the plant itself is a valuable ornamental used as a green fence and as a dried flower. Its residue is a valuable feedstuff, and its stalks are used as fuel. Furthermore, it is an oil plant whose importance is rapidly increasing worldwide, owing to its good adaptation to arid regions and the high-quality oil obtained from its seeds (Yildirim et al., 2005; Uysal et al., 2006). In this study, RuBisCO was analysed in economically valuable Carthamus species with respect to physicochemical, phylogenetic and 3D structural properties using bioinformatics tools.

Materials and Methods
The RuBisCO protein sequences of Carthamus species were retrieved in FASTA format from the National Center for Biotechnology Information (NCBI: https://www.ncbi.nlm.nih.gov/protein). The NCBI and UniProt accession numbers of the selected proteins are given in Table 1. The physicochemical properties and amino acid contents of the proteins were analysed with ExPASy's ProtParam (http://web.expasy.org/protparam/) to determine their isoelectric point (pI), molecular weight (Mw), total number of positively (+R) and negatively (-R) charged residues, extinction coefficient (EC), instability index (II), aliphatic index (AI), and GRAVY (grand average of hydropathy) values. The putative phosphorylation sites of the RuBisCO proteins were determined with NetPhos 2.0 (http://www.cbs.dtu.dk/services/NetPhos/).
All the protein sequences were aligned using ClustalW (Thompson, 1994) and MEGA 6.0 (Tamura et al., 2013). A phylogenetic tree of the Carthamus RuBisCO proteins was constructed using the maximum likelihood method in MEGA 6.0, with bootstrap support calculated from 1000 replicates.
To predict the 3D structures of the RuBisCO proteins, homology models were built using the following method options on the PSIPRED server (http://bioinf.cs.ucl.ac.uk/psipred/): PSIPRED v3.3 (Predict Secondary Structure) and BioSerf v2.0 (Automated Homology Modelling). The models were checked and verified by Ramachandran plot analysis in RAMPAGE (http://mordred.bioc.cam.ac.uk/~rapper/rampage.php), which was used to select the best predicted models. Finally, 3D comparative analyses were performed using PyMOL (TM) (Schrödinger, LLC).

Results
The physicochemical analysis of the predicted RuBisCO proteins was performed using ExPASy ProtParam, and the results are shown in Table 1. The amino acid sequence length ranged from 166 to 477. The shortest sequence was in C. tenuis (166 aa), while the longest was in C. tinctorius (477 aa). The minimum and maximum molecular weights (Mw) were 18,711.47 and 52,843.09 Da, respectively. The most acidic and most basic protein sequences were in C. tinctorius (pI = 5.99) and C. tenuis (pI = 6.92), respectively. The extinction coefficients of the RuBisCO proteins at 280 nm ranged from 17,670 to 69,830 M⁻¹ cm⁻¹. The highest extinction coefficients were detected in C. duvauxii, C. oxyacanthus and C. tinctorius, while the lowest was in C. tenuis. The instability index (II) values for the RuBisCO proteins ranged from 33.31 to 39.34, all below the value of 40 above which ProtParam classifies a protein as unstable.
The aliphatic index (AI) of proteins of thermophilic bacteria has been found to be higher, and the index can be used as a measure of the thermostability of proteins. This index is directly related to the mole fraction of Ala, Ile, Leu and Val in the protein (Idicula-Thomas and Balaji, 2005). The AI values ranged from 77.31 to 79.17. The highest AI values were seen in C. duvauxii and C. oxyacanthus, while the lowest AI value was seen in C. turkestanicus (Table 1). As expected, when the total proportions of aliphatic amino acids were compared between the proteins with the lowest and highest AI values, the AI values increased as the aliphatic amino acid content increased (Table 2). The GRAVY values of the RuBisCO proteins ranged from -0.313 to -0.250 (Table 1). A lower GRAVY value implies that the protein is more soluble in water than those with a higher GRAVY value. The most abundant amino acid in the RuBisCO proteins was Gly (9.7%), while the least abundant was Trp (1.6%) (Fig. 1).
The putative phosphorylation sites were determined using the NetPhos 2.0 server based on a score above 0.8 (Table 3). The most phosphorylation sites were predicted in C. tinctorius (Fig. 2). The confidence scores that these were true phosphorylation sites were above the threshold (0.5); the output score is given in the range 0.0-1.0.
For the phylogenetic analysis, MEGA 6.0 was used. A maximum likelihood tree with bootstrap analysis was constructed to identify the relationships among Carthamus ssp. The RuBisCO protein sequences formed two main clades, and Clade 2 consisted only of C. oxyacanthus (Fig. 3).
The three-dimensional structures of the RuBisCO proteins were visualized using PyMOL, and the alpha-helix and beta-sheet structures were displayed (Fig. 4). The three-dimensional structure of a protein contributes to the understanding of protein function and active regions and facilitates drug design (Filiz and Koç, 2014). In the model validation, Ramachandran plot analysis using the RAMPAGE server showed that 97.6%, 97.5%, 97.4%, 97.0% and 96.3% of residues were in the favoured region; 2.2%, 2.1%, 1.7%, 2.1% and 2.4% in the allowed region; and 0.2%, 0.4%, 0.9%, 0.9% and 1.2% in the outlier region for C. turkestanicus, C. tinctorius, C. duvauxii, C. oxyacanthus and C. tenuis, respectively (Fig. 5), indicating that the 3D models were of fairly good quality.
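The aliphatic index, GRAVY and extinction coefficient values reported in the Results can be illustrated with a short calculation. The following is a minimal Python sketch, not the code used in this study (the values here were obtained from the ProtParam web server). It assumes the standard definitions: the aliphatic index of Ikai with coefficients 2.9 and 3.9, the Kyte-Doolittle hydropathy scale for GRAVY, and molar extinction contributions of 1490 (Tyr) and 5500 (Trp) M⁻¹ cm⁻¹ with all cysteines reduced. The function names and the example peptide are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale (assumed here as the scale behind GRAVY)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def aliphatic_index(seq: str) -> float:
    """AI = X(Ala) + 2.9 * X(Val) + 3.9 * (X(Ile) + X(Leu)), X in mole percent."""
    counts = Counter(seq)
    n = len(seq)

    def mole_percent(residue: str) -> float:
        return 100.0 * counts[residue] / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over all residues."""
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


def extinction_coefficient_reduced(seq: str) -> int:
    """Extinction coefficient at 280 nm (M^-1 cm^-1), assuming no disulfide bonds."""
    counts = Counter(seq)
    return 1490 * counts["Y"] + 5500 * counts["W"]


if __name__ == "__main__":
    # Hypothetical peptide for illustration only; not a Carthamus sequence.
    example = "MKVLAAGIWYD"
    print("AI    :", round(aliphatic_index(example), 2))
    print("GRAVY :", round(gravy(example), 3))
    print("EC280 :", extinction_coefficient_reduced(example))
```

Because the aliphatic index depends only on the mole percent of Ala, Val, Ile and Leu, two sequences with similar proportions of these residues give similar AI values, which matches the narrow range (77.31 to 79.17) observed among the Carthamus proteins.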

Conclusions
In this study, an in silico analysis of the RuBisCO protein in Carthamus ssp. was carried out using bioinformatics tools such as ExPASy's ProtParam, MEGA 6.0, NetPhos 2.0, PSIPRED v3.3, RAMPAGE, and PyMOL. The results of this study pave the way for further research into the RuBisCO protein in different plant species.

Fig. 2. Putative phosphorylation sites of RuBisCO proteins in Carthamus ssp., determined by a score above a threshold of 0.5

Fig. 4. The comprehensive 3D structures of Carthamus ssp. RuBisCO proteins. The visual data were obtained from PSIPRED software and analysed using PyMOL

Table 1. The physicochemical properties of RuBisCO proteins from Carthamus ssp.

Table 2. The RuBisCO proteins with aliphatic index (AI) values and their corresponding number of aliphatic residues

Table 3. Putative phosphorylation residues in RuBisCO protein sequences of Carthamus ssp. with a score above 0.8