Computational study of environmental stress-related transcription factor binding sites in the promoter regions of maize auxin response factor (ARF) gene family

Auxin response factors (ARF) gene family plays key roles in plant development and act as transcription factors (TFs) in the regulation of gene expression. An extensive bioinformatics analysis including analysis of conserved motifs, chromosomal map, phylogenetic relationships, and expression profiles were performed for the maize ARF gene family. In this study, a set of publicly available 38 ARF maize (Zea mays) nucleotide sequences were downloaded. Using microarray data, a bioinformatics search for identification of TFBs in ARF genes using plant promoter analysis (PlantPAN) was carried out. The 38 maize ARF genes were categorized into three groups (Class I, II, and III). ARF genes have been studied by molecular methods in several different plant species however to better understand the mechanisms of these genes more studies are needed. Gene cluster analysis showed that the same set of genes on the chromosomes were positively correlated with the same number of gene clusters. Several TFBs including AP2/ERF, ERF, WRKY, bZIP, bHLH, GATA, and NAC were identified in the promoter regions. These TFBs are responsible for modulation of several biotic stressresponsive genes. The main aims of the present study were to obtain genomic information for the ZmARF gene family and their expression under abiotic and biotic stresses.


Introduction
Auxin, a plant hormone involved in various stages of plants growth and development, is involved in apical dominance, tropisms, and vascular patterning. Auxin plays a key role in cell division and cell expansion during the developmental stages and in regulation of a variety of physiological processes including lateral root initiation and shoot elongation (Di et al., 2016). During the last decade, many studies have reported on the rapid effects of auxin on gene expression and regulation. It has been shown that auxin is involved in regulation of a large number of genes having roles in growth and developmental processes in Arabidopsis and other plant species (Guilfoyle and Hagen, 2012;Di et al., 2016). According to many reports, these processes are controlled by genes responsive to auxin at the molecular level.
Auxin response factors (ARF) are TFs that regulate auxin responsive genes. So far, members of ARF cascades from Arabidopsis, rice, tomato and maize have been identified (Wang et al., 2007). ARFs are involved in diverse biological and physiological processes and are essential for their potential applications for the development of improved stress-tolerant transgenic crop plants. ARFs genes can act as either an activator or a repressor of the transcription of auxin-responsive genes (Saidi et al., 2020). ARF regulates early responsive genes such as Aux/IAA, GH3, and the smallest auxin uptake (SAUR) gene family, activating or inhibiting these genes by binding to auxin-responsive elements (AuxREs). AuxREs elements may include the following sequences of TGTCTC, TGTCAC and/or TGTCCC elements in their promoter region (Hagen and Guilfoyle, 2002).
Auxin transcription factors are considered as significant gene family in response to auxin .
The first step in identifying and studying gene function is the use of data from databases using bioinformatics tools. The main aim of bioinformatics is to increase the understanding of biological processes, co-expressed gene network, and prediction of TFBs (Hogeweg, 2011).
The presence of TFBs in many of the studied genes causes the plant to respond appropriately to abiotic and biotic stresses such as physiological processes and plant growth and reproduction (Greenham and McClung, 2015). The promoter analysis of auxin-responsive genes makes identification of TFBs or AuxREs possible (Hagen and Guilfoyle, 2002). Recently, this gene family has extensively been investigated using molecular and bioinformatics analysis. Molecular analyses of ARF genes on several crops such as citrus, Medicago truncatula, and Gossypium raimondii have been performed Sun et al., 2015).
Most ARF proteins are made of three domains namely N-terminal DNA-binding domain (DBD), the variable middle region that functions as an activation domain (AD) or repression domain (RD), and a carboxyterminal dimerization domain (CTD) which is involved in protein-protein interactions by dimerizing with auxin/indole-3-acetic acid (Aux/IAA) gene family (Greenham and McClung, 2015). In some studies of A. thaliana, it has been indicated that glutamine (Q)-rich middle regions act as activation regions but that serine (S)-rich, serine and proline (SP)-rich regions function as repression regions in ARFs (Tiwari et al., 2003).
Several ARFs also play an essential role in reproductive development. Recently, functional studies have demonstrated that ARF genes play key role in signal transduction during plant organ development. In Arabidopsis thaliana, ARF2 expresses leaf senescence (Lim et al., 2010), root formation and flower organ senescence (Ren et al., 2017).
Arabidopsis ARF10 and ARF16 have essential roles in the process of root development, and the auxin signals cannot bypass them to initiate cell production . Furthermore, ARF7 and ARF19 may regulate Arabidopsis lateral root (LR) formation initiation by interacting with IAA14 (Ito et al., 2016).
Several ARFs also play an essential role in reproductive development. In Arabidopsis, ARF6 and ARF8 promote jasmonic acid production and flower maturation  and influence the developing floral organs (Tabata et al., 2010).
In this study, we investigated the maize ARF gene number, chromosomal map, and promoter analysis. The phylogenetic analysis of ARF genes would be useful for studying and understanding of their roles in plants.
The study of this gene family would be a useful strategy for increasing tolerance of crop plants against environmental stresses. The main aims of this study were to better understand the ARF gene family and their relationships with Aux/IAA gene families, to obtain genomic information for the ZmARF gene family, and their expression under abiotic and biotic stresses.

Materials and Methods
Identification of maize ARF gene family, phylogenetic analysis of ARF genes, and conserved motifs Maize ARF nucleotide sequences were downloaded from (http://grassius.org). Alignment of the sequences of the ARF genes was performed using CLUSTALW program in MEGA software version 6 (Tamura et al., 2013). Phylogenetic tree was constructed using the Neighbor-Joining (NJ) method and the diagrams of phylogenetic trees were drawn with MEGA6 (http://www.megasoftware.net/mega.php) (Tamura et al., 2013). In addition, confirmation of phylogeny tree was done using the bootstrap method with 1000 replicates. The Z. mays ARF protein sequences were downloaded from (www.grassius.org). ExPASy server (http://web.expasy.org/computepi/) was employed to predict the theoretical isoelectric point (pI) and the molecular weight (Mw) of each ZmARF protein. ZmARF transcription factors were identified using MEME (Suite version 4.9.1; http://meme-suite.org/tools/meme) according to the following criteria: maximum number of 15 motifs and an optimum width of 8-50 amino acids (Table 1). Expression study of ARF genes In order to determine the differential gene expression under different stresses, ARF gene expression data was extracted by Genevestigator from Zea mays database using mRNA-Seq Gene Level Zea mays (ref: AGPv4) (ZM_mRNASeq_MAIZE_GL-0) and 'Perturbations' tool. The fold change in gene expression was detected using filter 2 folds as benchmark. The fold changes in the expression of ARF genes under different environmental stresses were used to generate gene expression heatmap using Genevestigator (https://genevestigator.com/gv/) with Red/Green colour scheme as markers where "Red" and "Green" colors represent up and down-regulation of respective genes. Genevestigator (https://genevestigator.com/gv/) was performed to determine the expression of ARF genes in different developmental stages in maize.
Mapping ZmARFs genes on maize chromosomes and analysis of transcriptional factors binding sites A total of 38 ARF genes were mapped to chromosomes 1-10. In order to show chromosomal locations of Zea mays ARF genes, the MapChart software version 2.3 (Voorrips, 2002) was downloaded from https://www.wageningenur.nl/en/show/Mapchart-2.30.html. A total of 1.5 kb upstream sequence was analyzed for each ARFs gene by PlantPAN 3.0 software (http://plantpan3.itps.ncku.edu.tw/) for detection of TFBs in promoter regions on both strands (Blanco et al., 2006). PlantPAN is an appropriate resource for estimating regulatory elements and reconstructing transcriptional regulatory networks of plant genes. The database released 17230 TFs which were collected form 78 plant species.

Results and Discussion
ARF transcription factors play key roles in various plant growth and development. In the present study, we surveyed 38 ZmARF genes in different stress condition and developmental stages.
Identification of ZmARFs and phylogenetic analysis of ARF genes The predicted proteins encoded by ZmARF genes varied from 418 (ARF13) to 1147 amino acids (ARF20), with corresponding molecular weights from 46.84 kDa (ARF13) to 127.24 (ARF20) kDa, and the theoretical isoelectric points ranging from 5.38 (ARF33) to 9.27 (ARF13). The phylogenetic tree of all maize ARF nucleotide sequences was generated using MAGE v7.0 program by the N-J method. All ZmARFs genes were classified into three major classes -I, II, and III ( Figure 1). Classes I, II, and III each contained 22, 2, and 13 members, respectively. These results are similar to the grouping of these genes in related studies Saidi and Hajibarat, 2018).

Chromosomal locations of ZmARFs gene
The distribution of 36 members of ZmARF family distributed on 10 maize chromosomes is shown in Figure 2. Five ZmARFs genes were located on each of the chromosomes 1, 5, and 10; four genes on chromosomes 1 and 6; seven genes on chromosome 3, and, only one gene on chromosome 9 (Figure 2). Due to the location of these genes on maize chromosomes, all chromosomes contained ARF encoding sequences except chromosome 7. In this study, we found gene duplication in chromosomes 4, 5, and 10. Although, genes were distributed on 10 chromosomes but only one gene cluster was found on each of the chromosomes 1, 2, and 10 ( Figure 2). The clustering patterns of these sequences were similar with their placement on each chromosome.
Our results were in agreement with those reported by other researchers (Liu et al., 2012;Saidi and Hajibarat, 2019).  . The similar distribution of motifs in the ARF proteins suggests an evolutionary relationship among them. The sites of different motifs on the protein sequence may help to determine the differences in the performance of genes under different stress conditions (Liao et al., 2017). To identify the conserved motifs in this gene family, the protein sequences of ZmARF genes were surveyed by MEME software (Figure 3). ZmARF gene expression study Expression profiles of ZmARFs genes for Colletotrichum graminicola, Fusarium verticillioides, heat, and submergence were retrieved. The expression profiles of ZmARF genes in response to different abiotic stresses (submergence, heat) were also analysed. Up-regulation was observed in ZmARF6 under abiotic and biotic stresses; ZmARF1, ZmARF7, ZmARF18 under biotic stresses such as (C. graminicola and F. verticillioides); and in ZmARF6 under heat stress. Down-regulation was observed in ZmARF16 under submergence; ZmARF11 under C. graminicola, F. verticillioides, submergence, and heat stress. When tomato is encountered with biotic stresses, SlARF1, SlARF4, SlARF6, SlARF10 and SlARF18 genes were activated with a significant increase in expression, however some genes such as SlARF3 and SlARF5 were significantly repressed (Figure 4a).
Tomato ARF genes were induced in response to flagellin in only six hours. Therefore, the expression of almost all SlAR gene family is repressed except for SlARF6 and SlARF18 which showed a twofold expression. Our findings revealed that ZmARF6 and ZmARF18 genes increased expression significantly in response to biotic stresses, in contrast to the results reported by (Bouzroud et al., 2018). Expression analysis revealed that ZmARF was down-regulated in response to Colletotrichum graminicola. A recent report has shown that ZmARF4 was down-regulated in response to C. graminicola and activated in response to auxin (de Jesus Miranda et al., 2019). this may be due to the presence of different TFBs (Figure 4b). As a result, these genes can play key roles in different growth and developmental stages and under abiotic and biotic stresses. ZmARF5, ZmARF32, and ZmARF17 genes were down-regulated during several developmental stages such as dough stage, fruit formation, and seedling stage. ARF8 is reported to regulate fertilization and fruit development (Goetz. et al., 2006) and ARF6 and ARF8 are essential players in flower maturation and flowering processes (Finet et al., 2010). Our analysis showed that ZmARF6 and ZmARF8 are involved in flower and fruit formation in maize which is in line with the findings reported by others (Goetz et al., 2007;Finet et al., 2010).
ARF6 and ARF8 genes regulate development at reproductive and vegetative stages with overlapping functions (Goetz et al., 2007). ARF19 acts redundantly in controlling leaf expansion and lateral root growth (Wang et al., 2007). Our findings revealed that ARF2 was up-regulated in all different developmental stages (Figure 4b). A study showed that ARF2 control leaf senescence and floral organ abscission in Arabidopsis . ARF3 gene plays a key role in the formation of meristem, while ARF4 plays an important role in organ polarization (Hunter et al., 2006). The presence of ARF5 is essential for the formation of flower and root embryos whereas, ARF7 gene expression was increased during all developmental stages (Figure 4b). However, a report indicated that ARF7 acts as a negative regulator of fruit formation (ex., pollination and fertilization), as well as an intermediary between gibberellins and auxin messaging during tomato fruit formation (Wilmoth et al., 2005). ARF4 is involved in controlling sugar metabolism during tomato fruit development (Zouine et al., 2014). Analysis of transcriptional factors binding sites AP2; ERF participates in response to auxin, cytokinin, abscisic acid, and jasmonate signals. In rice, an AP2; ERF OsCRL5 is induced by exogenous auxin, treatment which in turn inhibits cytokinin signal transduction by enhancing the activities of two repressors (Khurana et al., 1988). Co-expression of a gene(s) with a transcription factor gene(s) can provide clues about regulatory mechanisms (Hirai et al., 2007).
Among the six TFBs surveyed in this study, the presence of AP2/ERFbs was the highest in the promoters of ZmARF gene families of maize. Analysis of TFBs in the promoter sequences provide a good understanding of plant responses in stress conditions. Genes promoter analysis revealed that AP2; ERF, ERF, bZIP, GATA, and WRKYs were present in majority of the promoter regions of auxin-responsive genes.
Among 38 ZmARF gene promoters, the highest number of bZIPbs was observed in ZmARF11-12 promoter.
AP2/ERFs are reported to be involved in improving resistance to biotic stresses. In addition, few of the AP2/ERF TFs have been reported to enhance resistance to biotic and abiotic stresses, simultaneously (de Jesus Miranda et al., 2017). According to other reports many of the AP2/ERFbs in the upstream genes can enhance resistance to biotic and abiotic stresses (Kasuga et al., 1999;Berrocal-Lobo et al., 2002). For example, the bZIP TF is reportedly involved in the regulation of abiotic and biotic stresses. The role of bZIP proteins in response to biotic stress is widely known (da Silva and de Oliveira, 2014).
Several studies have validated the role of bZIP in ABA signalling in rice. Also, they have been found to play a regulatory role upon drought stress during vegetative stages (Fujita et al., 2013). bZIP transcription factors are a gene family which regulates several steps of plant's life such as light signalling, pathogen defence, flower development, and abiotic stress signalling (Rahaie et al., 2011). Additionally, ZmARF16 gene did not contain any WRKYbs in its promoter region. WRKYs are of particular interest as they are involved in diverse environmental stress responses as well as in physiological/biological processes (Jiang et al., 2015).
Moreover, OsWRKY regulates cascade of signalling pathways mediated by jasmonic acid and salicylic acid for protecting plants from bacterial and fungal infections (Chu et al., 2015). In addition to the biotic stress involvement, a number of WRKY family members are also associated with abiotic stress responses in rice (Zheng et al., 2006). The up-regulation of some members of WRKY TFs have been found to implicate salt, drought, and heat tolerance in plants (Yang et al., 2017).
All AP2/ERFbs were found to be distributed throughout in ZmARF promoter regions. The presence of AP2; ERFbs in promoter regions can play a role in environmental and hormonal stresses. In the present study, the largest number of TFbs (four TFbs) were detected in ZmARF10-12 followed by ZmARF35.

Conclusions
In the present study, an extensive analysis of maize ZmARF genes was conducted to evaluate their chromosomal distribution, phylogenetic tree, conserved motifs, transcriptional factor binding sites (TFBs) in promoter regions, and expression. The number of ARF family members varies in different plants which can be due to differences in their genomic size. The chromosomal distribution and phylogenetic relationships of the ARF genes provide a valuable tool into the evolutionary status of maize. However, there are few studies of ARF genes in maize. In this study, we used maize ARF gene sequences and applied various bioinformatics tools to gain more knowledge about biological/physiology processes. Analysis of 38 ZmARF promoter regions revealed the presence of different types and frequencies of TFBs among genes. In our analysis, several TFBs including AP2/ERFbs, WRKYbs, bZIPbs, bHLHbs, GATAbs, and ERFbs were identified in the promoter regions of ZmARF genes. Gene cluster analysis showed that the same set of genes on the chromosomes were positively correlated with the same number of gene clusters. Further analysis is required to confirm these ZmARF promoters for future utilization in crop breeding as stress inducible promoter.

Authors' Contributions
Both authors read and approved the final manuscript.