Epidemiology of Enterococcus faecium isolates sampled from different sources in Romania using MLST technique and eBURST algorithm

Enterococcus faecium is emerging as an important cause of multidrug resistance and hospital-acquired infections, with special attention being paid to the vancomycin-resistant species. Therefore, the characterization of pathogenic strains/isolates plays an important role in the epidemiology of infectious diseases. The enterococcal species distribution was determined in wastewater from the Cluj-Napoca area. As the presence of E. faecium was detected, a number of isolates from wastewater, birds and humans were epidemiologically analyzed according to the MLST website. Comparisons were performed against a collection of available isolates, with multiple origins, contained in the MLST database. Out of the Enterococcus isolates collected from wastewater, 11 were identified as E. faecalis (40.74%); 8 as E. casseliflavus (29.62%); 5 as E. faecium (18.50%); 2 as E. gallinarum (7.40%) and one isolate as E. durans. Based on the MLST data and using the eBURST algorithm, the isolates of E. faecium sampled from Romania were split into three groups: one group comprised isolates from human hosts and wastewater (Cj316, 106/6, Cj197, Cj22, 129/6, Cj117, Cj24, 284/7, and 43/7), while the second (G9, G10-2, G7, G3-2, and G9-1) and the third group (G8, G6, and 40/7) originated mainly from bird hosts. The rest of the isolates did not join any particular group, which suggests the lack of a close phylogenetic bond between these isolates. The data suggested the existence of at least two phylogenetic lines of E. faecium in Romania: one line with mainly human host prevalence, and another in which animal hosts dominated.


Introduction
Bacteria of the genus Enterococcus (formerly the 'fecal' or Lancefield group D streptococci) are ubiquitous microorganisms. They occur in large numbers in different types of soil, surface waters, vegetables, plant material and foods, especially those of animal origin (Giraffa, 2014), but their predominant habitat is the gastrointestinal tract of humans and animals. E. faecium and E. faecalis are the predominant Gram-positive cocci in human stools, while E. faecium is the prevalent species in production animals such as poultry, cattle, and pigs, and E. mundtii and E. casseliflavus are found in plant sources (Klein, 2003).
From an ecological point of view, the distribution of Enterococcus species varies throughout Europe: in Spain and the UK, E. faecalis and E. faecium are the species most commonly isolated from both clinical and environmental sources; in Sweden, E. faecium has a lower incidence and E. hirae a higher isolation rate; in Denmark, E. hirae is the dominant species, isolated mainly from slaughtered animals.
Enterococcus species play an important role in the food industry (dairy production, storage of meat and vegetables; Foulquié Moreno et al., 2006) and as probiotics used to treat diarrhea and improve immunity (Franz et al., 2011). Nevertheless, some species of Enterococcus, such as E. faecalis and E. faecium, have also been associated with many infections, including urinary tract infection, bacteremia, endocarditis, neonatal infection and infection of the central nervous system (O'Driscoll and Crank, 2015). A much bigger problem is that, with the extensive use of antibiotics, Enterococcus species have developed resistance to many antibiotics (Zhong et al., 2017). Currently, E. faecium is emerging as an important cause of multidrug resistance and hospital-acquired infections, with special attention being paid to VRE (vancomycin-resistant Enterococcus) (Lebreton et al., 2014; Adegoke et al., 2022; Correa-Martínez et al., 2022; Toc et al., 2022). Although E. faecalis is responsible for about 80% of all enterococcal infections in humans and E. faecium for only about 20%, E. faecium represents the majority of the VRE (Sievert et al., 2013).
Ecology and epidemiology studies of Enterococcus have reported that E. faecalis and E. faecium are regularly isolated from cheese, fish, sausages, minced beef, and pork (Klein, 2003; Foulquié Moreno et al., 2006). The characterization of pathogenic strains plays an important role in the epidemiology of infectious diseases, generating the necessary information for the identification, tracking and intervention against epidemics (Tavanti et al., 2005).
MLST (Multilocus Sequence Typing) is one of the techniques used in global epidemiology, aiming to identify strains with a high pathogenicity and thus providing an improved picture of the activity of bacteria within the environment and human populations. Studies on microbial populations using the MLST technique are generally intended to estimate the genetic diversity (usually counted as the relative contribution of recombination and mutations per allele or per locus), as well as to evaluate the relative impact of genetic dispersion and natural selection in the evolutionary history of these pathogens (Stefani and Agodi, 2000; Pérez-Losada et al., 2005). MLST is based on the sequences of housekeeping genes, which give each strain a distinct numerical allelic profile, abbreviated to a unique identifier: the sequence type (ST). The relatedness between two strains can then be inferred from the differences between their allelic profiles (Francisco et al., 2009).
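As an illustration of how relatedness is read from allelic profiles, the following minimal Python sketch lists the loci at which two profiles differ. The locus order follows the seven genes named in this paper; the profiles and the function name differing_loci are hypothetical and are not data or code from this study.

```python
# Locus order as listed in the MLST scheme used in this paper.
LOCI = ("gdh", "purK", "pst", "atpA", "gyd", "adk", "ddl")


def differing_loci(profile_a, profile_b):
    """Return the names of the loci at which two allelic profiles differ."""
    if len(profile_a) != len(LOCI) or len(profile_b) != len(LOCI):
        raise ValueError("each profile needs one allele number per locus")
    return [locus for locus, a, b in zip(LOCI, profile_a, profile_b) if a != b]


# Hypothetical allelic profiles (allele numbers invented for illustration).
st_x = (1, 1, 1, 1, 1, 1, 1)
st_y = (1, 1, 1, 7, 1, 1, 1)
st_z = (2, 1, 1, 7, 1, 1, 3)

print(differing_loci(st_x, st_y))       # ['atpA']
print(len(differing_loci(st_x, st_z)))  # 3
```

Two profiles that differ at a single locus are single-locus variants (SLVs) of each other; profiles that differ at two loci are double-locus variants (DLVs).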
The possible patterns of evolutionary descent obtained with MLST can be further analyzed by eBURST. eBURST is an algorithm that partitions large MLST data sets into non-overlapping groups of related STs (clonal complexes) and then discerns the best-fit pathway of descent within each clonal complex from the founder genotype. Thus, following a set of very simple rules, eBURST can be used to find out how the diversification of bacterial clones has occurred and to provide evidence for the emergence of clinically relevant clones.
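The partitioning step can be sketched as follows, assuming the commonly used group definition in which every ST in a group is a single-locus variant of at least one other ST in that group. This is a simplified illustration, not the eBURST program itself; the ST labels, profiles and function names are hypothetical.

```python
from collections import deque


def count_differences(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(1 for a, b in zip(p, q) if a != b)


def group_by_slv(profiles):
    """Partition STs into groups connected by chains of single-locus variants."""
    unvisited = set(profiles)
    groups = []
    while unvisited:
        start = min(unvisited)  # deterministic starting point
        unvisited.remove(start)
        queue = deque([start])
        group = [start]
        while queue:
            current = queue.popleft()
            linked = [st for st in unvisited
                      if count_differences(profiles[current], profiles[st]) == 1]
            for st in linked:
                unvisited.remove(st)
                queue.append(st)
                group.append(st)
        groups.append(sorted(group))
    return groups


# Hypothetical STs and allelic profiles (seven loci each).
profiles = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (1, 1, 1, 2, 1, 1, 1),
    "C": (1, 1, 1, 2, 1, 1, 5),
    "D": (9, 8, 7, 6, 5, 4, 3),
    "E": (9, 8, 7, 6, 5, 4, 2),
    "F": (3, 3, 3, 3, 3, 3, 3),
}

print(group_by_slv(profiles))  # [['A', 'B', 'C'], ['D', 'E'], ['F']]
```

In this toy data set, C joins the first group through B even though it differs from A at two loci; F shares no near-identical profile with any other ST and remains a singleton.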
The aims of this study were i) to determine the enterococcal species ratio in wastewater from the Cluj-Napoca area; and ii) to determine, using MLST, the genetic relatedness of randomly selected E. faecium isolates from wastewater, birds and humans, and to compare the results against an international MLST database.

Materials and Methods
Bacterial isolates sampled from wastewater in Cluj-Napoca area. Cultures and identification
Between June 2006 and August 2008, 27 water samples were collected from the Someşeni wastewater treatment plant in Cluj-Napoca, Romania. Residual water samples were collected in pre-sterilized containers and analyzed within 6 hours of collection, according to international standards. The water samples, divided into 6/100 mL portions, were filtered through a membrane with a pore diameter of 0.45 μm. To ease the filtration process and to reduce the number of colonies, the samples were diluted with H2O UV/UP, the initial dilution being 1:4, followed by a subsequent dilution of 1:9. After filtration, the filters were placed directly on the culture medium in Petri dishes. Two types of selective culture media were used. The filters were initially placed on Petri dishes containing M Enterococcus Agar (MEA) (Fluka, Buchs, Switzerland), because the sodium azide, captan and nalidixic acid contained in this formula are known to inhibit the growth of many species of bacteria and fungi, conferring selectivity to the medium. After casting the plates, 300 μL of 1% TTC solution (2,3,5-triphenyl tetrazolium chloride) were added to the surface of the medium. Within the bacterial cell, TTC was reduced to insoluble formazan, which conferred a dark pink color to the colonies. Filters on MEA were incubated for 48 hours at 41 °C. As a safety measure with respect to the identity of the cultivated microorganisms, cultivation was also carried out on a second type of medium, namely Esculin Iron Agar (Fluka, Buchs, Switzerland). This second medium was used to confirm the identity of the colonies based on the ability of enterococci to hydrolyze esculin. The procedure consisted of transferring the filter membranes from M Enterococcus Agar onto Esculin Iron Agar and incubating them at 41 °C for 20 minutes.
The hydrolysis of esculin resulted in black or dark brown colonies, confirming the presence of enterococci.
For species identification of enterococcal isolates (Table 3), the following methods were used: classical biochemical identification by observing the hemolysis pattern on blood agar, and two standardized methods: API® 20 Strep from BioMérieux (a kit that includes strips containing up to 20 miniature biochemical tests for manual identification of microorganisms) and VITEK (a fully automated system that performs bacterial identification).
DNA extraction of the bacterial isolates sampled from wastewater in Cluj-Napoca area
The bacterial isolates were cultivated overnight in LB (Luria-Bertani) liquid medium at 37 °C. In order to disrupt the cell walls, a digestion with lysozyme (Sigma, St. Louis, USA) and proteinase K (Promega, Madison, USA) was performed, following the protocol described in Băcilă et al. (2007). DNA purification was performed using the Wizard Genomic DNA Purification Kit (Promega, Madison, USA), according to the manufacturer's recommendations. DNA quality was estimated on a 1% agarose gel stained with ethidium bromide, and DNA concentration was quantified using a NanoDrop 1000 Spectrophotometer (Thermo Fisher Scientific, Wilmington, USA).

Identification of Enterococcus faecium isolates by PCR
The predominant species in cultures are E. faecalis and E. faecium. To distinguish between these two species of enterococci, a fragment of the ddl gene was amplified by PCR. The amplification reaction was carried out on isolates from wastewater (one sample from Bucharest was added to the 27 samples previously collected from the Cluj-Napoca area), as well as on isolates from other sources: humans and birds (Table 1). The strains E. faecium ATCC 35667 and E. faecalis ATCC 51299 were used as positive controls.

MLST technique and DNA sequencing applied on Enterococcus faecium isolates sampled from different sources and different parts of Romania
A total of 32 isolates of E. faecium with different origins were analyzed with MLST and eBURST. The samples were taken from wastewater (five isolates, including one from Bucharest and four out of the 27 samples previously collected from the Cluj-Napoca area), from birds (12 isolates collected from their faeces) and from humans (15 isolates collected from hospitalized persons: samples of secretions of infected wounds, blood or peritoneal fluid) (Table 1).
Sequence types of E. faecium isolates were determined with an MLST scheme, by PCR amplification of housekeeping genes and subsequent Sanger sequencing of the PCR products, as previously described by Homan et al. (2002). The typing scheme used regions of seven structural genes (gdh - glucose-6-phosphate dehydrogenase; purK - 5-(carboxyamino) imidazole ribonucleotide synthase; pst - phosphate ATP-binding cassette transporter; atpA - ATP synthase, alpha subunit; gyd - glyceraldehyde-3-phosphate dehydrogenase; adk - adenylate kinase; ddl - D-alanine ligase). These genes are well conserved in the bacterial genome, so the likelihood of mutations over time is rather low. This makes them suitable tools for the accurate genetic characterization of pathogenic bacterial species, since precise identification of infectious agent strains is essential for both epidemiological studies and public health decisions. The sequence of each amplified gene fragment was compared to all previously identified sequences (alleles) for that locus, thus determining the allele number at each of the seven loci. Combining the allele numbers from all the loci defined the allelic profile of the strain. Each different allelic profile was considered a sequence type (ST). Such an ST represented a suitable and clear label for each isolate. Characterization of bacterial isolates was achieved by comparing the sequences obtained with the sequences existing in the databases.
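The assignment of allele numbers and of an ST can be sketched as two dictionary lookups. The example below is a toy illustration under stated assumptions: the allele sequences are four-letter placeholders rather than real gene fragments, and the labels "ST-a" and "ST-b", the table names and the function names are hypothetical, not entries from the MLST database.

```python
LOCI = ("gdh", "purK", "pst", "atpA", "gyd", "adk", "ddl")

# Placeholder allele tables: sequence -> allele number, one table per locus.
allele_db = {locus: {"ACGT": 1, "ACGA": 2} for locus in LOCI}

# Placeholder ST table: allelic profile -> ST label.
st_table = {
    (1, 1, 1, 1, 1, 1, 1): "ST-a",
    (1, 1, 1, 2, 1, 1, 1): "ST-b",
}


def allelic_profile(isolate_seqs):
    """Look up the allele number at each locus; None marks an unknown allele."""
    return tuple(allele_db[locus].get(isolate_seqs[locus]) for locus in LOCI)


def sequence_type(profile):
    """Return the ST label of a known allelic profile, or None if it is new."""
    return st_table.get(profile)


# A hypothetical isolate carrying allele 2 at atpA and allele 1 elsewhere.
isolate = {locus: "ACGT" for locus in LOCI}
isolate["atpA"] = "ACGA"

profile = allelic_profile(isolate)
print(profile)                 # (1, 1, 1, 2, 1, 1, 1)
print(sequence_type(profile))  # ST-b

# An isolate with a ddl sequence absent from the allele table.
novel = dict(isolate, ddl="TTTT")
print(sequence_type(allelic_profile(novel)))  # None
```

A result of None corresponds to an allele or profile not yet present in the reference tables, which in practice would have to be submitted to the database curators for a new number.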
The PCR method is particularly precise, safe and time-efficient, allowing a large number of samples to be analyzed in a relatively short time. The difficulty encountered was setting the optimal temperatures for amplification of the genetic material. Optimal annealing temperatures varied over a fairly wide range, from 47 °C for the purK gene to 59 °C for the ddl gene (Table 2). The primers required to amplify fragments of the seven structural genes are also presented in Table 2.
Table 2. Primers used for gene amplification in MLST technique and their annealing temperature
The amplification reaction was performed in a 50 μL volume including: 1x PCR buffer; 1.5 mM MgCl2; 0.2 mM of each dNTP; 1 μM forward primer; 1 μM reverse primer; 1.25 U GoTaq Polymerase; and 6 μL genomic DNA. The following amplification program was used: initial denaturation for 3 min at 94 °C, followed by 35 cycles of 30 sec at 94 °C, 30 sec at 47-59 °C, and 30 sec at 72 °C; and a final extension for 5 min at 72 °C.
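As a quick check of the amplification program, the nominal hold time can be summed from the step durations given above. The sketch below ignores the ramping time between temperatures, which depends on the thermal cycler; the function name is hypothetical.

```python
def program_seconds(initial_denaturation_s, cycles, step_seconds, final_extension_s):
    """Total hold time of a PCR program, excluding temperature ramping."""
    return initial_denaturation_s + cycles * sum(step_seconds) + final_extension_s


# 3 min at 94 °C; 35 cycles of 30 s + 30 s + 30 s; 5 min at 72 °C.
total_s = program_seconds(3 * 60, 35, (30, 30, 30), 5 * 60)
print(total_s)       # 3630
print(total_s / 60)  # 60.5
```

The program therefore holds for about one hour per run, regardless of which annealing temperature in the 47-59 °C range is used.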
Prior to sequencing, a purification step with the Promega Wizard SV Gel and PCR Clean-Up System (Promega Corporation, Madison, USA) was carried out according to the manufacturer's manual. Sequencing of both strands was performed using DTCS Quick Start Master Mix; 3.2 μL primer (1 pmol/µL); 0.5-10 μL of template DNA; and 0-9.5 μL H2O UV/UP. The following thermal cycle parameters were used: 96 °C for 2 min; 40 cycles of 96 °C for 20 sec, 50 °C for 20 sec, and 60 °C for 4 min. Excess primers and labelled ddNTPs were removed by purification with DNA Clean & Concentrator™ (Zymo Research, Orange, USA) according to the manufacturer's recommendations. The samples (total volume 20 μL) were prepared for sequencing by adding 20 μL of HiDi formamide and were then loaded onto a CEQ 8000 Genetic Analyzer (Beckman-Coulter, Fullerton, USA).

Data analysis
After performing the sequencing reactions, the sequences were analyzed using BioEdit (Hall, 1999) and eBURST (http://www.eBurst.mlst.net). The purpose of the eBURST algorithm was to identify, based on sequence data, groups of isolates with related genotypes in the population, and then to establish the founding genotype of each group. In the next step, the algorithm also determined the descendants of the founding genotype, which could in turn represent founding genotypes for other subgroups of bacterial isolates. The graphical output of the analyses performed by the eBURST algorithm was a radial diagram centered on an ancestral founding genotype.
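The founder assignment can be illustrated with a minimal sketch that assumes the basic eBURST rule of taking as founder the ST with the largest number of single-locus variants in its group. The real program additionally uses further tie-breaking and bootstrap support, which are omitted here; the ST labels, profiles and function names are hypothetical.

```python
def count_differences(p, q):
    """Number of loci at which two allelic profiles differ."""
    return sum(1 for a, b in zip(p, q) if a != b)


def predict_founder(group):
    """Return the ST with the most single-locus variants (ties: first by name)."""
    def slv_count(st):
        return sum(1 for other in group
                   if other != st
                   and count_differences(group[st], group[other]) == 1)
    return max(sorted(group), key=slv_count)


def classify_against_founder(group, founder):
    """Label every other ST as SLV, DLV, or by its number of differing loci."""
    labels = {}
    for st, profile in group.items():
        if st == founder:
            continue
        d = count_differences(group[founder], profile)
        labels[st] = {1: "SLV", 2: "DLV"}.get(d, f"{d} loci differ")
    return labels


# Hypothetical group of five STs (seven loci each).
group = {
    "F0": (1, 1, 1, 1, 1, 1, 1),
    "S1": (1, 1, 1, 2, 1, 1, 1),
    "S2": (1, 1, 1, 1, 1, 1, 3),
    "S3": (4, 1, 1, 1, 1, 1, 1),
    "D1": (1, 1, 1, 2, 1, 1, 3),
}

founder = predict_founder(group)
print(founder)  # F0
print(classify_against_founder(group, founder))
# {'S1': 'SLV', 'S2': 'SLV', 'S3': 'SLV', 'D1': 'DLV'}
```

In the radial diagram, F0 would sit at the center, its SLVs on the first ring, and D1 (an SLV of both S1 and S2) on the second ring.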

Results
Species identification of enterococcal isolates sampled from wastewater in Cluj-Napoca area
The identification of enterococcal species isolated from wastewater was quite difficult to achieve and required both repeated analyses and the combined use of several methods. In total, 27 isolates were identified in the collected wastewater samples, and the results obtained by the classical method were confirmed by at least one standardized method (Table 3).
Table 3 note: the percentages indicate the degree of confidence in the accuracy of the results.
The dominance of E. faecalis and E. faecium in wastewater systems was documented by previous studies that examined species distributions of Enterococcus in wastewater (Blanch et al., 2003; Moore et al., 2008). Moreover, Ferguson et al. (2013), in a study performed on marine water and wastewater, found that E. faecalis was the dominant species with a share of over 90%, while E. faecium accounted for about 5%, the remaining 5% being represented by other enterococcal species. Nevertheless, in the present case, the species hierarchy differed from expectations (Figure 1). E. faecalis ranked first with 11 identified isolates (40.74% of the analyzed isolates), followed by E. casseliflavus with eight isolates (29.62%). E. faecium was represented by only five isolates (18.50%), whereas the remaining isolates were identified as E. gallinarum (two isolates, 7.40%) and E. durans (one isolate, 3.70%).
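The species shares can be recomputed directly from the isolate counts reported above. The short sketch below rounds to two decimals; its output agrees with the printed values to within 0.02 percentage points, the small differences (for example 29.63 versus 29.62) presumably reflecting truncation rather than rounding in the text.

```python
# Isolate counts per species, as reported for the 27 wastewater isolates.
counts = {
    "E. faecalis": 11,
    "E. casseliflavus": 8,
    "E. faecium": 5,
    "E. gallinarum": 2,
    "E. durans": 1,
}

total = sum(counts.values())
shares = {species: f"{100 * n / total:.2f}" for species, n in counts.items()}

for species, share in shares.items():
    print(f"{species}: {counts[species]}/{total} = {share}%")
# E. faecalis: 11/27 = 40.74%
# E. casseliflavus: 8/27 = 29.63%
# E. faecium: 5/27 = 18.52%
# E. gallinarum: 2/27 = 7.41%
# E. durans: 1/27 = 3.70%
```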

Identification of the Enterococcus faecium species by PCR
This genetic method for species identification of enterococcal isolates was based on the specific amplification of a fragment of the gene encoding D-alanine-D-alanine ligase (ddl). Following migration of the PCR products on a 1% agarose gel, a 550 bp fragment was obtained for E. faecium isolates, whereas no amplification was visible for E. faecalis isolates (Figure 2). The isolate codes in Figure 2 are the same as in Tables 1 and 3.
The amplification reaction was performed after species identification by classical, biochemical, and automated phenotypic methods (Table 3), as a confirmation of the previous results.
Inference of the phylogenetic relationships between Enterococcus faecium isolates sampled from Romania using the MLST approach and eBURST algorithm
Allelic variation and genetic diversity of Enterococcus faecium isolates
A total of 32 isolates of E. faecium from different sources were included in the MLST assay (Table 1). For these isolates, a variable number of unique alleles per locus was obtained, ranging from four unique alleles for the adk gene to 12 unique alleles for the atpA gene (Table 4). Analysis of the 32 isolates established the existence of 31 STs. Isolates 41/6 and 105/6 exhibited the same allelic profile. Both isolates originated from humans and were sampled from the same source (blood). Surprisingly, isolate 105/6 was isolated in Bucharest, while isolate 41/6 was isolated in Timișoara. Five isolates showed 100% similarity for six of the analyzed loci, with differences appearing only at the atpA locus. These isolates were Cj22, 106/6, Cj197, Cj316, and 113/6. Their sources differed: isolate Cj22 originated from wastewater and was isolated in Cluj-Napoca; isolates 106/6 and 113/6 were of human origin and were collected from blood in Bucharest; the other two isolates, Cj197 and Cj316, were sampled from peritoneal fluid collected from patients hospitalized in Cluj-Napoca.

Another five isolates of E. faecium presented 100% similarity for five out of the seven analyzed loci. Of these, isolates 43/7 and 129/6 were sampled in Bucharest from blood. The other three isolates derived from Cluj-Napoca: isolate Cj24 was collected from wastewater, while isolates Cj117 and 284/7 were of human origin, collected from wound secretion and blood, respectively.
There was also a batch of eight isolates which exhibited 100% similarity for four out of seven loci, despite their quite heterogeneous origins: isolates 18/7 and 19/7 were of human origin, collected in Oradea from blood and wound discharge, respectively; isolates Cj20 and 283/7 were taken from Cluj-Napoca, but from different sources (wastewater and human blood); isolate B176 was sampled from wastewater in Bucharest; the remaining three isolates were of animal origin, sampled from poultry faeces collected in Timișoara (G3-2 and G9 isolates) and Cluj (G10).
The rest of the isolates showed 100% similarity for three or fewer of the analyzed loci.
The other Romanian isolates were distributed into two separate MLST groups, namely MLST group 5 and MLST group 11 (data not shown). This grouping confirmed the results obtained by applying the eBURST algorithm exclusively to isolates from Romania, in which these isolates formed groups 2 and 3 (Figure 3b, c).

Discussion
Enterococcus species distribution in wastewater
The dominance of E. faecium and E. faecalis is well documented in previous studies which examined the distribution of Enterococcus species in wastewater systems (Sinton and Donnison, 1994; Blanch et al., 2003; Moore et al., 2008). Similarly, the prevalence of E. casseliflavus is consistent with its occurrence in clinical human fecal samples (Ruoff et al., 1990; Stern et al., 1994). Nevertheless, the enterococcal species ratio is very variable, depending on factors such as the degree of urbanization and the number of hospitals. In a study on a hospital wastewater treatment plant, Karimi et al. (2016) found that out of 315 enterococcal isolates, 162 (51.42%) were identified as E. faecium, 87 (27.61%) as E. hirae, 35 (11.11%) as E. faecalis, 11 (3.5%) as E. gallinarum, 7 (2.22%) as E. casseliflavus, and 4 (1.26%) as E. avium.
[Figure: Dendrogram of group 1 of Enterococcus faecium isolates from MLST database. Green color is used to highlight the isolates sampled from Romania included in this group.]

Assortment of Enterococcus faecium isolates sampled from Romania
A first group of enterococcal isolates was formed after eBURST analysis around the Cj316 isolate (Figure 3a). This isolate was sampled from peritoneal fluid of human origin collected in Cluj-Napoca. Interestingly, the SLVs of this isolate included isolate Cj197 (sampled from the peritoneal fluid of patients hospitalized in Cluj-Napoca), indicating a close relationship between these two isolates. Another interesting connection was that two other SLVs of the Cj316 isolate, Cj22 and Cj24, were collected from wastewater in Cluj-Napoca. This observation led to the assumption of a connection between the isolates collected in the hospital environment and those taken from the surrounding environment (here represented by wastewater).
Another close connection in terms of origin within this first group was observed at the DLV level, involving two isolates sampled in Cluj, namely isolate Cj117, sampled from human wound secretion, and isolate 284/7, collected from human blood.
The other three SLVs of the group were represented by the isolates 43/7, 106/6, and 129/6. These isolates were also connected by the place of origin, all three isolates being taken from samples collected from patients from Bucharest, two from the blood (106/6 and 129/6) and 43/7 from sputum.
A second group of E. faecium consisted of five isolates, all of animal origin, sampled from poultry faeces taken from Cluj and Timișoara. According to the dendrogram (Figure 3b), the central isolate of this group was G9, sampled from Cluj. The group contained three SLVs, two of them isolated in Timișoara and one in Cluj. The only DLV was represented by the isolate G9-1, collected in Timișoara.
The central isolate of the third group (Figure 3c) was the G8 isolate, of animal origin. The group had two SLVs, represented by the isolates G6 (sampled in Cluj from bird faeces) and 40/7 (collected from human blood in Bucharest).
Based on data obtained by querying the MLST database and using the eBURST algorithm, some general conclusions regarding the local epidemiology could be drawn for the Romanian isolates of E. faecium. Thus, for 17 isolates, straightforward connections in terms of their origin were established. However, for 14 isolates (45.65%) the phylogenetic pattern could not be so easily explained. Possible explanations include (but are not limited to) the high genetic variability of bacterial isolates (Martínez-Carranza et al., 2018), the high degree of genetic recombination (Shikov et al., 2022), horizontal gene transfer between bacterial isolates (Heuer and Smalla, 2007), and genetic dispersion (Oh et al., 2010).

Inference of phylogenetic relationships between Enterococcus faecium isolates sampled from Romania and from other parts of the world
Following the eBURST analysis of 1387 isolates of E. faecium sampled from different parts of the globe, 20 groups of isolates were obtained. The main group consisted of 1135 isolates. The founder genotype of this group was an isolate with ST=17 (Figure 4) and an allelic profile of 1 1 1 1 1 1 1. According to the database, the first isolate with ST=17 was collected from a patient in the United Kingdom in 1992. This lineage (ST=17) was characterized by ampicillin resistance and the presence of a pathogenicity island, and was associated with hospital outbreaks (Willems et al., 2005). The number of these isolates (ST=17) was higher in Europe, probably due to the existence of more extensive studies: 47 isolates in Germany, 23 isolates in the UK, 20 isolates in the Netherlands, 15 isolates in Italy, 4 isolates in France, 3 isolates in Greece, 2 isolates each in Denmark and Poland, and only one isolate each from Hungary, Norway, and Serbia. By comparison, few isolates with ST=17 were identified on other continents: 11 isolates in Australia, 4 isolates in North America (USA), and only one isolate in South America (Brazil) (Willems et al., 2005). The overwhelming majority of E. faecium ST=17 isolates originated from hospitalized patients. Still, there were some exceptions, in Australia (two cases) and in the UK and France (one case each), where the isolates came from people in outpatient treatment. Furthermore, in two cases found in 2001 in the USA, the isolates were taken from the environment.

The presence of five lines of E. faecium sampled from Romania as SLV, pertaining directly to the ST=17 complex, is particularly alarming because ST=17 is a predictor of subsequent bacteremia in hospitalized patients carrying VREF and has been frequently isolated from nosocomial settings (Kim et al., 2021).

Conclusions
Out of 27 Enterococcus isolates sampled from wastewater, 11 isolates were identified as E. faecalis (40.74%); 8 isolates as E. casseliflavus (29.62%); 5 isolates as E. faecium (18.50%); 2 isolates as E. gallinarum (7.40%) and one isolate as E. durans. By applying the eBURST algorithm to the E. faecium isolates sampled from Romania, three groups were delineated, mainly according to source of origin. Thus, group 1 joined isolates from human sources and the environment (wastewater), whereas groups 2 and 3 comprised isolates originating mainly from birds. This pattern suggested the existence of at least two phylogenetic lines of E. faecium isolates in Romania: one line with mainly human host prevalence, and a second line with mainly animal host prevalence. The first phylogenetic line highlighted the existence of a clear connection between the isolates of human origin and those of wastewater origin. This connection was most likely determined by uncontrolled spills into flowing water networks, which underlined a direct link between the hospital environment and the surrounding environment (wastewater in the present case). The results of our analyses indicated that the degree of isolate differentiation revealed by MLST and eBURST was high enough to be used in epidemiological investigations.