Isolation and Characterization of NDM-Positive Escherichia coli from Municipal Wastewater in Jeddah, Saudi Arabia

The emergence of resistance to last-resort antibiotics is a public health concern of global scale. Besides direct person-to-person propagation, environmental pathways might contribute to the dissemination of antibiotic-resistant bacteria and antibiotic resistance genes (ARGs). Here, we describe the incidence of blaNDM-1, a gene conferring resistance to carbapenems, in the wastewater of the city of Jeddah, Saudi Arabia, over a 1-year period. blaNDM-1 was detected at concentrations ranging from 10^4 to 10^5 copies/m^3 of untreated wastewater during the entire monitoring period. These results indicate the ubiquity and high incidence of blaNDM-1 in the local wastewater. To track the bacteria carrying blaNDM-1, we isolated Escherichia coli PI7, a strain of sequence type 101 (ST101), from wastewater around the Hajj event in October 2013. Genome sequencing of this strain revealed an extensive repertoire of ARGs as well as virulence and invasive traits. These traits were further confirmed by antibiotic resistance profiling and in vitro cell internalization in HeLa cell cultures. Given that this strain remains viable even after a certain duration in the sewerage, and that Jeddah lacks a robust sanitary infrastructure to fully capture all generated sewage, the presence of this bacterium in the untreated wastewater represents a potential hazard to the local public health. To the best of our knowledge, this is the first report of a blaNDM-1-positive E. coli strain isolated from a nonnosocomial environment in Saudi Arabia and may set a priority concern for the need to establish improved surveillance for carbapenem-resistant E. coli in the country and nearby regions.

for NDM activity (2); chelation by EDTA would result in the inhibition of the hydrolytic activity. Hence, in bacterial isolates positive for blaNDM-1, a zone of inhibition would be anticipated near the sterile disc spotted with EDTA-meropenem, and no zones of inhibition would be anticipated near the remaining three discs. Isolates that exhibited these phenotypic traits were further confirmed for blaNDM-1 by end-point PCR using the forward primer 5'-CATTAGCCGCTGCATTGA-3' and the reverse primer 5'-TAGTGCTCAGTGTCG-3' (1). Gene sequences were obtained by Sanger sequencing and subsequently matched against the National Center for Biotechnology Information (NCBI) nucleotide sequence database using BLASTN to determine the presence of blaNDM-1.

Processing of genome and plasmid sequences.
Raw sequencing reads were subjected to trimming and filtering. First, we checked the reads for adapter sequences and removed any that were found. Bases at the 3' end with a quality score below 20 were trimmed off. Reads with an average quality score below 20 were discarded. Finally, reads at least 50 bases long were retained for further analysis.
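The trimming and filtering rules above can be illustrated with a short Python sketch. It is not the pipeline used in this study: adapter removal is omitted, the function names are hypothetical, and it assumes that the reads discarded are those whose mean quality after trimming falls below 20.

```python
from statistics import mean

MIN_QUALITY = 20  # Phred threshold stated in the text
MIN_LENGTH = 50   # minimum retained read length stated in the text


def trim_3prime(seq, quals, min_q=MIN_QUALITY):
    """Remove bases from the 3' end while their quality is below min_q."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return seq[:end], quals[:end]


def keep_read(seq, quals, min_q=MIN_QUALITY, min_len=MIN_LENGTH):
    """Return the trimmed (seq, quals) pair, or None if the read is rejected."""
    seq, quals = trim_3prime(seq, quals, min_q)
    # Length filter: reads that are empty or shorter than min_len are dropped.
    if not seq or len(seq) < min_len:
        return None
    # Assumption: the average-quality filter is applied after trimming.
    if mean(quals) < min_q:
        return None
    return seq, quals
```

For example, a 60-base read whose last five bases have quality 10 would be trimmed to 55 bases and retained, whereas a 40-base read would be rejected by the length filter.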
Two de novo assemblers, namely CLC Genomics Workbench and SOAPdenovo, were used. k-mer sizes of 40 and 50 were used for CLC Genomics Workbench, and k-mer sizes of 31, 41, 51, 61, 71, 81, and 91 were used for SOAPdenovo. The outputs of all the assemblies were then combined into a large superset of sequences. Any contig that contained N's was split into separate contigs by removing the N's. To reduce redundancy, the merged contigs were first processed by CD-HIT-EST at 100% identity to remove identical fragments. The contigs were then sorted by size, and the sorted contigs were used to create preliminary scaffolds. Each contig was searched against the remaining contigs to find overlaps at the contig ends. If a contig overlapped another contig by at least 50 bases at the ends, the two contigs were merged to form a single contig. The merged contig was then searched against the remaining unmerged contigs to find further overlapping contigs. This iterative process of overlap determination and contig merging was repeated until no overlaps remained among the contigs.
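The contig splitting and iterative end-overlap merging described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure actually implemented: overlaps must match exactly, only the forward strand is considered, and all function names are hypothetical.

```python
import re

MIN_OVERLAP = 50  # minimum end overlap stated in the text


def split_on_n(contig):
    """Split a contig at runs of N and drop empty pieces."""
    return [piece for piece in re.split(r"N+", contig.upper()) if piece]


def suffix_prefix_overlap(left, right, min_overlap=MIN_OVERLAP):
    """Length of the longest suffix of left that equals a prefix of right."""
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if size > 0 and left.endswith(right[:size]):
            return size
    return 0


def merge_contigs(contigs, min_overlap=MIN_OVERLAP):
    """Greedily merge contigs, largest first, until no end overlap remains."""
    # Deduplicate, then sort by decreasing length (ties broken alphabetically).
    pool = sorted(set(contigs), key=lambda c: (-len(c), c))
    merged_any = True
    while merged_any:
        merged_any = False
        for i in range(len(pool)):
            for j in range(len(pool)):
                if i == j:
                    continue
                size = suffix_prefix_overlap(pool[i], pool[j], min_overlap)
                if size:
                    joined = pool[i] + pool[j][size:]
                    pool = [c for k, c in enumerate(pool) if k not in (i, j)]
                    pool.append(joined)
                    pool.sort(key=lambda c: (-len(c), c))
                    merged_any = True
                    break
            if merged_any:
                break
    return pool
```

Each merge reduces the number of contigs in the pool by one, so the loop always terminates. A production implementation would also need to handle reverse-complement overlaps and sequencing errors, which this sketch ignores.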
Subsequently, we ordered and oriented these contigs into larger units (scaffolds), a step that usually requires a reference genome. Because no complete reference genome was available for our organism, we used a closely related bacterial genome as the reference. To identify a closely related reference, we performed a BLAST search for each contig against the NCBI bacterial genome database. We then calculated the alignment coverage for each bacterial genome as the total number of aligned bases in the genome divided by the size of the genome. Finally, we selected NC_011741 (Escherichia coli IAI1 chromosome, complete genome) as the reference genome for two reasons. First, the size of the NC_011741 genome is similar to that of the genome we sequenced. Second, more than 95% of the NC_011741 genome aligned with our contigs.
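The alignment coverage used to rank candidate references can be illustrated with a minimal Python sketch. It assumes that each hit is given as a pair of 1-based inclusive coordinates on the candidate genome (for example, the subject start and end columns of tabular BLAST output) and that each reference base is counted once even when several contigs align to it. The function names and the accessions in the usage example are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or adjacent 1-based inclusive (start, end) intervals."""
    # Reverse-strand hits may be reported with start > end, so normalize first.
    normalized = sorted((min(s, e), max(s, e)) for s, e in intervals)
    merged = []
    for start, end in normalized:
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def alignment_coverage(hits, genome_size):
    """Fraction of the reference genome covered by at least one alignment."""
    covered = sum(end - start + 1 for start, end in merge_intervals(hits))
    return covered / genome_size


def best_reference(hits_by_genome, genome_sizes):
    """Return the accession with the highest alignment coverage."""
    return max(
        hits_by_genome,
        key=lambda acc: alignment_coverage(hits_by_genome[acc], genome_sizes[acc]),
    )
```

With hits `{"refA": [(1, 100)], "refB": [(1, 950)]}` and genome sizes of 1,000 bases each, `best_reference` would return `"refB"`, whose coverage is 95%.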
The contigs were sorted by their alignment position on the IAI1 chromosome. The gap size between each pair of adjacent contigs was then calculated and evaluated. If the calculated value was negative and an overlap was found, the contigs were merged. A positive value, in contrast, indicated a gap between the contigs. These gaps were closed by mapping the raw reads that spanned the gap between the two contigs. If a gap was not bridged by any read, one or more undefined 'N' nucleotides, depending on the gap size, were inserted between the contigs.
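The gap-size rule can be sketched in Python as shown below. The sketch assumes 1-based reference coordinates and that each contig spans exactly its own length on the reference. It omits the read-based gap closure step, and it inserts a single 'N' when the coordinates suggest an overlap but the sequences disagree; that fallback is an assumption of this sketch, since the text does not specify the behavior. The function names are hypothetical.

```python
def gap_size(left_end, right_start):
    """Reference bases between two contigs; a negative value means overlap."""
    return right_start - left_end - 1


def join_contigs(left_seq, left_end, right_seq, right_start):
    """Join two adjacent contigs according to their reference positions."""
    gap = gap_size(left_end, right_start)
    if gap < 0:
        overlap = -gap
        # Merge only if the sequences really overlap at the contig ends.
        if overlap <= len(right_seq) and left_seq.endswith(right_seq[:overlap]):
            return left_seq + right_seq[overlap:]
        # Assumption: no sequence overlap found, so keep a one-N spacer.
        return left_seq + "N" + right_seq
    # Non-negative gap: pad with one N per missing reference base.
    return left_seq + "N" * gap + right_seq
```

For instance, a contig ending at reference position 6 followed by a contig starting at position 10 gives a gap size of 3, so three N's are inserted between the two sequences.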
The same assembly process was applied to the plasmid sequences, except that the reference plasmid selected was pKDO1 (GenBank no. JX424423) from Klebsiella pneumoniae. This plasmid was chosen because more than 95% of it aligned with our plasmid contigs.