Molecular diagnostics in South Africa and challenges in the establishment of a molecular laboratory in developing countries

The laboratory plays a significant role in public health surveillance, outbreak investigation and infection prevention and control strategies. Microbiology laboratories are moving towards incorporating molecular biology techniques for the surveillance and identification of pathogens causing infectious diseases as well as the genotypic characterisation of these organisms. These methods are accurate, rapid and reliable, and provide a wealth of information that is not available using conventional phenotypic methods. However, establishing such a laboratory can be challenging in developing countries owing to poor infrastructure and a lack of funding and expertise. This manuscript discusses the essential issues that need to be addressed when establishing a molecular microbiology laboratory and the usefulness of molecular techniques in public health surveillance and outbreak investigations in developing countries. Molecular data from South African surveillance and outbreak studies are also presented.


Introduction
Infectious diseases pose a serious threat to public health. Identifying pathogenic microorganisms that cause infections and tracing the sources and transmission of these infections are important in the surveillance of organisms, outbreak investigations and in infection prevention and control (IPC) programmes [1,2]. Conventional microbiological techniques such as standard biochemical tests and disc diffusion tests have played a pivotal role in the identification of these pathogens but are more time consuming, less sensitive and less specific, and offer limited information compared with molecular methods [3]. Because of this, molecular techniques are in demand; they are used to make clinical decisions for patient management, to monitor the effectiveness of therapeutic strategies and to identify resistant strains that may influence treatment programmes [1]. In order to understand pathogen distribution and control the spread of infections timeously, accurate, reliable and rapid diagnosis is required. This can be achieved by the application of molecular typing techniques, on which almost all epidemiological studies rely for pathogen identification, characterisation and assessment of genetic relatedness [1]. However, establishing a molecular microbiology laboratory can be challenging in developing countries, including many African countries, owing to poor infrastructure and a lack of funding and expertise. In order for a molecular microbiology laboratory to be fully functional, a number of factors need to be considered and put in place. The laboratory needs to be divided into designated sterile and non-sterile areas, standard operating procedures need to be optimised and implemented, and laboratory administration procedures such as good housekeeping (viz. temperature monitoring, laboratory cleanliness, stock control, access control, laboratory safety, waste management and equipment maintenance) need to be kept up [1].
In this manuscript, we emphasize the key issues that need to be addressed when establishing a molecular laboratory in developing countries and the usefulness of molecular techniques in public health surveillance and outbreaks. We also present molecular data on findings obtained from surveillance and outbreak studies in South Africa.

Establishment of a molecular microbiology laboratory in developing countries
Access to reliable laboratory testing is limited in developing countries, many of which are low-resource settings. The consequences of this are misdiagnosis and delayed diagnosis, resulting in inappropriate and ineffective treatment and, in turn, increased morbidity and mortality [4]. The capabilities and capacity of laboratories differ depending on the country and location; challenges include the lack of laboratory supplies (e.g. reagents and consumables), the lack of equipment, limited skilled personnel, inadequate quality control procedures and laboratory practice, and the absence of training programmes and government standards [5,6]. Furthermore, poor infrastructure such as a lack of clean water, electricity supply and appropriate storage facilities limits the quality of a laboratory's output. A well-controlled, clean laboratory environment is vital. Ventilation systems providing containment areas with increased negative pressure, a stable supply of gas and an electricity generator should be available. It is not always possible to design separate laboratories for designated procedures; however, it is possible to designate separate functional work areas for the different procedures to be performed. This will create contained environments for sterile and non-sterile work, where consumables, reagents and equipment should be dedicated to each work area and should not be interchanged [7]. Designated work areas for reagent preparation, polymerase chain reaction (PCR) mastermix preparation, nucleic acid processing and amplification, together with a post-amplification area, are recommended [1]. Each area should have its own personal protective equipment (PPE) such as gloves, safety goggles and laboratory coats, and all surfaces and items such as racks and reagent containers should be swabbed thoroughly on a regular basis with 10% bleach solution and 70% ethanol.
Laboratory equipment should be calibrated, verified and maintained regularly, as a lack of optimally functional equipment will compromise the quality and efficacy of the laboratory test, and laboratory personnel should be properly trained in the use and maintenance of equipment [8]. Good laboratory practice is essential and the development of standardised operating procedures is fundamental to ensure that laboratory procedures are carried out adequately. All laboratory staff should be trained in the skills that are required for molecular techniques, such as precise pipetting, nucleic acid handling and disposal, and the use of software for the analysis and reporting of results [9]. Waste should be managed appropriately; solid waste should be discarded in biohazard boxes, liquid waste in canisters, and needles, scalpels, glass etc. in a sharps container. Biomedical waste should be decontaminated and incinerated [1]. Because biohazardous material is handled, a risk assessment and biosafety plan should be put into place. This should include guidelines for working with the material as well as primary and secondary containment measures, i.e. protective equipment (e.g. biosafety cabinets) and laboratory facilities, respectively. This plan should take into consideration preventing the spread of laboratory-associated infections and should protect the environment [1]. Medical laboratories should also undergo laboratory accreditation, in which they are assessed by an authorised body on their implementation and maintenance of quality control. This will ensure that the laboratory is governed by standardised international guidelines that have been endorsed by professional organisations, e.g. the South African National Accreditation System (SANAS) and the International Laboratory Accreditation Cooperation (ILAC) [1]. Furthermore, laboratories should subscribe to Proficiency Testing (PT) schemes or External Quality Assessment (EQA) programmes.
These provide an external evaluation of a laboratory's performance on a set of samples and allow for the comparison of results among different test sites. They identify weaknesses in the laboratory and thus areas for improvement [10]. All of the above needs to be taken into consideration when establishing a molecular laboratory; however, this is not always possible in countries with limited resources. Recommendations for a stepwise approach to establishing a laboratory would be useful. The Regional Office for Africa of the World Health Organization (WHO AFRO) has established a framework, the Stepwise Laboratory Quality Improvement Process Towards Accreditation (SLIPTA), for improving the quality of public health laboratories in developing countries to achieve ISO 15189 standards [11]. This framework measures and evaluates the progress of laboratory systems towards international accreditation and awards a certificate of recognition using a standardised process. This enables laboratories to improve and develop their quality management systems, thereby resulting in reliable, accurate and timely laboratory results [13,14].

The use of molecular techniques in surveillance programmes and outbreak investigations
Molecular typing methods have greatly contributed to the success of surveillance programmes and outbreak investigations and have aided epidemiologists and infection control specialists. In order to track infections and establish effective IPC programmes, the pathogens causing the infections need to be identified and infection trends need to be monitored. The molecular microbiology laboratory plays a fundamental role in achieving this by providing access to high-quality data [12]. Molecular typing techniques have contributed successfully to the widespread surveillance and outbreak investigation of a number of pathogens [13]. Molecular typing of a particular bacterial isolate generates a specific genetic fingerprint for assessing strain relatedness. This allows differences between bacterial genomes, such as single nucleotide polymorphisms (SNPs), insertions and deletions, to be detected. Several molecular methods can detect these differences, but the typing resolution varies depending on the method used. Currently, molecular typing methods include PCR-based methods, DNA fragment analysis-based methods and DNA sequence-based methods [2].
Conventional PCR-based methods that target a single or multiple regions in the genome are the most widely used techniques as they are simple, quick and cost effective when compared to other molecular typing methods. Conventional PCR makes use of a mastermix containing buffer, magnesium chloride, the enzyme Taq polymerase, deoxyribonucleotide triphosphates (dNTPs) and specific primers that target the region of interest on the DNA template. The amplified product is separated on an agarose gel by electrophoresis, in which an electrical current is applied, and is visualised at the end of the process as a band. This method can target a single region in a single PCR reaction or multiple regions in a multiplex PCR reaction. It may be used for organism identification and for the amplification of toxin genes, virulence genes and antibiotic resistance genes [2,14,15]. In real-time PCR, by contrast, amplification and quantification of DNA occur simultaneously. The mastermix is similar to that of a conventional PCR but includes either a fluorescent dye or a probe for detection of the amplified product, which is displayed as an amplification curve in real time as the reaction progresses. This method is quicker than conventional PCR as there are no post-amplification steps, but it is more costly [14].
DNA fragment analysis-based methods are useful in establishing strain relatedness. Amplified Fragment Length Polymorphism (AFLP) is the PCR-based amplification of genomic restriction fragments, in which genetic polymorphisms are identified by the presence or absence of DNA fragments. Genomic DNA is digested with two restriction enzymes, which creates fragments with 5'- and 3'-ends. Double-stranded oligonucleotide adapters, homologous to one 5'- or 3'-end, are ligated to the DNA fragments, which are then amplified by PCR using primers complementary to the adapter and restriction site sequence with additional selective nucleotides at their 3'-end. As a result, only those fragments in which the nucleotides extending beyond the restriction site are complementary to the selective nucleotides are amplified. Amplified fragments are separated on a polyacrylamide gel, on which polymorphisms can be seen. The patterns generated for each sample are compared and related isolates display similar or identical banding patterns [16]. This method is highly discriminatory and covers a large portion of the genome [2].
The Enterobacterial Repetitive Intergenic Consensus (ERIC) PCR is a method for genotyping Enterobacterial strains and is used for determining DNA fingerprints. It is simple, quick and cost effective. This methodology is based on short (30-150 bp) interspersed repetitive sequences within the bacterial genome. These regions are amplified and visualised on an agarose gel. A banding pattern is obtained and specific software can be used to generate dendrograms [17,18].
Pulsed-field gel electrophoresis (PFGE) is a DNA fragment analysis-based method where the bacterial genome is digested with a restriction enzyme to generate a number of DNA fragments of different sizes. This is a form of restriction fragment length polymorphism (RFLP) analysis. The digested samples are loaded on an agarose gel and electrophoresis is carried out by applying an electrical current that periodically changes direction in the gel matrix. The fragments are separated and a banding pattern is obtained. As with AFLP and ERIC PCR, the patterns generated for each sample are compared and related isolates display similar or identical banding patterns. This method has good discriminatory power and typeability, but only patterns generated on the same gel can be compared directly; in order to compare results from different runs, strict protocols must be applied [2,14,15].
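The comparison of banding patterns described for AFLP, ERIC PCR and PFGE can be illustrated with a minimal sketch. It assumes that each lane has already been reduced to a list of band sizes (in kb) and scores two lanes with the Dice coefficient, a similarity measure commonly applied to gel fingerprints. The band sizes, the 5% size tolerance and the function name are hypothetical and are not taken from any of the studies cited here.

```python
def dice_similarity(bands_a, bands_b, tolerance=0.05):
    """Dice coefficient between two lanes given as lists of band sizes (kb).

    Two bands count as the same band when their sizes differ by no more than
    `tolerance`, expressed as a fraction of the larger band. Each band can be
    matched only once.
    """
    unmatched_b = sorted(bands_b)
    shared = 0
    for size_a in sorted(bands_a):
        for size_b in unmatched_b:
            if abs(size_a - size_b) <= tolerance * max(size_a, size_b):
                shared += 1
                unmatched_b.remove(size_b)  # safe: we leave the loop immediately
                break
    total = len(bands_a) + len(bands_b)
    # Two empty lanes are treated as identical to avoid division by zero.
    return 2 * shared / total if total else 1.0


# Hypothetical band sizes (kb) for three isolates run on the same gel.
isolate_1 = [668, 452, 398, 310, 244, 173, 138, 78]
isolate_2 = [668, 452, 398, 310, 244, 173, 138, 78]  # indistinguishable pattern
isolate_3 = [668, 452, 398, 336, 244, 173, 138, 78]  # one band shifted

print(dice_similarity(isolate_1, isolate_2))  # 1.0
print(dice_similarity(isolate_1, isolate_3))  # 0.875
```

In practice, dedicated gel analysis software performs band calling and normalisation before any such coefficient is calculated; the sketch only shows the final comparison step.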
Variable number tandem repeat (VNTR) typing makes use of short nucleotide sequences that occur as tandem repeats, i.e. repeats that lie adjacent to one another. When the number of repeats at the same locus varies between isolates, the genomic regions are called VNTR loci. The number of tandem repeat sequences at different loci is determined by multilocus VNTR analysis (MLVA). This method is a multiplex PCR that amplifies a number of well-selected VNTR loci, and the amplicons are separated on an agarose gel by electrophoresis. The fragment size of each locus is measured and, together with the known length of a single repeat and of the flanking consensus regions to which the primers were designed, the number of repeat units at each locus can be calculated. This is not a universal method as it depends on the design of the primers, so it may not be useful for inter-laboratory comparison of bacterial isolates owing to differences in the protocols employed [2,15].
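The calculation of repeat units described above amounts to subtracting the flanking regions from the measured fragment size and dividing by the length of one repeat. The following sketch illustrates this under stated assumptions: the locus names, flanking lengths, repeat unit lengths and fragment sizes are invented for illustration, and rounding to the nearest whole repeat is one simple way of handling approximate gel sizing.

```python
def repeat_count(amplicon_bp, flanking_bp, repeat_unit_bp):
    """Number of tandem repeat units at one VNTR locus.

    amplicon_bp    : measured fragment size of the PCR product
    flanking_bp    : combined length of the flanking regions (including the
                     primer binding sites) contained in the amplicon
    repeat_unit_bp : known length of a single repeat unit
    """
    if repeat_unit_bp <= 0:
        raise ValueError("repeat unit length must be positive")
    repeat_region = amplicon_bp - flanking_bp
    if repeat_region < 0:
        raise ValueError("amplicon is shorter than its flanking regions")
    # Fragment sizing is approximate, so round to the nearest whole repeat.
    return round(repeat_region / repeat_unit_bp)


# Hypothetical scheme: locus -> (flanking length in bp, repeat unit length in bp).
LOCI = {
    "VNTR-1": (150, 6),
    "VNTR-2": (98, 12),
    "VNTR-3": (210, 33),
}


def mlva_profile(fragment_sizes):
    """Return the MLVA profile (repeat count per locus, in scheme order)."""
    return tuple(
        repeat_count(fragment_sizes[locus], flank, unit)
        for locus, (flank, unit) in LOCI.items()
    )


# Hypothetical measured fragment sizes (bp) for one isolate.
FRAGMENT_SIZES = {"VNTR-1": 193, "VNTR-2": 170, "VNTR-3": 408}
print(mlva_profile(FRAGMENT_SIZES))  # (7, 6, 6)
```

Because the flanking length depends on where the primers bind, two laboratories using different primers need different values in the scheme table, which is one reason profiles are hard to compare between protocols.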
DNA sequence-based methods are the method of choice as they are the most advanced methods that are accurate and provide much information. These methods can analyse single or multiple genes within the genome depending on the required application.
Single-locus sequence typing (SLST) analyses the sequence of a single gene, i.e. the relationships between bacterial isolates are determined by comparing variations in the sequence of a single gene. An example is sequencing of the staphylococcal protein A (spa) gene in Staphylococcus aureus. Spa-typing first makes use of VNTR analysis as it determines the number of repeats in a repeat region by PCR. The region is then sequenced and the sequence analysed using a dedicated software programme to determine strain relatedness, where isolates belonging to the same spa-type are thought to be genetically related [2,15].
Multilocus sequence typing (MLST) compares nucleotide sequences of a series of housekeeping/reference genes (usually seven or more). The genes selected for a particular species are present in all isolates of that species. The sequence variants that occur in each gene are called alleles. Each isolate will therefore have a specific allele at each housekeeping/reference locus. Allele numbers are assigned using an online programme specific to the MLST scheme, and the combination of allele numbers across all loci (i.e. the allelic profile) defines a particular sequence type (ST). Isolates with identical profiles are genetically related and belong to the same clone, and evolutionary relationships can be investigated. Strains from around the world can be compared. The data obtained are unambiguous and portable [2,14,19].
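The assignment of allele numbers and sequence types can be sketched as two table lookups. The example below is a toy illustration only: the three loci, the six-base "sequences", the allele numbers and the ST table are all hypothetical, whereas a real scheme uses seven or more loci of several hundred bases each and a curated online database.

```python
# Toy scheme with three hypothetical loci (real schemes usually use seven or more).
LOCI = ("geneA", "geneB", "geneC")

# Hypothetical allele database: locus -> {allele sequence: allele number}.
ALLELES = {
    "geneA": {"ATGGCA": 1, "ATGGCG": 2},
    "geneB": {"TTGACC": 1, "TTGATC": 2},
    "geneC": {"GGCTAA": 1, "GGCTGA": 2},
}

# Hypothetical ST table: allelic profile -> sequence type.
SEQUENCE_TYPES = {
    (1, 1, 1): "ST1",
    (2, 1, 1): "ST2",
    (2, 2, 2): "ST3",
}


def allelic_profile(isolate_sequences):
    """Map each locus sequence to its allele number (None if not in the database)."""
    profile = []
    for locus in LOCI:
        sequence = isolate_sequences[locus].upper()
        profile.append(ALLELES[locus].get(sequence))
    return tuple(profile)


def sequence_type(profile):
    """Look up the ST for a profile, flagging alleles or profiles not yet described."""
    if None in profile:
        return "novel allele"
    return SEQUENCE_TYPES.get(profile, "novel ST")


isolate_x = {"geneA": "ATGGCG", "geneB": "TTGACC", "geneC": "GGCTAA"}
isolate_y = {"geneA": "ATGGCA", "geneB": "TTGATC", "geneC": "GGCTGA"}

print(allelic_profile(isolate_x), sequence_type(allelic_profile(isolate_x)))  # (2, 1, 1) ST2
print(allelic_profile(isolate_y), sequence_type(allelic_profile(isolate_y)))  # (1, 2, 2) novel ST
```

Because the result is a short tuple of integers rather than a gel image, profiles can be exchanged and compared between laboratories without re-running samples, which is what makes the data portable.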
Next generation sequencing (NGS) has greatly advanced DNA sequence-based methods, as genomics and whole genome sequencing (WGS) coupled with a bioinformatics approach enhance the knowledge and understanding of clinical microbiology and infectious diseases [20]. This method is likely to replace traditional typing methods, other sequence-based methods and resistance/virulence gene detection as it allows extensive strain characterisation and epidemiological analyses. Analysis may be used to extract information on the entire genome, the exome, a selection of genes, high-resolution quantitative comparison between copies of different chromosomal segments or chromosomes, and mobile genetic elements such as plasmids. This sequencing method produces large quantities of data. Modern computational methods (bioinformatics analyses) are then applied to the data to assemble the sequence reads into larger contigs that can be constructed into almost complete genomes. Genomic characterisation can then be performed (e.g. bacterial identification, gene identification and gene annotation). Comparative genomics can also be performed to determine the genetic relatedness between strains, and phylogenetics can be applied to establish routes and sources of transmission [20,21]. Information on pathogen detection and identification, epidemiological typing and drug susceptibility can therefore be obtained in a single run. However, although information on the presence or absence of resistance genes can be obtained, there is the limitation of not having a phenotypic antimicrobial susceptibility profile, and resistance could be wrongly excluded in cases where the causative mutations are not yet known. Nevertheless, NGS methods are extremely useful for surveillance and outbreak investigation purposes [22].
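One simple way in which comparative genomics expresses genetic relatedness is the number of SNP differences between each pair of isolates. The sketch below assumes that the genomes have already been assembled and aligned to equal length, which is the demanding part in practice; the isolate names and ten-base sequences are hypothetical, and positions containing ambiguous bases or gaps are skipped as a simplifying assumption.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned sequences carry different bases.

    Positions where either sequence has an ambiguous base (e.g. N) or a gap
    are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a in VALID_BASES and b in VALID_BASES
    )


def pairwise_distances(alignment):
    """Return {(name_1, name_2): SNP distance} for every pair of isolates."""
    return {
        (name_1, name_2): snp_distance(alignment[name_1], alignment[name_2])
        for name_1, name_2 in combinations(sorted(alignment), 2)
    }


# Hypothetical aligned sequences for three isolates.
alignment = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAT",
    "isolate_3": "ACCTACGAAT",
}

for pair, distance in pairwise_distances(alignment).items():
    print(pair, distance)
```

A matrix of such distances is the usual input for clustering or tree-building; the number of SNPs below which isolates are considered part of the same transmission chain depends on the organism and is not addressed here.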

Molecular diagnostics and typing in South African surveillance programmes and outbreak investigations
Molecular diagnostics and typing methods have been useful in surveillance programmes and outbreak investigations.
In response to the Ebola virus outbreak in West Africa, the South African National Institute for Communicable Diseases (NICD) established a modular high-biosafety field Ebola diagnostic laboratory in Sierra Leone in August 2014. The laboratory consisted of multiple rooms equipped with air conditioning units, emergency electricity generators, a back-up diesel-powered generator, uninterrupted power supply (UPS) units, storage equipment such as biocontainment devices, real-time PCR instruments, refrigerators, freezers and other critical equipment. Designated rooms were allocated for centrifugation, aliquoting and inactivation of specimens, addition of positive control ribonucleic acid (RNA), reverse transcriptase (RT)-amplification of the RNA templates, PCR mastermix preparation, RNA extraction and storage. The laboratory contained a biocontainment negative pressure chamber which included an airlock, main chamber and an air filtration system. Strict decontamination procedures were adhered to and PPE was used at all times. Laboratory staff underwent rigorous training to ensure the safety and integrity of laboratory procedures. Between 25 August 2014 and 22 June 2016, 11256 specimens were tested, of which 2379 (21.1%) were positive for Ebola virus RNA. Although challenges were experienced in the establishment of the laboratory, rapid molecular laboratory confirmation was crucial for the detection of Ebola Virus Disease cases, contact tracing and secure burial practices in Sierra Leone [23].
A public healthcare sector surveillance study [24] conducted in four provinces (Gauteng, KwaZulu-Natal, Free State and the Western Cape) in South Africa from June 2010 to July 2012 aimed to gain an understanding of the molecular epidemiology and antimicrobial resistance trends of S. aureus bacteraemia. A series of molecular techniques was employed; these included a real-time PCR assay to screen for the presence of the methicillin resistance determinant, mecA, and the species-specific nuc gene, and a conventional PCR assay for SCCmec typing to determine the circulating SCCmec element types. In addition, SLST for the amplification and sequencing of the spa gene and MLST to determine circulating STs were performed. Methicillin resistance was confirmed in 43% of the isolates (n = 1160) due to the presence of the mecA gene; the most common SCCmec type was type III (41%, n = 531) followed by type IV (31%, n = 402), which are typically associated with hospital-acquired and community-acquired infections, respectively. Additionally, spa-typing revealed 47 different spa-types, with the five most common spa-types (t037, t1257, t045, t064 and t012) accounting for 87%, and the most common ST was ST612 (clonal complex 8, CC8) (n = 7) followed by ST5 (CC5) (n = 4), ST36 (CC30) (n = 4) and ST239 (CC8) (n = 3). Findings from this study correlated with previous studies in South Africa [25][26][27]. This study demonstrated the importance of monitoring molecular typing trends to detect changing epidemiological trends in antimicrobial resistance patterns [24].
Another surveillance study conducted from mid-2010 to mid-2012 in 13 academic centres in the public healthcare sector in five South African provinces (Gauteng, KwaZulu-Natal, Free State, Western Cape and Limpopo) made use of real-time multiplex PCR assays to determine the presence of extended-spectrum beta-lactamases (ESBLs) in a subset (n = 270) of Klebsiella pneumoniae blood culture isolates. The presence of ESBL genes (blaCTX-M, blaSHV and blaTEM) was confirmed in all isolates. Furthermore, 93% of the isolates tested carried more than one resistance gene. In addition to screening for ESBLs, the carbapenemase genes blaKPC and blaNDM-1 were screened for using a multiplex real-time PCR assay. Of the phenotypically carbapenem non-susceptible isolates (5%), none contained blaKPC or blaNDM-1. This study showed a high proportion of ESBL-producing K. pneumoniae isolates, which is concerning [28].
A cross-sectional surveillance study evaluated the phenotypic Microscan Walkaway system for the detection of carbapenem-resistant Enterobacteriaceae (CRE) in isolates submitted from public healthcare sector laboratories in four provinces (Gauteng, KwaZulu-Natal, Free State and Western Cape) in South Africa from July 2015 to July 2016, using a multiplex real-time PCR assay as the gold standard. Of a total of 219 isolates tested, 173 (78.9%) were positive for carbapenemases. The predominant carbapenemase gene was blaNDM (38.8%; n = 85), followed by blaOXA-48 and its variants (32.8%; n = 72), blaVIM (6.9%; n = 15) and blaGES (0.5%; n = 1) [29]. The findings from this study clearly showed that carbapenemase production was driven largely by blaNDM and blaOXA-48 and its variants, which is in agreement with other South African studies from both the public and private healthcare sectors [30][31][32].
A possible outbreak of carbapenem-resistant Enterobacter cloacae producing OXA-48-, VIM- and IMP-type beta-lactamases, involving isolates collected from five hospitals in the Eastern Cape from January 2013 to April 2014, was also investigated using a series of molecular typing methods [33]. A combination of conventional and real-time PCR assays was used for the detection of the carbapenemase genes, the isolates were typed using MLST and a further subset of isolates (15 IMP-positive isolates) was processed using PFGE. Results showed that 15 isolates contained blaIMP, two harboured blaVIM and one was positive for blaOXA-48. PFGE of the IMP-producing strains identified three major clusters and phylogenetic analysis showed that all isolates were related. MLST results revealed seven different STs plus a new ST for five isolates. The ST for four isolates could not be obtained. Phylogenetic analysis showed that all strains shared a common ancestor and were distantly related, as evidenced by the number of different STs detected. Those belonging to the same ST clustered together. The authors state that these results show that horizontal transmission of organisms is potentially an important factor to take into consideration in the hospital setting. They further illustrated a timeline of each isolate from all five hospitals over the 16-month period; this indicated that horizontal transmission was possible and that identical STs within the same hospital as well as among different hospitals could reflect intra- and inter-clonal spread. Since epidemiological data were not obtained to support this, the authors state that transmission cannot necessarily be concluded [33].
An increased number of human cases of Listeria monocytogenes infection were identified in hospitals in the Western Cape province of South Africa in September 2015. Three L. monocytogenes isolates from blood culture specimens were obtained. WGS was performed and the genomic sequence data were used to add to the limited epidemiological data that existed for the organism in the South African setting at the time. MLST revealed that all isolates belonged to ST6, a subtype associated with unfavourable patient outcomes [34].
In another South African study investigating Salmonella Enteritidis outbreaks, MLVA was performed as it is a useful technique for highly homogeneous serotypes such as Salmonella Enteritidis. Thirty-nine isolates from seven foodborne illness outbreaks that occurred in six provinces between 2013 and 2015 were investigated. Among all isolates, three MLVA profiles were identified. All isolates within each outbreak produced the same MLVA profile. Profile 28 accounted for the majority of the isolates (n = 30), followed by profile 22 (n = 6) and profile 21 (n = 3). Phylogenetic analysis was carried out and the minimum spanning tree (MST) revealed a close relationship between all three MLVA profiles, with only a single VNTR locus difference between them [35].
An earlier South African outbreak of cholera was investigated using molecular typing methods. Between November 2008 and April 2009, 720 Vibrio cholerae O1 strains were characterised by serotype testing and antimicrobial susceptibility testing. A subset of 248 isolates was investigated by PFGE and a smaller subset of 90 isolates was further characterised by conventional PCR of virulence determinants, namely the cholera toxin (CT) enzymatic A subunit (ctxA) and toxin coregulated pilus (tcpA) genes. Two isolates were selected for sequencing of the complete CT coding region (ctxAB gene). A number of molecular mechanisms conferring antimicrobial resistance were further investigated in the 90 isolates. PFGE yielded 25 different banding patterns. However, only subtle differences were observed, suggesting that all isolates were closely related. The most common banding pattern was seen in 64 isolates (25.8%). The 90 isolates investigated for virulence determinants were positive for both the ctxA and tcpA-El Tor genes. Mutations were seen in the ctxB region of the ctxAB gene of the two isolates sequenced. Antimicrobial resistance mechanisms were observed: tetA (7.8%, n = 7), SXT element-integrase (100%, n = 90) and the ESBL gene blaTEM (6.7%, n = 6). All 90 strains harboured chromosomal mutations in the gyrA and parC genes. Isolates described in this study were multidrug resistant as they displayed resistance to at least three classes of antimicrobial agents. Virulence determinants and antimicrobial resistance genes were identified and PFGE proved to be a valuable tool as it was able to differentiate strains in this outbreak from other banding patterns of South African strains in the investigators' database, although the data were not shown. Furthermore, the most commonly identified pattern was extremely similar to the predominant banding patterns from a Haiti cholera outbreak and this was confirmed by WGS analysis [36].
Another South African study investigating an outbreak of Neisseria meningitidis serogroup W135 used PFGE and MLST to characterise strains. PFGE was performed on 93% (n = 377) of the 406 serogroup W135 isolates and MLST was performed on a subset of 20 isolates. PFGE showed the isolates to be highly clonal, with 350 (93%) isolates falling into a distinct cluster. The majority (80%) of isolates in this cluster were indistinguishable from, or differed by a single band from, isolates from two previous Hajj-related outbreaks in 2000. Thirteen of the isolates in this cluster were typed by MLST and all were ST11 of the ST11/ET37 complex. Although isolates in this cluster were from all nine provinces in South Africa, the majority (81%, n = 285) were from Gauteng province. The remaining isolates comprised two smaller clusters (12 and 4 isolates) and 11 isolates that did not fall into any cluster. The remaining seven isolates typed by MLST represented different STs. The authors state that this is concerning because the (W) ET-37 clone is causing endemic disease in the country [37].
A study characterising 21 Corynebacterium diphtheriae outbreak isolates from KwaZulu-Natal, South Africa using whole genome sequencing showed that the outbreak was caused by a single strain. This strain belonged to a novel ST which was not part of any known clonal complex. The strain was toxigenic and unrelated to the nontoxigenic strain that was also isolated during the outbreak. Furthermore, when this ST was compared to other historical, non-outbreak-associated South African isolates and other documented C. diphtheriae isolates worldwide, the outbreak strain was not related to them [38].
An outbreak of 17 cases of lymphocutaneous sporotrichosis among mine workers in a gold mine in South Africa in 2011 was reported in a descriptive cross-sectional study that made use of molecular typing methods to characterise Sporothrix isolates. Isolates were identified by sequencing the internal transcribed spacer region of the ribosomal gene which allows identification to the species complex level and the nuclear calmodulin gene which allows identification to the cryptic species level. Phylogenetic analysis was performed to investigate genetic relatedness. Clinical isolates were confirmed to be S. schenckii sensu stricto and environmental isolates were identified as S. mexicana by calmodulin gene sequencing. This was confirmed by phylogenetic analysis where the sequences of the clinical isolates clustered closely with the S. schenckii sensu stricto type reference strain as well as South African S. schenckii sensu stricto clinical isolates and the sequences of the environmental isolates clustered closely with the S. mexicana type strain. This was the first occurrence of S. mexicana in South Africa. Although clinical and environmental isolates belonged to different species, the source of infection was thought to be contaminated soil and untreated rotting wood in the underground mine levels [39].

Conclusion
Molecular laboratories are fast becoming widespread as molecular typing techniques are proving to be effective in the characterisation of microorganisms. This is of particular importance for public health surveillance and outbreak investigation purposes. When coupled with epidemiological data, molecular typing is even more valuable as it provides information on the source of infection and routes of transmission. Considering challenges in setting up molecular laboratories, we emphasize the importance of a stepwise approach in developing countries.