Whole genome analysis of Human Mastadenovirus D causing Keratoconjunctivitis in India - A multicentre study

Introduction: Human mastadenovirus (HAdV) types 8, 37, and 64 have been considered the major contributors to epidemic keratoconjunctivitis (EKC) outbreaks, but recent surveillance data have shown the involvement of emerging recombinants, including HAdV-53, HAdV-54, and HAdV-56. In our initial work, sequencing of adenovirus-positive samples revealed that our strains were closer to HAdV-54 than to HAdV-8. Hence, the current study aimed to use whole genome sequencing to identify the HAdV strain correctly. Methodology: Oxford Nanopore sequencing was used with a targeted approach based on long-range PCR amplification. Primers were designed using HAdV-54 (AB448770.2) and HAdV-8 (AB897885.1) as reference sequences. Amplicons were sequenced on the GridION sequencer. Sequences were annotated using GATU software, and similarities with the standard reference sequences were calculated using BioEdit software. The phylogenetic tree was built after alignment in MEGA v7.0 using the neighbour-joining method for each of the genes: penton, hexon, and fiber. The effect of novel amino acid changes was evaluated using the PROVEAN tool. The Recombination Detection Program (RDP) package Beta 4.1 was used to identify recombinant sequences. Results: Of the five samples sequenced, OL450401, OL540403, and OL540406 showed nucleotide similarity to HAdV-54 in the penton region. Additionally, OL450401 showed a statistically significant recombination event with HAdV-54 as the minor parent and HAdV-8 as the major parent. This was further supported by phylogenetic analysis. Conclusions: In the present study, we found evidence of a shift from HAdV-8 towards HAdV-54, which stresses the need for surveillance of HAdVs and for staying updated on the rise of new recombinants.

Studies have shown that novel types arise, at least in part, from recombination between two or more viruses and can cause human infections that were not previously associated with their parental strains [9]. These viruses can be characterized correctly with whole genome sequencing and comprehensive recombination analysis. Recent literature shows a decline in HAdV-8, while recombinant serotypes have risen [10][11][12]. The initial characterization of HAdV-53 using DNA sequencing showed similarity with HAdV-22 in the hexon region, while the fiber gene was similar to HAdV-8. Later, whole genome sequencing, bioinformatics, and detailed in-vivo analysis revealed that it was a product of recombination between HAdV-37, HAdV-22, HAdV-8, and a fourth unknown virus that conferred corneal tropism. After HAdV-53, others, including HAdV-54 (previously known as HAdV-8I) and HAdV-56, were identified as being associated with EKC by whole genome sequencing and recombination analysis [13][14][15].
Scientists speculate that the likely prerequisites for recombination are co-infection with at least two adenoviruses that have highly similar nucleotide sequences at recombination hotspots in the genome, and long-term presence in the gastrointestinal, respiratory, and genitourinary tracts [16][17][18][19][20][21][22]. Additionally, immunosuppression, whether due to HIV infection or other causes, probably contributes to long-term viral persistence in affected patients and creates a conducive environment for co-infection [23].
In India, several outbreaks of EKC have been observed, with HAdV-8 being the most common causative agent [24], along with others such as HAdV-6 and HAdV-2 [25]. In our previously published report, based on sequencing a partial region of the hexon gene, we found the isolates to be phylogenetically closer to HAdV-54 than to HAdV-8 [26]. Studies have shown that HAdV-54 and HAdV-8 share 95% similarity along the entire genome [10,27]. Hence, the current study was undertaken to perform whole genome analysis using a next-generation sequencing method.

Methodology
This study is part of a collaborative effort to identify the causative agents of keratoconjunctivitis in India. The study was approved by the Ethics Committee of the Jawaharlal Institute of Postgraduate Medical Education & Research (JIPMER), and informed consent was obtained from the participants. A total of 709 samples were collected from five centres across India, and HAdV was observed to be the major contributor to keratoconjunctivitis (47.8%). Briefly, samples positive for HAdV by real-time PCR were subjected to typing using conventional PCR targeting the partial hexon region. All samples were typed as HAdV-8; representative samples from all centres were then subjected to partial hexon gene sequencing, which revealed that the sequences were closer to HAdV-54 than to HAdV-8. Hence, we performed a whole genome analysis in the present study to confirm the HAdV genotype in circulation. For this purpose, we chose a total of 11 samples that had a Ct value ≤ 25 and subjected them to whole genome sequencing using Oxford Nanopore technology.
A targeted sequencing approach using long-range PCR amplification was performed. Primers were designed using HAdV-54 (AB448770.2) and HAdV-8 (AB897885.1) as reference sequences. Eleven primer pairs (Supplementary Table 1) were designed to cover approximately 34 kb using the Primal Scheme and Primer3 integrated tool. Each primer pair covers approximately 3.5 kb of the genome, with an overlap of ~300 bases between adjacent amplicons. Of the 11 samples taken, 6 passed the multiplex PCR quality check (QC) and were found suitable for Nanopore library preparation and sequencing.
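As a rough sanity check on the tiling scheme described above, the following Python sketch computes the span covered by evenly tiled, overlapping amplicons. It assumes idealized amplicons of identical length and overlap; the function names are illustrative, and the actual primer coordinates are those listed in Supplementary Table 1.

```python
def tiled_span(n_amplicons: int, amplicon_len: int, overlap: int) -> int:
    """Total span covered by n equally sized amplicons tiled with a fixed overlap."""
    if n_amplicons < 1:
        raise ValueError("need at least one amplicon")
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be non-negative and shorter than the amplicon")
    # Every amplicon after the first contributes (amplicon_len - overlap) new bases.
    return amplicon_len + (n_amplicons - 1) * (amplicon_len - overlap)


def amplicon_windows(n_amplicons: int, amplicon_len: int, overlap: int):
    """Idealized (start, end) coordinates, 0-based and end-exclusive."""
    step = amplicon_len - overlap
    return [(i * step, i * step + amplicon_len) for i in range(n_amplicons)]


# Figures quoted in the text: 11 primer pairs, ~3.5 kb amplicons, ~300 bp overlap.
span = tiled_span(11, 3500, 300)
print(span)  # 35500
```

With these rounded figures the idealized span is about 35.5 kb, which is of the same order as the approximately 34 kb target; the difference is expected because the amplicon length and overlap are quoted only approximately.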
DNA from these six samples was end-repaired and cleaned with 1X AMPure beads. Native barcode ligation was performed with NEB Blunt/TA Ligase using EXP-NBD114 (ONT), followed by cleanup with 1X AMPure beads. Barcode-ligated DNA samples were quantified with Qubit and pooled at equimolar concentration, and adapter ligation (BAM) was performed for 15 minutes using the NEBNext Quick Ligation Module. The library mix was cleaned up using 0.6X AMPure beads, and finally, the sequencing library was eluted in 15 µL of elution buffer and used for sequencing. Sequencing was performed on GridION X5 (Oxford Nanopore Technologies, Oxford, UK) using SpotON flow cell R9.4 (FLO-MIN106) in a 48-hour sequencing protocol on MinKNOW 2.1 v18.08.3.
Nanopore raw reads ('fast5' format) were basecalled ('fastq' format) and demultiplexed using Guppy basecaller v2.3.5 [28], following which adapter trimming was done using the Porechop tool [29]. The processed reads were mapped against the reference genome (Human mastadenovirus 54, AB448770.2) using the Minimap2 aligner [30]. The mapped reads were then processed with the Samtools pipeline [31] to generate a consensus genome sequence. The workflow for sample processing and data analysis is given in Supplementary Figure 1. The consensus was polished using an IUPAC-based in-house script to reduce the error rate and improve the quality of the consensus. The genome coverage was calculated at 1000X (a minimum read depth of 1000 reads supporting each position across the consensus genome sequence). One sample showed a higher number of non-ATGC characters in the consensus sequence and was excluded from further analysis.
The remaining five sequences were annotated using GATU software [32], and similarities with the standard reference sequences were calculated using BioEdit software [33]. The phylogenetic tree was built after alignment in MEGA v7.0 using the neighbour-joining method for each of the genes: penton, hexon, and fiber [34].
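The in-house polishing script is not described in detail, so the following Python sketch only illustrates the general idea of IUPAC-based consensus calling with a depth threshold, together with the count of non-ATGC characters used as a quality criterion. The function names, the input format (one dictionary of base counts per position), and the 25% minor-allele fraction are assumptions made for illustration; only the 1000-read depth threshold comes from the text.

```python
# Standard IUPAC ambiguity codes for sets of nucleotides.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def call_base(counts, min_depth=1000, min_fraction=0.25):
    """Call one consensus base from per-base read counts at a single position.

    min_fraction is a hypothetical threshold: every base reaching this share
    of the reads is included in the IUPAC code.
    """
    depth = sum(counts.get(b, 0) for b in "ACGT")
    if depth < min_depth:
        return "N"  # not enough reads support this position
    supported = frozenset(
        b for b in "ACGT" if counts.get(b, 0) / depth >= min_fraction
    )
    return IUPAC.get(supported, "N")


def consensus(pileup, **kwargs):
    """Build a consensus string from a list of per-position count dictionaries."""
    return "".join(call_base(counts, **kwargs) for counts in pileup)


def count_non_atgc(seq):
    """Number of characters other than A, T, G, C in a consensus sequence."""
    return sum(1 for ch in seq.upper() if ch not in "ATGC")


# Toy pileup (invented counts): clear A, mixed C/T, low depth, clear T.
toy_pileup = [
    {"A": 1200},
    {"C": 700, "T": 600},
    {"G": 400},
    {"T": 1500, "A": 20},
]
toy_consensus = consensus(toy_pileup)
print(toy_consensus, count_non_atgc(toy_consensus))  # AYNT 2
```

A sample whose consensus carries many ambiguity codes or N characters under such a scheme would be flagged in the same way as the sample excluded above.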
The Recombination Detection Program (RDP) package Beta 4.1 was used in default mode to identify recombinant sequences [35]. A recombination event was considered reliable if it was significant at p < 0.01 in at least three of the seven selected algorithms: RDP, GENECONV, BootScan, MaxChi, Chimaera, SiScan, and 3Seq. The sequence of each of the five samples from the current study was used as the query sequence and compared to those of other HAdV-D isolates with a sliding window of 200 bp.
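The acceptance rule stated above (p < 0.01 in at least three of seven algorithms) can be written as a small check. This Python sketch is illustrative only: the function name and input format are assumptions, and the p-values in the example are invented, not the values obtained in this study.

```python
ALGORITHMS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


def is_reliable_event(p_values, alpha=0.01, min_methods=3):
    """Return (reliable, supporting_methods) for one candidate recombination event.

    p_values maps algorithm name to its p-value, or None if the algorithm
    did not detect the event.
    """
    supporting = [
        name for name in ALGORITHMS
        if p_values.get(name) is not None and p_values[name] < alpha
    ]
    return len(supporting) >= min_methods, supporting


# Hypothetical p-values for a single candidate event.
example = {
    "RDP": 1e-5, "GENECONV": 0.2, "BootScan": 3e-4, "MaxChi": 0.004,
    "Chimaera": 0.03, "SiScan": None, "3Seq": 0.5,
}
reliable, methods = is_reliable_event(example)
print(reliable, methods)  # True ['RDP', 'BootScan', 'MaxChi']
```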

Nucleotide sequence identities for penton, hexon, and fiber genes
Based on the nucleotide alignment of the individual gene sequences, the penton sequences of OL450401, OL540403, and OL540406 showed the highest identity with HAdV-54 (NC012959) (98.6%, 96.2%, and 95.5%, respectively), while for the hexon and fiber genes all the samples were closer to HAdV-8 (AB448767) (Table 2).
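For readers unfamiliar with how such identity percentages are derived, the Python sketch below computes pairwise percent identity from two already aligned sequences and picks the closest reference. It assumes that gap columns are excluded from the comparison; BioEdit may treat gaps differently, so this is not a reimplementation of that software. The sequences and reference names are toy data.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two aligned sequences, ignoring gap columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    matches = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def closest_reference(query, references):
    """Return (name, identity) of the reference most similar to the query."""
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy aligned sequences (not real adenovirus data).
query = "ACGTACGTAC"
refs = {"ref_A": "ACGTTCGTAC", "ref_B": "ACGAACGAAC"}
print(closest_reference(query, refs))  # ('ref_A', 90.0)
```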

Phylogenetic analysis
Phylogenetic analysis corroborated the identity analysis: in the penton region, OL450401 clustered closest to HAdV-54, followed by OL540403 and OL540406 (Figure 1A). For the hexon and fiber genes, all five samples were placed close to HAdV-8 (Figure 1B, 1C). These results suggested a recombination event in at least one of the three samples (OL450401, OL540403, or OL540406).

Recombination analysis
To identify recombination events within the genomes of these samples, recombination analysis was performed using the RDP4 package with multiple algorithms. In Table 3, amino acid (AA) residues in bold are similar to HAdV-54 (NC012959), and AA residues in italics represent unique changes found in the current study.
The results indicated a probable recombination event between HAdV-54 (minor parent) and HAdV-8 (major parent) in sample OL450401 (Figure 3). The recombinant region began at a breakpoint around position 14168 (without gaps) and ended at a breakpoint around position 18091 (without gaps), encompassing the genes penton (partial), pVII, pV, pX, pVI, and hexon (partial). None of the other samples showed any significant recombination event.
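The breakpoints reported above delimit a region of roughly 3.9 kb (18091 - 14168 = 3923 bases). A pairwise identity plot of the kind used to display such a region can be approximated with a sliding-window comparison of the query against each putative parent. The Python sketch below illustrates the idea on toy sequences; the function names, window size, and step are chosen for the example and do not reproduce the RDP4 implementation.

```python
def window_identity(a, b):
    """Percent identity between two equal-length, ungapped sequence windows."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def sliding_identity(query, reference, window=200, step=20):
    """List of (window_start, percent_identity) along two aligned sequences."""
    if len(query) != len(reference):
        raise ValueError("sequences must have the same aligned length")
    if window <= 0 or step <= 0 or window > len(query):
        raise ValueError("invalid window or step")
    return [
        (start, window_identity(query[start:start + window],
                                reference[start:start + window]))
        for start in range(0, len(query) - window + 1, step)
    ]


# Toy parents differing at every fourth base, and a mosaic query that
# carries the minor parent's sequence between positions 16 and 28.
major = "ACGT" * 10
minor = "ACGA" * 10
query = major[:16] + minor[16:28] + major[28:]

vs_major = sliding_identity(query, major, window=8, step=4)
vs_minor = sliding_identity(query, minor, window=8, step=4)

# Windows in which the query is closer to the minor parent than to the major one.
minor_like = [s for (s, maj), (_, mnr) in zip(vs_major, vs_minor) if mnr > maj]
print(minor_like)  # [16, 20]
```

In this toy case the windows starting at 16 and 20 lie entirely inside the inserted segment, which is the pattern a pairwise identity plot makes visible as a crossover between the two parental curves.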

Discussion
Recombination is a mechanism by which advantageous properties from different genomes are combined to form a new one, which can also eliminate deleterious mutations. Recent literature provides ample evidence for a rise in recombinants involving HAdV-8, which cause keratoconjunctivitis with varying degrees of severity [14,37].
Older techniques, such as partial gene sequencing or antibody neutralization tests, may not determine the types correctly; hence, whole genome analysis is recommended for identifying new strains of HAdVs [12,13]. With the advent of high-throughput sequencing techniques, including Nanopore, whole genome sequencing has become much cheaper and faster. In the current study, we included representative samples from a multicentre project across India to identify such recombinants. To the best of our knowledge, this is the first report of the presence of recombinant HAdVs in India.
Out of the five samples taken for sequencing, we observed a potential recombination event in one sample, OL450401, with HAdV-8 as the major parent and HAdV-54 as the minor parent. This recombinant region encompasses the partial penton, pVII, pV, pX, pVI, and partial hexon genes. Earlier studies have shown that the genes of the three major capsid proteins, penton, hexon, and fiber, are hotspots for recombination, which can likely be attributed to the increased host immune pressure on these regions [15,38].
Apart from a single study from Greece [39], HAdV-54 has been reported only in Japan since its first report in 2000 [13]. The probable reason could be the lack of EKC surveillance in most countries. The recombinant identified in the present study has HAdV-54 as the minor parent, indicating the probable presence of this genotype in circulation in India. Since it is believed that it could be present in the population without causing any symptoms, only surveillance measures can help to identify it.
Earlier reports state that classification using information from the three major capsid proteins is sufficient and reliable to identify HAdV genotypes [40,41]. Classification of OL450401 using the complete genome sequence was consistent with the phylogenetic trees built using the hexon, penton, and fiber genes. This supports the view that classification based on these three genes is a consistent and quick method.

Conclusions
In conclusion, the present study identified a potential HAdV recombinant in keratoconjunctivitis patients in India, with HAdV-8 and HAdV-54 as the major and minor parents, respectively. This suggests that HAdV-54 could also be in circulation in India and could contribute to the emergence of recombinants, which could lead to epidemics. These findings warrant rigorous surveillance of HAdV strains in India.

Figure 3. Pairwise identity plot displaying the region of recombination in OL450401.

Table 1. Genome characteristics and homology analysis of whole genome sequences.

Table 3. List of amino acid residues in the penton, hexon, and fiber regions in comparison with HAdV-8 (AB448767).

Table 4. Algorithms of the RDP4 package used to predict the recombination event.