Rega Institute for Medical Research, Leuven, Belgium,1 Stanford University, Stanford, California,2 Hospital Egas Moniz, Lisbon, Portugal,3 BC Center for Excellence in HIV/AIDS, Vancouver, Canada,4 Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil,5 Health Protection Agency, Salisbury, United Kingdom,6 Wright Fleming Institute, London, United Kingdom,7 National Institute for Communicable Diseases, Johannesburg, South Africa,8 Chulalongkorn University, Bangkok, Thailand,9 National Institute of Infectious Diseases, Tokyo, Japan,10 Hospital Carlos III, Madrid, Spain,11 Ministry of Health, Tel Aviv, Israel,12 Instituto Adolfo Lutz, Sao Paulo, Brazil,13 Fundación Huesped, Buenos Aires, Argentina,14 Laboratory of Virology, Bichat, Claude Bernard Hospital, Paris, France,15 Bayer Health Care-Diagnostics, Toronto, Canada,16 HPA Antiviral Susceptibility Reference Unit, Birmingham, United Kingdom,17 National Hemophilia Center, Sheba Medical Center, Tel Aviv, Israel,18
Received 5 May 2005/ Returned for modification 25 August 2005/ Accepted 26 November 2005
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Although genotyping is commonly used, there are still many uncertainties with respect to the value of genotype in the assignment of a new regimen. The current genotypic assays are not always able to report all drug resistance mutations among non-B subtypes (11, 18, 19, 24). Regardless of subtype, genotyping is not sensitive to mutations that are present as a minor variant in the population (22, 40). Genotyping results also differ depending on the laboratory where they are performed. Quality control studies indicate that mutations, even present as a pure variant, are often underestimated (32).
However, separate from the quality and sensitivity issues, the interpretation of genotypic results is still not standardized. Several interpretation algorithms have been designed to aid in this, but they may differ in the prediction of therapy response and/or drug susceptibility. Studies were performed mainly on subtype B viruses, and even within this subtype, differences have been detected (6, 21, 29, 34, 35, 36).
Non-B subtypes are a challenge for these systems, since algorithms for these subtypes were designed using genotype, phenotype, and therapy response information that was largely derived from experience with subtype B. Recent analyses suggest that non-B viruses can develop specific mutations that differ from those identified in subtype B under the same treatment pressure (1, 20). For example, in CRF01_AE but not in subtype B viruses, V75M seems to be significantly associated with stavudine treatment (2) and, in subtype C but not in subtype B, V106M is a signature substitution of patients treated with efavirenz (4). There is a continuing controversy about the impact of secondary protease mutations (positions 36, 71, 77, etc.) which evolve in subtype B following protease exposure and are relatively frequent in untreated patients with non-B subtypes. It has been suggested that some of these can affect the susceptibility to certain protease inhibitor (PI) therapies in B and non-B subtypes (14, 28).
Although some short-term studies suggest little difference in therapy response between patients carrying non-B subtypes and patients infected with subtype B (12), other studies showed a significant difference in responses to treatment for different subtypes (8, 13). However, current studies have included a limited number of subjects. Potential differences may be due to differences in drug resistance. It is therefore important to know how the current drug resistance interpretation systems perform on different subtypes, and first of all, we need to know what the subtype-dependent discrepancies between the systems are.
Comparisons between these interpretation systems have already been made for subtype B strains; however, the subtype dependency of resistance assessment by these interpretation systems has not yet been determined (6, 21, 29, 34, 35, 36). In this study, we investigated four frequently used interpretation systems across a large number of non-B sequences to determine whether discordance between the systems was dependent on the viral subtype.
| MATERIALS AND METHODS |
|---|
Subtyping. Subtyping was performed by phylogenetic analysis using the subtyping tool developed by de Oliveira et al. separately for protease and reverse transcriptase sequences (7). Briefly, sequences are first analyzed using pure subtypes as a reference; in a second step, known circulating recombinant forms are added to the alignment. To detect recombination, bootscanning was performed using a sliding window of 400 nucleotides that was advanced 20 nucleotides at a time. Recombinants were included only if they were CRF01_AE or CRF02_AG since we had sufficient data for only these two circulating recombinant forms.
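The window layout used for bootscanning can be illustrated with a short sketch. Only the window size (400 nucleotides) and the step (20 nucleotides) come from the text; the function name `sliding_windows` and the example sequence length are hypothetical, and the phylogenetic analysis performed on each window is not shown.

```python
def sliding_windows(seq_len, window=400, step=20):
    """Return (start, end) index pairs for windows of `window` nucleotides
    advanced `step` nucleotides at a time along a sequence of length `seq_len`."""
    if seq_len < window:
        return []  # sequence too short for even one full window
    return [(start, start + window)
            for start in range(0, seq_len - window + 1, step)]


# Hypothetical example: a 1,000-nucleotide alignment.
windows = sliding_windows(1000)
```

Each pair is a half-open interval, so `alignment[start:end]` would select the columns of one window.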
Algorithms. Four publicly available algorithms were applied on each of the sequences: Agence Nationale de Recherche sur le SIDA (ANRS) July 2004 (http://www.sante.gouv.fr/htm/actu/36_vih_2.htm) (25), HIV RT and Protease Sequence Database (HIVDB) August 2004 (http://hivdb.stanford.edu) (33), Rega Institute (Rega) version 6.3 (http://www.kuleuven.be/rega/cev/pdf/ResistanceAlgorithm6_3.pdf) (39), and Bayer Health Care-Diagnostics (VGI) version 8 (30) (formerly Visible Genetics).
Mutations considered. In all statistical analyses (see below), we scored all mutations that are included in one of the algorithms we used in the analyses: 18 NRTI resistance positions, i.e., 41, 44, 62, 65, 67, 69, 70, 74, 75, 77, 115, 116, 118, 151, 184, 210, 215, and 219; 16 NNRTI resistance positions, i.e., 98, 100, 101, 103, 106, 108, 179, 181, 188, 190, 225, 227, 230, 234, 236, and 318; and 23 PI resistance positions, i.e., 10, 20, 24, 30, 32, 33, 36, 46, 47, 48, 50, 53, 54, 60, 63, 71, 73, 77, 82, 84, 88, 90, and 93. For most positions, more than one mutant amino acid can be scored. All mixtures at resistance positions were scored as mutants.
Scoring of discordances, statistical analyses, and data mining. The algorithm specification interface at the web site for the Stanford HIV drug resistance database (http://hivdb.stanford.edu) was used to apply the interpretation algorithms to each sequence (3). We assigned three levels of resistance: susceptible (S), intermediate (I), and resistant (R). For HIVDB, which assigns five levels of resistance, we obtained three by pooling the two highest and two lowest categories.
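The pooling of five levels into three can be sketched as follows. The numeric encoding (1 for the lowest category through 5 for the highest) and the function name `pool_five_to_three` are assumptions made for illustration; the text specifies only that the two lowest and the two highest categories were pooled.

```python
def pool_five_to_three(level):
    """Map an ordered five-level score (1 = lowest, 5 = highest) to S, I, or R.

    The two lowest levels are pooled into S, the two highest into R,
    and the middle level is kept as I.
    """
    if level not in (1, 2, 3, 4, 5):
        raise ValueError("level must be an integer from 1 to 5")
    if level <= 2:
        return "S"
    if level == 3:
        return "I"
    return "R"


pooled = [pool_five_to_three(level) for level in range(1, 6)]
```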
Interpretations were considered concordant if each of the algorithms assigned the same level of resistance to a sequence for a particular drug. We considered the algorithms to be fully discordant if one of them scored the sequence S for a particular drug, and another one scored it as R. Interpretations were considered partially discordant when, among the scores of the different systems, both S and I or both R and I were found for the same drug. The numbers of fully discordant (counted as 1) and partially discordant (counted as 0.5) strains were added to compute the proportion of discordant strains.
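A minimal sketch of this scoring scheme is given below. The function names and the example scores are hypothetical; the definitions of concordant, partially discordant, and fully discordant, and the weights of 0.5 and 1, follow the description above.

```python
# Weights used when summing discordances, as described in the text.
WEIGHTS = {"concordant": 0.0, "partial": 0.5, "full": 1.0}


def classify(scores):
    """Classify the S/I/R scores given by the algorithms for one sequence and drug."""
    levels = set(scores)
    if len(levels) == 1:
        return "concordant"      # every algorithm assigned the same level
    if {"S", "R"} <= levels:
        return "full"            # at least one S and at least one R
    return "partial"             # S with I, or R with I


def discordance_proportion(per_sequence_scores):
    """Weighted proportion of discordant sequences for one drug."""
    if not per_sequence_scores:
        raise ValueError("at least one sequence is required")
    total = sum(WEIGHTS[classify(scores)] for scores in per_sequence_scores)
    return total / len(per_sequence_scores)


# Hypothetical scores from four algorithms for four sequences.
example = [
    ["S", "S", "S", "S"],  # concordant
    ["S", "I", "S", "S"],  # partially discordant
    ["S", "R", "I", "R"],  # fully discordant
    ["R", "R", "R", "R"],  # concordant
]
proportion = discordance_proportion(example)  # (0 + 0.5 + 1 + 0) / 4
```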
Statistical analyses were performed to see whether the number of discordances was drug and subtype dependent. We performed a one-way analysis of variance (ANOVA) with Tukey's confidence intervals to check for differences between different drugs and different subtypes. Only differences between subtype B and each of the other subtypes were analyzed in this study.
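For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the sketch below. This is an illustration only: it assumes that each sequence contributes a discordance score of 0, 0.5, or 1, it uses made-up data, and it omits the P value and Tukey's confidence intervals, which would normally be obtained from a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numeric values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual values around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-sequence discordance scores for two subtypes.
subtype_b = [0, 0, 0.5, 0.5]
subtype_c = [0.5, 1, 1, 0.5]
f_stat, df_b, df_w = one_way_anova_f([subtype_b, subtype_c])
```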
The data mining program Weka, version 3.4.4 (http://www.cs.waikato.ac.nz/ml/weka/), was used to identify mutational patterns that were responsible for the observed discordances, thereby also identifying the algorithms that caused the discordances. We used this tool to build binary decision trees that attempt to predict all observed discordances. To evaluate the predictive power of the decision trees, we performed a 10-fold cross-validation. In this method, the data set is split into 10 subsets, and the predictive performance for each subset is evaluated using a decision tree trained on the other nine subsets.
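The splitting step of a 10-fold cross-validation can be sketched as follows. This is a generic illustration, not the procedure implemented inside Weka; the function name `k_fold_indices`, the seed, and the data set size are hypothetical, and the training of the decision tree itself is left out.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)       # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k disjoint subsets
    for i, test in enumerate(folds):
        # Train on every subset except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 25 sequences.
splits = list(k_fold_indices(25, k=10, seed=1))
```

Each sequence appears in exactly one test subset, so every observation is predicted once by a tree that was not trained on it.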
We built a model for each drug in which we found a statistically significant effect of subtype on discordance. We included all subtypes in the model and tried to predict discordances (three levels, concordant, discordant, and partially discordant). For each leaf in the resulting tree that predicted discordance, we calculated the subtype distribution. Fisher exact tests were performed to analyze whether a rule in the decision tree explained significantly more discordances for a particular subtype.
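A Fisher exact test on a 2x2 table can be computed directly from the hypergeometric distribution, as in the sketch below. The layout of the table (for example, subtype in the rows and "covered by the rule" versus "not covered" in the columns), the two-sided form of the test, and the counts are assumptions for illustration; the text does not specify them.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical counts: 3 of 4 versus 1 of 4 discordances explained by a rule.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```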
| RESULTS |
|---|
Protease inhibitor analysis. The number of discordances seemed to be drug and subtype dependent for therapy-naive patients as well as treated patients (Tables 2 and 3).
In treated patients, the results were different. The highest level of discordance was obtained for amprenavir (50%), whereas 36% of the sequences were scored as discordant for lopinavir and 14% for nelfinavir. Tipranavir still gave the fewest discordant results; only 2% of the sequences caused discordances between algorithms. Compared to subtype B, more discordances were observed for nelfinavir in subtype F and for indinavir and saquinavir in subtype G (P < 0.01), while fewer discordances were observed for amprenavir in subtypes C and D and for atazanavir in subtype C (P < 0.01) (Table 3).
Nonnucleoside reverse transcriptase inhibitor analysis. For therapy-naive patients, no differences could be found between drugs, while for treated patients, efavirenz scored the most discordances (11%), followed by delavirdine and nevirapine (5%).
The proportion of sequences displaying full or partial discordances was subtype dependent in this drug class except for delavirdine and nevirapine in naive patients. But no specific subtypes were found that had differences in the resistance interpretation compared to subtype B.
Nucleoside reverse transcriptase inhibitor analysis. Zidovudine (AZT) was responsible for most of the discordances in therapy-naive patients (1.6% of the sequences); didanosine (ddI) was responsible for most of the discordances in treated patients (54%). The difference between drugs in this class was significant for both therapy-naive (Table 2) and therapy-experienced (Table 3) patients.
For zidovudine, zalcitabine, and stavudine in the naive population, the number of discordances was associated with subtype (P < 0.01). Only for stavudine was subtype C found to display fewer discordances than subtype B.
The number of discordances was significantly associated with subtype for all drugs in therapy-experienced patients (P < 0.01). For lamivudine and emtricitabine, CRF01_AE seemed to display significantly more discordances than subtype B. Subtypes C and D had fewer discordant interpretations for didanosine, and subtype C also had fewer for zalcitabine. For tenofovir, several non-B subtypes (A, C, D, and G) had fewer discordant results than subtype B.
Mutational features of the subtype dependency. The results have been summarized in Table 4.
For subtype C, the most frequent pattern that caused partial discordances was a combination of protease (PRO) 82V/I + 63P + 36V/I. This pattern significantly explained more partial discordances for subtype C than for subtype B (P < 0.0001). This seemed due to the HIVDB interpretation algorithm. All subtype C sequences displaying this pattern also had the PRO 93L mutation. This mutation is taken into account for only nelfinavir by the HIVDB algorithm, which scores this pattern as intermediate, while all other algorithms score these sequences susceptible.
Two rules were discovered in the tree for subtype G that explained significantly more discordances than subtype B. One was a rule very similar to that for subtype C, PRO 82I + 63P + 36I (P = 0.04), and the other rule was PRO 82I + 63mt (any mutation) + 20I (P = 0.01). In practice, these rules cover the same sequences, as all subtype G sequences with the first pattern also harbor a mutation at position PRO 20 and all sequences with the second pattern also harbor a mutation at position PRO 36. Again, these discordances were due to the HIVDB algorithm, which is the only one that takes into account mutations at position PRO 20 and gives a rather high weight for the PRO 82I mutation for nelfinavir.
For ritonavir, subtype F caused more discordances than subtype B. We found a rule, PRO 20R + 10V/I, in the decision tree explaining significantly more subtype F partial discordances than those observed in subtype B. An example of the Weka decision tree with subsequent statistical analyses is shown in Fig. 2. Those subtype F sequences all had the PRO 36I mutation and thus harbored three secondary PI mutations. The Rega algorithm scores this as intermediate for ritonavir, while all other algorithms score this as susceptible.
For the PI saquinavir in therapy-experienced patients, the full discordances observed in subtype G sequences could be attributed to mutations PRO 90M + 82I. This was due to the ANRS interpretation system, which does not score this as resistant (as HIVDB and VGI did) if PRO 82I is present. Only PRO 82A is taken into account by ANRS.
For indinavir, subtype G also displayed more discordances than subtype B, apparently due to PRO 90M + 82I + 54V, which was scored as resistant by HIVDB and ANRS because all these samples also had the PRO 36I mutation. Another rule predictive for discordance was PRO 90M + 82I + 71T + 20I. The Rega system scores this pattern as susceptible, since the PRO 90M mutation by itself is not scored as resistant by this algorithm.
Subtype F caused more discordances for nelfinavir in treated patients. The PRO 88S mutation was partially responsible for these discordances. The Rega algorithm considers these isolates to be susceptible, while the other algorithms scored them as at least intermediate. The partial discordances for subtype F are explained by PRO 82A + 54V. All these sequences also had PRO 36I, which is not considered resistant by ANRS relative to the other algorithms.
Subtype B displayed many discordances for amprenavir. In fact, the decision tree incorporated subtype in this model. The resulting rule was PRO 90M + 54V + 20R + 82A. All these sequences had an additional PRO 36I mutation, which is not included in the amprenavir rules of the Rega algorithm. This mutation pattern scored as intermediate for this system, while for the other algorithms, the additional PRO 36I mutation is responsible for the resistant score.
For atazanavir, subtype B caused a lot of discordances. The decision tree was very complex, and no clear rule had a high coverage and was predictive for the observed discordances in all subtypes. The atazanavir rules incorporate a number of mutations also observed for other PIs. Patients harboring a subtype B virus are probably treated with protease inhibitors more often and for a longer time, since subtype B has dominated since the beginning of the epidemic in countries where treatment was available and subsequently has been subject to drug selective pressure earlier. In these sequences, the large background of PI resistance mutations probably causes the discordances observed for atazanavir.
For lamivudine and emtricitabine (FTC), CRF01_AE scored more discordances than subtype B. For lamivudine resistance interpretation, this was caused by RT 65R + 151M (P < 0.05). ANRS scores the presence of each mutation separately as intermediate but does not provide a rule for the presence of both of them, while the Rega algorithm, for example, scores this combination as resistant.
For emtricitabine, no clear rules were found in the tree, although it seemed that RT 41L + 67N + 118I + 215Y caused most of the partial discordances observed for CRF01_AE. The Rega algorithm is the only one that scores the RT 67N mutation for FTC. VGI does not provide rules for FTC.
For didanosine, tenofovir, and zalcitabine, subtype B had many more discordant interpretations than a number of non-B subtypes. The decision trees were very complex, and for these drugs as well, no clear rules could be deduced.
| DISCUSSION |
|---|
In this study, performed on sequences obtained from 5,030 patients, we investigated subtype-dependent discrepancies between four commonly used interpretation systems (Rega 6.3, HIVDB-08/04, ANRS [07/04], and VGI 8.0). The versions analyzed were the ones available to us at the time of analysis. In the meantime, updates have become available for all of these systems. None of these systems includes subtype-dependent rules.
We did find drug- and subtype-dependent differences in the drug susceptibility/therapy response predictions of commonly used interpretation algorithms. We also identified mutational patterns that seemed to be partially responsible for the observed discordances.
Concordance was the lowest in the interpretation of therapy-experienced sequences, which means that it is less clear which mutations are really important for resistance development. This may explain some of the differences seen between algorithms in predicting treatment outcome (6). For lopinavir especially, the pathway towards resistance is unclear, which explains the high number of discordant results between the interpretation systems found in therapy-experienced patients (26, 27).
Our analyses revealed that the proportion of discordances between commonly used algorithms is subtype dependent for many drugs, in naive as well as in therapy-experienced patients. Concordance was higher in naive patients. However, non-B subtype sequences and subtype B sequences overall had equal numbers of resistance mutations. Both groups had mostly "wild-type" sequences. Therefore, the higher number of concordances is probably due to a larger agreement on what is a wild-type sequence.
In naive patients, discordances were found for nelfinavir (subtypes C and G). Incidentally, it is known that the pathway towards resistance for nelfinavir differs for subtypes C and G from that for subtype B. The PRO D30N mutation is not the preferred one as in subtype B; it seems that, rather, the PRO L90M is selected (15) (P. Gomes, I. Diogo, M. F. Gonçalves, et al., Abstr. 9th Conf. Retrovir. Opportunistic Infect., abstr. 46, 2002). We found mutational patterns that partially explained these discordances. Those were mostly due to combinations of secondary PI mutations, which are often present as a polymorphism in non-B subtypes. Some algorithms include these mutations in their rules, while others do not. The PRO 93L mutation, for example, is included only by HIVDB and not by the other systems. This mutation was present in all subtype C sequences with the pattern PRO 82I/V + 63P + 36I/V. Similarly for subtype G, the PRO 20I mutation is incorporated only by HIVDB.
For subtype F and ritonavir, the pattern PRO 20R + 10V/I also included the PRO 36I mutation. Three secondary PI mutations are scored as intermediate only by the Rega algorithm.
For NNRTIs, we did not find any subtype-dependent discordances in resistance scoring, although some differences in resistance development have already been reported for subtype C under efavirenz treatment (2).
For NRTIs, only in naive patients did we find that the proportion of discordances is subtype dependent for stavudine. Subtype C had significantly fewer discordances than subtype B due to a mutation at RT position 215 that occurred more frequently in subtype B sequences.
For PI resistance in treated patients, a lot of discordances are observed for subtype G in predicting resistance for saquinavir and indinavir and in subtype F for nelfinavir resistance prediction. The patterns observed here are related to a single algorithm that scores this differently. Differences often occur due to the presence of the PRO 36I mutation, which is present as a polymorphism in non-B subtypes. This mutation often triggers the switch to score an isolate as intermediate, while other systems do not take into account the substitution and consider the isolate to be susceptible. Apparently, there is no agreement on the role of some of these polymorphic resistance mutations in PI resistance.
For amprenavir and atazanavir, subtype B displayed many discordances for treated patients. The decision trees for these drugs were very complex. The tree for amprenavir included subtype as a node, so a rule, PRO 90M + 54V + 20R + 82A, could be deduced. For atazanavir, no clear rule was found. These two drugs have only recently come into use in clinical practice, and the pathway towards resistance is not fully understood yet. The algorithms mostly rely on the presence of a number of PI mutations rather than on clear rules.
For lamivudine and emtricitabine in treated patients, CRF01_AE scored more discordances than subtype B. Although resistance for both drugs is predicted by the same rules in the algorithms, different mutation patterns are found in the decision trees. For lamivudine resistance interpretation, this was caused by RT 65R + 151M. For emtricitabine, this was RT 41L + 67N + 118I + 215Y (although not statistically supported).
Tipranavir has a low number of discordances for naive patients as well as treated patients. This is mainly due to the limited amount of information that is available on resistance towards this drug (9). All algorithms are based on the same available information and thus predict the same level of resistance.
The four evaluated algorithms, in fact, belong to two different models. The Stanford algorithm assigns a score to each of the observed mutations and uses the sum to decide on the level of resistance, allowing complex patterns of mutations to be taken into account. The VGI, ANRS, and Rega algorithms are restricted to specific rules that describe specific mutational patterns. Discordance is therefore hard to avoid for complex patterns, since the two models take these into account in different ways.
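The difference between the two models can be made concrete with a toy sketch. All weights, cutoffs, and rules below are invented for illustration and are not taken from HIVDB, VGI, ANRS, or Rega; the function names are hypothetical as well.

```python
def score_based(mutations, weights, intermediate_cutoff, resistant_cutoff):
    """Sum a weight per observed mutation and compare the total to cutoffs."""
    total = sum(weights.get(m, 0) for m in mutations)
    if total >= resistant_cutoff:
        return "R"
    if total >= intermediate_cutoff:
        return "I"
    return "S"


def rule_based(mutations, rules):
    """Return the level of the first rule whose required mutations are all present."""
    for required, level in rules:
        if required <= mutations:  # subset test: every required mutation observed
            return level
    return "S"  # no rule matched


# Toy parameters (hypothetical, not from any real algorithm).
toy_weights = {"82I": 10, "36I": 5, "20I": 5, "90M": 30}
toy_rules = [(frozenset({"90M"}), "R"), (frozenset({"82A", "54V"}), "I")]

sample = {"82I", "36I", "20I"}
score_call = score_based(sample, toy_weights, intermediate_cutoff=15, resistant_cutoff=30)
rule_call = rule_based(sample, toy_rules)
```

In this toy case the accumulated weights of three mutations cross the intermediate cutoff, while no explicit rule covers the pattern, so the two approaches disagree on the same sequence.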
This study is not intended to draw conclusions on the validity of the different algorithms, but rather to identify mutation patterns that result in divergence between the algorithms, among different subtypes. The mutations and particularly the patterns of polymorphisms in non-B subtypes that are associated with viral resistance warrant further in vitro studies and ultimately need to be confirmed by clinical observation. We acknowledge, as a limitation of this study, the absence of measures of either in vitro or clinical resistance, which are phenotype and therapy outcome, respectively. However, the mutation patterns associated with discordance between the algorithms may identify the sequences of interest in larger datasets, obtained prospectively, and linked to viral load and/or CD4 data to correlate treatment outcomes.
In conclusion, the different algorithms agreed quite well on the level of resistance scored. However, where there are differences, in many cases these can be attributed to specific subtype-dependent combinations of mutations. The mutations found here should be investigated further to determine whether they contribute to differences in resistance and therapy response between different subtypes. Our expertise in interpretation of genotypic resistance will increase with a scale-up of treatment to include millions of individuals with non-subtype B virus infections.
| FOOTNOTES |
|---|
| REFERENCES |
|---|