
These indices show whether specific codons are used more or less often in the observed sequence data than expected. The expected usage of a codon is calculated as the total count of the amino acid it encodes divided by the number of synonymous codons that code for that amino acid. The RSCU value is then the ratio of the observed count of the codon to this expected count. The stop codons were included for this analysis. Trp and Met codons were excluded because each of these amino acids is encoded by a single codon. Preferred and non-preferred codons have RSCU > 1 and RSCU < 1, respectively. Based on this, each synonymous substitution site was examined to determine whether it corresponded to a preferred or a non-preferred codon.

The codon context analysis was performed using the Anaconda software [25, 26]. It includes a set of statistical and visualization methods to reveal information about codon context (sequential patterns of codons in a gene), codon usage bias, and nucleotide repeats within open reading frames (ORFeome). We used the cluster analysis tool, which calculates similarities between pairs of vectors from the contingency tables of codon frequencies, to group codon pairs (represented by rows and columns of the correlation matrix of residual values for each serotype). The cluster patterns represented global patterns of codon contexts within each serotype.

Analysis of recombination

Population recombination analyses in DENV were performed using the composite likelihood method of Hudson (2001) [27], adapted to finite-sites models (applicable to diverse genomes such as those of some viruses and bacteria) [28]. The PAIRWISE program included in the LDhat package (freely available at http://ldhat.sourceforge.net/), a suite of population genetic recombination tools [28], was used to analyze recombination in each serotype of DENV. PAIRWISE estimates the population-scaled recombination rate, 2Ner for haploid species, where Ne is the effective population size and r is the genetic map distance across the region. The composite likelihood method uses a finite-sites model to estimate the coalescent likelihood of two-locus haplotype configurations. The coding sequences of DENV genomes within each serotype were formatted with 'Convert', a program included in LDhat, to generate data files of the sites and positions of mutations in the sampled sequences. These files were then used in the PAIRWISE analysis to generate likelihood lookup tables for the sequence data of each serotype. The likelihoods were computed using the estimated Watterson's theta per site, with 100 as the maximum value of 2Ner on the grid and 101 as the number of grid points, as recommended.
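To make the lookup-table inputs above concrete, the following is a minimal sketch of how Watterson's theta per site and the 2Ner grid could be computed. It is not LDhat code. The function names `watterson_theta_per_site` and `rho_grid` are hypothetical, the grid is assumed to be evenly spaced from 0 to the maximum, and every distinct character in an alignment column (including gaps or ambiguity codes) is counted as a separate state, which is a simplification.

```python
def watterson_theta_per_site(seqs):
    """Watterson's estimator per site for equal-length aligned sequences.

    theta_W = S / (a_n * L), where S is the number of segregating sites,
    L is the alignment length and a_n = sum_{i=1}^{n-1} 1/i.
    """
    n = len(seqs)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and equal in length")

    # A column is segregating if more than one state is observed in it.
    # Simplification: gaps and ambiguity codes are counted as ordinary states.
    segregating = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / (a_n * length)


def rho_grid(rho_max=100.0, n_points=101):
    """Evenly spaced grid of 2Ner values from 0 to rho_max (assumed layout)."""
    step = rho_max / (n_points - 1)
    return [i * step for i in range(n_points)]


# Toy alignment: 3 sequences, 4 sites, 2 of them segregating.
toy = ["AAAA", "AAAT", "AATT"]
theta = watterson_theta_per_site(toy)
grid = rho_grid()  # defaults mirror the values quoted in the text
```

With a maximum of 100 and 101 points, this assumed grid has a step of exactly 1 between successive 2Ner values.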
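As an illustration of the RSCU calculation described in the codon usage analysis, the sketch below computes RSCU for a single in-frame coding sequence under the standard genetic code. The function name `rscu` and the toy sequence are hypothetical, and this is not the code used in the study. Following the text, stop codons are kept as a synonymous family, while Met and Trp are skipped.

```python
from collections import Counter

# Standard genetic code in the conventional TCAG ordering ('*' marks stops).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence (DNA alphabet)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by the amino acid (or stop signal) they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if len(family) == 1:  # Met and Trp have no synonymous alternatives
            continue
        total = sum(counts[c] for c in family)
        if total == 0:  # amino acid absent from this sequence
            continue
        expected = total / len(family)  # equal usage of all synonyms
        for c in family:
            values[c] = counts[c] / expected  # observed / expected
    return values


# Toy sequence: Ala is encoded by GCT three times and GCC once.
values = rscu("GCTGCTGCTGCCATGTGG")
preferred = sorted(c for c, v in values.items() if v > 1)
non_preferred = sorted(c for c, v in values.items() if v < 1)
```

In this toy case the four Ala codons have an expected count of 1 each, so GCT (observed 3) has RSCU 3 and is classed as preferred, while the unused GCA and GCG have RSCU 0 and are classed as non-preferred.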
