The programs tRNAscan-SE [71] and ARAGORN [72], which is a program that detects tRNA and tmRNA genes. For functional annotation, JCVI uses a combination of evidence types, which provides consistent and complete annotation with high confidence for all genomes. The automated annotation pipeline has a functional annotation module (AutoAnnotate), which assigns a function to each protein based on multiple lines of evidence. It uses precedence-based rules that favor highly trusted annotation sources according to their rank. These sources (in rank order) are TIGRFAM HMMs [73] and Pfam HMMs, the best protein BLAST match from the JCVI internal PANDA database, and computationally derived assertions (TMHMM and lipoprotein motifs). Based on this evidence, the automatic pipeline assigns a functional name, a gene symbol, an EC number and Gene Ontology domains [74], which cover cellular component, molecular function and biological process(es). The assigned domains are linked to evidence codes for each protein-coding sequence, with as much specificity as the underlying evidence supports.

The pipeline also predicts metabolic pathways using Genome Properties [75], which are based on assertions and calculations made across genomes for the presence or absence of biochemical pathways. Genome Properties incorporate both calculated and human-curated assertions of biological processes and properties of sequenced genomes. A collection of properties represents metabolic pathways and other biological systems, and these are accurately detected computationally, generally by the presence or absence of TIGRFAM and Pfam HMMs. This is the basis for the automatic assertions made for the presence of a whole pathway or system in any genome.

Finally, a curator checked the annotation for consistency and quality, deleting spurious assertions and inserting any missed ones. This resulted in the manual merging of some genes, primarily the MBA genes, which were problematic for the automated genome annotation pipeline due to the nature of their repeats. JCVI's internal Manual Annotation tool (MANATEE) [76] was used extensively to annotate these genomes. MANATEE is a freely available, open-source, web-based annotation and analysis tool for the display and editing of genomic data. The genome comparisons and annotation transfer were done using the Multi Genome Annotation Tool (MGAT), an internally developed tool integrated within MANATEE that transfers annotations from one gene to other closely related genes. The clusters are generated from reciprocal best BLASTP hits by a Jaccard clustering algorithm with a BLASTP identity ≥ 80%, a P value ≤ 1e-5 and a Jaccard coefficient threshold of 0.6. The clusters are composed of genes both within the genome and across different ureaplasma genomes. The same clusters are used in the genome comparisons generated by SYBIL ( http://sybil.sourceforge.
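The clustering thresholds described above can be illustrated with a minimal Python sketch. It is not the JCVI or SYBIL implementation. It assumes that the Jaccard coefficient of two genes is the size of the intersection of their BLASTP neighbor sets divided by the size of the union, with each gene counted as its own neighbor, and that clusters are the connected components of the resulting links. It also simplifies by accepting any hit that passes the identity and P value filters, without enforcing the reciprocal-best-hit requirement. The function name, the hit tuple layout and the gene identifiers are hypothetical.

```python
from collections import defaultdict


def jaccard_clusters(hits, min_identity=80.0, max_pvalue=1e-5, min_jaccard=0.6):
    """Cluster genes from BLASTP hits given as (query, subject, identity, pvalue).

    Hypothetical helper: thresholds follow the text, everything else is assumed.
    """
    # Build symmetric neighbor sets from hits that pass both filters.
    neighbors = defaultdict(set)
    for query, subject, identity, pvalue in hits:
        if query != subject and identity >= min_identity and pvalue <= max_pvalue:
            neighbors[query].add(subject)
            neighbors[subject].add(query)

    # Assumption: each gene is a member of its own neighbor set.
    for gene in list(neighbors):
        neighbors[gene].add(gene)

    # Union-find structure used to merge linked genes into clusters.
    parent = {gene: gene for gene in neighbors}

    def find(gene):
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    # Link two genes when their neighbor sets overlap strongly enough.
    for a in neighbors:
        for b in neighbors[a]:
            if a < b:
                shared = len(neighbors[a] & neighbors[b])
                total = len(neighbors[a] | neighbors[b])
                if shared / total >= min_jaccard:
                    parent[find(a)] = find(b)

    # Collect the members of each cluster and return them in a stable order.
    clusters = defaultdict(list)
    for gene in neighbors:
        clusters[find(gene)].append(gene)
    return sorted(sorted(members) for members in clusters.values())
```

Under these assumptions, a gene that passes the BLASTP filters against one member of a cluster but shares few neighbors with it stays in a separate cluster, which is the purpose of a Jaccard threshold such as 0.6.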
