Upcoming Pathway Tools Tutorial at SRI, January 16-17, 2020

We will offer a two-day Introduction to Pathway Tools tutorial at SRI on January 16-17, 2020. The early registration deadline is December 12. For more information click here.
New Pathway Tools Publication

The following just-published article describes the many enhancements made to Pathway Tools during the past four years, including multiple extensions to its metabolic network modeling, omics data analysis, and core database management capabilities: "Pathway Tools version 23.0 update: software for pathway/genome informatics and systems biology" in Briefings in Bioinformatics.
The PathoLogic Inference Component of Pathway Tools

PathoLogic is the computational inference module of Pathway Tools. Here we summarize the basic operation of PathoLogic as well as some new pathway inference capabilities coming in our version 23.5 release in the next few weeks.

The current inference capabilities of Pathway Tools are as follows:
Because PathoLogic takes as its input the output of genome annotation pipelines, it must map protein names, EC numbers, and Gene Ontology terms to reaction assignments. EC numbers and Gene Ontology terms are controlled vocabularies, so they are straightforward for PathoLogic to accept and map to reactions. Enzyme names are much harder to handle.

The core of our approach is to use the enzyme name and reaction associations recorded by curators in the MetaCyc database. We supplement those names with additional enzyme synonyms found by a program that iterates across the 14,000 genomes in BioCyc and identifies the enzyme names that most frequently go unrecognized. Those names become the highest priorities for curation. Our curators recently entered 300 new names from this list, so a total of 48,000 enzyme names and synonyms are now available to the "enzyme name matching" component of PathoLogic. These new names have significantly boosted reaction inference.

However, some enzyme names are ambiguous, either because an enzyme with one name catalyzes multiple reactions at multiple active sites, or because different enzymes with the same name catalyze different reactions. In version 23.5, the enzyme name matcher will use gene names (if available in the genome annotation) to resolve these ambiguous enzyme names. Gene names will also be used to infer reaction mappings when the enzyme name is not recognized. Although there are occasional errors, our review of these assignments indicates that they are quite accurate. These two strategies further boost reaction inference.

Version 23.5 will also include several improvements to pathway inference. MetaCyc pathways now make more extensive use of "key reaction" definitions, in which a pathway specifies reactions that must have an enzyme present in the organism for that pathway to be inferred as present.
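To make the enzyme name matching and gene-name strategies described earlier in this section more concrete, here is a minimal Python sketch. It is not PathoLogic's actual implementation: the lookup tables, enzyme names, gene names, reaction IDs, status labels, and function names are all hypothetical placeholders, and the handling of an ambiguous name with no usable gene name (returning no reactions) is an assumption made for illustration.

```python
from collections import Counter

# Hypothetical lookup tables. In PathoLogic these associations come from
# curated MetaCyc data; the names and reaction IDs below are made up.
NAME_TO_REACTIONS = {
    "enzyme alpha": {"RXN-A"},
    "enzyme beta": {"RXN-B1", "RXN-B2"},  # ambiguous: one name, two reactions
}
GENE_TO_REACTIONS = {
    "betA": {"RXN-B1"},
    "gamC": {"RXN-C"},
}


def normalize(name):
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def infer_reactions(enzyme_name, gene_name=None):
    """Return (reactions, status) for one annotated protein.

    status is one of "name", "name+gene", "gene", "ambiguous", "unrecognized".
    """
    by_name = NAME_TO_REACTIONS.get(normalize(enzyme_name), set())
    by_gene = GENE_TO_REACTIONS.get(gene_name, set()) if gene_name else set()

    if len(by_name) == 1:
        return set(by_name), "name"
    if len(by_name) > 1:
        # Ambiguous enzyme name: keep only reactions the gene name supports.
        narrowed = by_name & by_gene
        if narrowed:
            return narrowed, "name+gene"
        return set(), "ambiguous"  # assumption: leave unresolved names unassigned
    if by_gene:
        # Enzyme name not recognized: fall back to the gene name alone.
        return set(by_gene), "gene"
    return set(), "unrecognized"


def most_common_unrecognized(genomes, top_n=10):
    """Tally enzyme names that stay unrecognized across many genomes.

    genomes is an iterable of lists of (enzyme_name, gene_name) pairs.
    """
    counts = Counter()
    for annotations in genomes:
        for enzyme_name, gene_name in annotations:
            _, status = infer_reactions(enzyme_name, gene_name)
            if status == "unrecognized":
                counts[normalize(enzyme_name)] += 1
    return counts.most_common(top_n)
```

In this sketch, `infer_reactions("enzyme beta", "betA")` narrows the two candidate reactions to the one supported by the gene name, while `most_common_unrecognized` produces the kind of ranked list that could serve as a curation queue.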
We also introduce a new pathway field called "key non-reactions", which prevents inference of a given pathway if a specified reaction outside that pathway is catalyzed by the organism. In particular, these new rules enable more accurate inference of the appropriate variants of the TCA cycle and glycolysis pathways. We also added new rules that result in more accurate inference of super pathways (pathways built from several smaller pathways).

Another new feature in 23.5 is a variant of the pathway evidence report available from the web command Analysis → Reports → Pathway Evidence. Previously, this report sorted pathways by pathway ontology. The new variant sorts pathways by pathway score to speed user review of low-scoring pathways.
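The key-reaction and key non-reaction rules, and the score-sorted report, can be illustrated with a short Python sketch. This is an illustration under stated assumptions rather than PathoLogic's algorithm: the `Pathway` class, the function names, and the reaction IDs are hypothetical, the score is a simple stand-in (the fraction of a pathway's reactions that have an enzyme), every listed key reaction is assumed to be required, and the report is assumed to list the lowest scores first.

```python
from dataclasses import dataclass, field


@dataclass
class Pathway:
    """Hypothetical pathway record; field names are illustrative only."""
    name: str
    reactions: set
    key_reactions: set = field(default_factory=set)
    key_non_reactions: set = field(default_factory=set)


def score(pathway, catalyzed):
    """Stand-in score: fraction of the pathway's reactions with an enzyme."""
    if not pathway.reactions:
        return 0.0
    return len(pathway.reactions & catalyzed) / len(pathway.reactions)


def passes_key_rules(pathway, catalyzed):
    """Apply the key-reaction and key non-reaction rules."""
    # Assumption: every key reaction must be catalyzed by the organism.
    if not pathway.key_reactions <= catalyzed:
        return False
    # Any catalyzed key non-reaction blocks inference of the pathway.
    if pathway.key_non_reactions & catalyzed:
        return False
    return True


def evidence_report(pathways, catalyzed):
    """Return (name, score) rows for inferred pathways, lowest score first."""
    rows = [
        (p.name, score(p, catalyzed))
        for p in pathways
        if passes_key_rules(p, catalyzed)
    ]
    return sorted(rows, key=lambda row: row[1])


# Usage with made-up reaction IDs.
catalyzed = {"R1", "R2", "R3", "R9"}
pathways = [
    Pathway("variant A", {"R1", "R2", "R3"}, key_reactions={"R1"}),
    Pathway("variant B", {"R1", "R2", "R4"}, key_reactions={"R4"}),
    Pathway("variant C", {"R1", "R5"}, key_non_reactions={"R9"}),
    Pathway("partial", {"R2", "R6"}),
]
report = evidence_report(pathways, catalyzed)
```

Here "variant B" is rejected because its key reaction has no enzyme, and "variant C" is rejected because the organism catalyzes one of its key non-reactions, which is how such rules could separate closely related pathway variants.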