What is a Subset?
Sometimes you have a list of GO terms but would like to summarize them, for example by grouping them into higher-level categories. We have Subsets for that! Subsets were previously mentioned in the Usage notes: Do_Not_Annotate post.
GO subsets (also known as GO slims) are cut-down versions of the GO containing a subset of the terms. They are specified by tags within the ontology that indicate whether a given term is a member of a particular subset. GO slimming is commonly used to give an overview of a genome or to summarize a set of experimental results. GO hosts a number of predefined slim sets, including a generic GO slim and several slims/subsets tailored to give good coverage for some well studied/annotated model species.
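To make the idea of subset tags concrete, here is a minimal sketch of reading them from OBO-formatted text, where each [Term] stanza can carry one or more `subset:` lines. The term identifiers and names in the sample (`EX:...`) are hypothetical placeholders rather than real GO terms, and the helper `terms_in_subset` is an illustrative name, not part of any GO tool.

```python
# Illustrative OBO-style text. The EX: identifiers and names are made up;
# only the stanza layout and the "subset:" tag follow the OBO format.
OBO_TEXT = """\
[Term]
id: EX:0000001
name: example process A
subset: goslim_generic

[Term]
id: EX:0000002
name: example process B
subset: goslim_generic
subset: goslim_yeast

[Term]
id: EX:0000003
name: example process C
"""


def terms_in_subset(obo_text, subset):
    """Return the ids of [Term] stanzas tagged with the given subset."""
    members = []
    # Stanzas are separated by blank lines.
    for stanza in obo_text.split("\n\n"):
        lines = [line.strip() for line in stanza.strip().splitlines()]
        if not lines or lines[0] != "[Term]":
            continue
        term_id = None
        subsets = set()
        for line in lines[1:]:
            tag, _, value = line.partition(": ")
            if tag == "id":
                term_id = value
            elif tag == "subset":
                # Drop any trailing "! comment" that OBO lines may carry.
                subsets.add(value.split("!")[0].strip())
        if term_id and subset in subsets:
            members.append(term_id)
    return members


if __name__ == "__main__":
    print(terms_in_subset(OBO_TEXT, "goslim_generic"))
    print(terms_in_subset(OBO_TEXT, "goslim_yeast"))
```

A term can belong to several subsets at once, which is why the sketch collects all `subset:` values per stanza before testing membership.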
How do I map a set of annotations to high level GO terms (GO slim)?
- One method is to use GO Term Mapper. Choose the aspect (Molecular Function, Biological Process, or Cellular Component) and indicate if you want to map to a generic subset or one curated for your organism (for example, the S. cerevisiae slim omits terms applicable only to plants or bacteria).
- Another method is to use the Map2Slim option in OWLTools. Given a GO slim file and a current ontology (in one or more files), Map2Slim maps a gene association file (containing annotations to the full GO) to the terms in the GO slim. It can either create a new gene association file containing the most pertinent GO slim accessions, or run in count mode, in which case it gives distinct gene product counts for each slim term.
Background information and details on how to download, install, and implement OWLTools, as well as instructions on how to run the Map2Slim script, are available from the OWLTools Wiki.
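The general idea behind mapping annotations to a slim can be sketched in a few lines of Python. This is a simplified illustration and not the Map2Slim implementation: it uses a hypothetical toy ontology (`PARENTS`), a hypothetical slim (`SLIM`), and made-up gene names, and it assumes that "most pertinent" can be approximated by walking up from each annotated term and stopping at the first slim term reached on each path.

```python
from collections import defaultdict

# Hypothetical toy ontology: each term maps to its direct parents.
PARENTS = {
    "T:root": [],
    "T:metabolism": ["T:root"],
    "T:transport": ["T:root"],
    "T:lipid_metabolism": ["T:metabolism"],
    "T:lipid_transport": ["T:transport"],
}

# Hypothetical slim: the high-level terms we want to report.
SLIM = {"T:metabolism", "T:transport"}

# Hypothetical annotations as (gene product, term) pairs.
ANNOTATIONS = [
    ("geneA", "T:lipid_metabolism"),
    ("geneA", "T:metabolism"),
    ("geneB", "T:lipid_transport"),
    ("geneC", "T:root"),
]


def nearest_slim_terms(term, parents, slim):
    """Walk up from term, stopping at the first slim term on each path."""
    found = set()
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            found.add(current)
            continue  # do not climb past a slim term
        stack.extend(parents.get(current, []))
    return found


def map_to_slim(annotations, parents, slim):
    """Rewrite annotations so every term is a slim term (duplicates merged)."""
    mapped = set()
    for gene, term in annotations:
        for slim_term in nearest_slim_terms(term, parents, slim):
            mapped.add((gene, slim_term))
    return sorted(mapped)


def count_gene_products(mapped):
    """Count mode: number of distinct gene products per slim term."""
    genes = defaultdict(set)
    for gene, slim_term in mapped:
        genes[slim_term].add(gene)
    return {slim_term: len(members) for slim_term, members in genes.items()}


if __name__ == "__main__":
    mapped = map_to_slim(ANNOTATIONS, PARENTS, SLIM)
    print(mapped)
    print(count_gene_products(mapped))
```

In this sketch, an annotation whose term has no slim term at or above it (here `geneC` at the root) is simply dropped, and two annotations of the same gene product that land on the same slim term are counted once. A real tool may handle both cases differently.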
Read more or download the GO slims here.