During the IWM there were some requests to unify the existing ortholog data sets in WormBase and include additional ones.
If you have any ortholog data sets that are not listed below and want to have them included, please send them to us:
TreeFAM
Inparanoid (2007 Sonnhammer dataset)
KOGs
EnsEMBL Compara
OrthoMCL (Caltech)
ortholog dataset from the C. briggsae paper (Hillier et al.)
Ideas and proposals on how to improve the integration of the data are welcome.
The current strategy is to merge the datasets based on EnsEMBL IDs (linked by OMIM / UniProt xrefs).
I don’t have a unified scoring system for confidence, so for now the number of datasets that predict the same ortholog pair might serve as a rough measure.
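As a rough illustration of the merge-and-count idea, here is a minimal Python sketch. It assumes each dataset has already been reduced to pairs of gene identifiers mapped onto a common ID space; the function name `merge_predictions` and all identifiers in the example are made up for illustration and are not part of the WormBase build.

```python
from collections import defaultdict

def merge_predictions(datasets):
    """Map each (elegans_id, briggsae_id) pair to the set of datasets predicting it."""
    support = defaultdict(set)
    for dataset_name, pairs in datasets.items():
        for elegans_id, briggsae_id in pairs:
            support[(elegans_id, briggsae_id)].add(dataset_name)
    return dict(support)

# Made-up identifiers and predictions, for illustration only.
datasets = {
    "TreeFAM": [("geneA", "cbgA"), ("geneB", "cbgB")],
    "Inparanoid": [("geneA", "cbgA")],
    "OMA": [("geneA", "cbgA"), ("geneB", "cbgB")],
}

support = merge_predictions(datasets)
for pair, sources in sorted(support.items()):
    # The number of supporting datasets stands in for a confidence score.
    print(pair, len(sources), sorted(sources))
```

Using a set per pair means a dataset that lists the same pair twice is still counted only once.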
The C. elegans - C. briggsae orthologs are now unified, and each source dataset is represented by its own Analysis object connected to the orthologs it predicts.
This information is not shown on the website’s default gene page, but you can see it in the Treeview.
As a first use case, I took the orthologs on which at least 3 predictions agree (3+ Analysis objects connected) and used them to give the orthologous C. briggsae genes a gene name based on the C. elegans name.
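The consensus rule can be sketched in Python as follows. This is only an illustration under stated assumptions: `support` maps each ortholog pair to the set of datasets predicting it, names are transferred with a `Cbr-` prefix, and ambiguous one-to-many cases are skipped. The one-to-one restriction, the function name `transfer_names` and all identifiers are hypothetical and may differ from what the actual pipeline does.

```python
from collections import Counter

def transfer_names(support, elegans_names, min_support=3):
    """Return {briggsae_id: proposed_name} for well-supported one-to-one pairs."""
    # Keep only pairs predicted by at least min_support datasets.
    confident = [pair for pair, sources in support.items()
                 if len(sources) >= min_support]
    # Count partners per gene so ambiguous (one-to-many) cases can be skipped.
    elegans_partners = Counter(e for e, _ in confident)
    briggsae_partners = Counter(b for _, b in confident)
    proposed = {}
    for elegans_id, briggsae_id in confident:
        if elegans_partners[elegans_id] != 1 or briggsae_partners[briggsae_id] != 1:
            continue  # assumption: do not name genes with several candidates
        name = elegans_names.get(elegans_id)
        if name:
            proposed[briggsae_id] = "Cbr-" + name  # assumed naming prefix
    return proposed

# Made-up identifiers and names, for illustration only.
support = {
    ("geneA", "cbgA"): {"TreeFAM", "Inparanoid", "OMA"},
    ("geneB", "cbgB"): {"TreeFAM", "OMA"},
    ("geneC", "cbgC"): {"TreeFAM", "Inparanoid", "OMA", "KOGs"},
    ("geneC", "cbgD"): {"TreeFAM", "Inparanoid", "OMA"},
}
elegans_names = {"geneA": "xyz-1", "geneB": "xyz-2", "geneC": "xyz-3"}

proposed = transfer_names(support, elegans_names)
print(proposed)
```

Here only `cbgA` receives a name: `geneB` has too few supporting predictions, and `geneC` has two equally confident partners.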
I am now trying to assign more gene names by requiring fewer supporting predictions and adding mutation rate and/or dN/dS as further evidence. I will write in more detail when I have a working method.
We added another ortholog set, OMA, from ETH Zurich (http://www.cbrg.ethz.ch/research/orthologous); it is included in WS184.
The OMA group will also provide us with an updated version using the last stable C. briggsae data from WS180.