I am trying to get sequences which are specific to C. elegans, or at least those with homologs or orthologs only in other nematodes. Is there a quicker way to get this done? I hope to hear from somebody who can help, perhaps a moderator?
If you want all C. elegans genes without non-WormBase orthologs, you can use the Ortholog_other tag, which contains ortholog predictions to external genes (mostly EnsEMBL species).
As an example, open the WQL query page at http://www.wormbase.org/db/searches/wb_query
and then query for:
find Gene WHERE Corresponding_CDS AND Species="Caenorhabditis elegans" AND !Ortholog_Other
=> 13897 genes in WS186
The Corresponding_CDS clause is there because most orthology predictions are protein-based, so most non-coding or uncloned genes have no orthology predictions anyway.
One more disclaimer: the Ortholog_other tag for non-WormBase species is not as comprehensive as the Ortholog tag used for nematode species, as I focused on nematode genes for orthology data mining.
If you want to query for C. elegans-specific genes in that set, just add !Ortholog to the query:
find Gene WHERE Corresponding_CDS AND Species="Caenorhabditis elegans" AND !Ortholog_Other AND !Ortholog
=> 2644 genes in WS186
However, because the other Caenorhabditis genomes are at varying stages of annotation and assembly, there will be quite a few false positives.
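If you end up running several variants of these queries, it may be easier to generate the query strings than to retype them. Below is a minimal Python sketch that only assembles the WQL text shown above; it does not contact WormBase. The function name build_gene_query and its parameters are made up for illustration and are not part of any WormBase tool.

```python
def build_gene_query(species, required_tags=(), excluded_tags=()):
    """Assemble a WQL 'find Gene' query string from tag filters.

    Hypothetical helper: it only builds text to paste into the
    WQL query page and does not talk to WormBase.
    """
    clauses = list(required_tags)                        # tags that must be present
    clauses.append(f'Species="{species}"')               # straight quotes, not curly
    clauses.extend(f"!{tag}" for tag in excluded_tags)   # tags that must be absent
    return "find Gene WHERE " + " AND ".join(clauses)


SPECIES = "Caenorhabditis elegans"

# Coding genes without predicted orthologs outside WormBase species
no_external = build_gene_query(
    SPECIES, ["Corresponding_CDS"], ["Ortholog_Other"]
)

# Additionally exclude genes with nematode orthologs (C. elegans-specific candidates)
elegans_only = build_gene_query(
    SPECIES, ["Corresponding_CDS"], ["Ortholog_Other", "Ortholog"]
)

print(no_external)
print(elegans_only)
```

The two printed strings should match the two queries above and can be pasted into the WQL page as they are.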
Thanks very much for your help, as usual. I will try that.
Hi Jfosu and the good brains in here. I'm also interested in nematode-specific genes, preferably a locus conserved within nematodes only. Jfosu, do you have any suitable candidates? Could we perhaps collaborate on this with whoever is interested?