True human orthologs?

neokao · December 18, 2008, 6:09pm

Can we trust the best BLASTP matches in WormBase?
How was that done?

For example:

This is one of the genes I am interested in.

NCBI BlastP -nr

The first H. sapiens hit:
Score = 145 bits (366), Expect = 3e-33, Method: Compositional matrix adjust.
Identities = 89/258 (34%), Positives = 143/258 (55%), Gaps = 7/258 (2%)

Wormbase:
H. sapiens best blastp
BLAST E-value : 2.3e-38 % Length: 92.7%

The results are pretty different.

What is the best way to find the true human orthologs, so that the function of the C. elegans genes (say 200 genes) can be verified in human cells?

Thanks a lot.

wrmbase_ant · December 19, 2008, 9:27am

Hi,
As far as I am aware there is nothing wrong with the blastp results, but if you let us know which gene you're looking at, I can check it and make sure.
One difference is that we use WU-BLAST rather than NCBI BLAST. Whether that would cause the difference you're seeing, I don't know.

As for determining the human orthologs of these genes, there is a section in the WormBase wiki that discusses this: http://www.wormbase.org/wiki/index.php/FAQs#About_Orthologs_and_Homologs

Ant

mh6 · December 19, 2008, 3:20pm

Just to add my 2 pence:

The reason why the p-value from NCBI BLAST is different from the e-value from the WormBase blastp is:
a.) the search space influences the scoring, and the NCBI nr database is a lot bigger than the human IPI blast database we use
b.) one is an e-value and one is a p-value, which makes a bit of difference when the value is quite bad (there were actually a couple of papers and books written about it)
c.) the blast results represent homology and not necessarily orthology
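
To illustrate points a.) and b.), here is a rough sketch in Python. It assumes the standard Karlin-Altschul relations E = m * n * 2^(-S') (S' is the bit score, m the query length, n the database length) and P = 1 - exp(-E). The function names and the lengths used are made up for illustration, and real BLAST programs use effective (edge-corrected) lengths, so this will not reproduce the numbers in the original post.

```python
import math

def evalue_from_bitscore(bit_score, query_len, db_len):
    """Expected number of chance hits scoring at least bit_score:
    E = m * n * 2^(-S'). Lengths are raw lengths (an assumption;
    BLAST itself uses effective lengths)."""
    return query_len * db_len * 2.0 ** (-bit_score)

def pvalue_from_evalue(e):
    """Probability of at least one chance hit: P = 1 - exp(-E).
    expm1 keeps precision when E is tiny."""
    return -math.expm1(-e)

# Hypothetical sizes: same bit score, two databases differing 100-fold.
small_db = evalue_from_bitscore(145, query_len=300, db_len=10**8)
large_db = evalue_from_bitscore(145, query_len=300, db_len=10**10)
print(small_db, large_db)  # the larger search space gives the larger E-value

# E and P are nearly identical for good hits and diverge for bad ones.
for e in (1e-30, 0.1, 1.0, 5.0):
    print(e, pvalue_from_evalue(e))
```

The E-value scales linearly with the size of the search space, so the same alignment looks less significant against a bigger database. The P-value can never exceed 1, while the E-value can, which is why the two only differ noticeably for weak hits.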

Then about the orthologs:
Blast searches give unidirectional homologies, meaning the reverse search does not need to come up with the same result.
A lot of groups have developed algorithms that still use blast(-like) methods for seeding orthology clusters, but they post-process the results heavily (look at Inparanoid, OrthoMCL, TreeFam, Compara, OMA,…).
We cooperate with most orthology groups, so you can usually find our frozen releases in their datasets. In addition, we reimport some of their results back into WormBase (try the "other orthologs" link on the gene pages).
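
A common way to deal with the unidirectional nature of blast hits is a reciprocal best hit filter: keep a worm/human pair only if each protein is the other's top hit. The sketch below is a minimal Python version of that idea, not the method used by any of the tools named above (they do considerably more post-processing). The input format, a list of (query, subject, bit_score) tuples, and all identifiers are hypothetical.

```python
def best_hits(hits):
    """Map each query to its top-scoring subject.
    hits: iterable of (query, subject, bit_score) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(worm_vs_human, human_vs_worm):
    """Return sorted (worm, human) pairs that are each other's best hit."""
    forward = best_hits(worm_vs_human)
    reverse = best_hits(human_vs_worm)
    return sorted((w, h) for w, h in forward.items() if reverse.get(h) == w)

# Made-up example data, not real gene identifiers or scores.
worm_vs_human = [
    ("wormA", "humX", 145.0),
    ("wormA", "humY", 90.0),
    ("wormB", "humX", 120.0),
]
human_vs_worm = [
    ("humX", "wormA", 150.0),
    ("humX", "wormB", 118.0),
    ("humY", "wormA", 88.0),
]
print(reciprocal_best_hits(worm_vs_human, human_vs_worm))
```

Here wormB has humX as its best human hit, but humX points back to wormA, so only the wormA/humX pair survives. A reciprocal best hit is still only a candidate ortholog; gene duplications and losses are the reason the dedicated orthology resources go further.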

Just dig through some orthology papers, they tend to explain it quite nicely.