I’ve been asked to say why I think that this gene model looks wrong.
Firstly, the mRNA AF164113 has been submitted to two journals, but was never accepted. This is grounds to be very suspicious of the mRNA sequence, to the extent that I do not trust it at all, and it is what caused me to re-evaluate the gene model.
The BLAT_EST_BEST and BLAT_Caen_EST_BEST alignment data are sparse and so not at all conclusive, but there is no alignment to join the first three exons to the rest of the gene model.
The BLAT_NEMATODE data is similarly inconclusive, but there is again no overlap in alignments between the first three exons and the rest of the gene model.
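To make the check in the two paragraphs above concrete, here is a minimal Python sketch of what "no alignment joins the first three exons to the rest" means in practice. Everything in it is an assumption for illustration: the coordinates and alignment names are made up (they are not the real F58E6.1b exon positions or real EST accessions), and the function names are hypothetical, not part of any annotation tool.

```python
from typing import Dict, List, Tuple

# (start, end), inclusive, all in the same genomic coordinate system
Interval = Tuple[int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two inclusive intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def bridging_alignments(
    alignments: Dict[str, List[Interval]],
    five_prime_exons: List[Interval],
    remaining_exons: List[Interval],
) -> List[str]:
    """Return names of alignments with blocks in BOTH exon groups.

    An empty result means no single alignment supports joining
    the two groups into one transcript.
    """
    bridging = []
    for name, blocks in alignments.items():
        hits_5p = any(overlaps(b, e) for b in blocks for e in five_prime_exons)
        hits_rest = any(overlaps(b, e) for b in blocks for e in remaining_exons)
        if hits_5p and hits_rest:
            bridging.append(name)
    return bridging


# Hypothetical exon coordinates: first three exons vs. the rest
first_three = [(100, 200), (300, 400), (500, 650)]
rest = [(900, 1000), (1100, 1300)]

# Hypothetical alignments, each a list of aligned blocks
example = {
    "est_A": [(120, 200), (300, 380)],    # confined to the first three exons
    "est_B": [(950, 1000), (1100, 1200)], # confined to the rest
}

print(bridging_alignments(example, first_three, rest))
```

With sparse data like this, an empty list does not prove the two parts are separate genes; it only shows that nothing in the alignments supports joining them.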
The C. briggsae protein homology shows an alignment of the initial three exons to BP:CBP19572 (CDS CBG19351), which covers the full length of that protein (198 aa). The rest of F58E6.1b aligns to the briggsae protein BP:CBP04512 (CDS CBG19363), again matching over the full extent of BP:CBP04512. Both briggsae CDSs are on contig cb25.fpc4126, but CBG19363 is 40 kb upstream and on the opposite strand: they are very definitely different genes in briggsae (assuming the briggsae assembly is correct).
In C. remanei, the two matching proteins, BP:CBP04512 (at the 5’ end) and RP:RP20407 (over the rest of the gene), are different genes (CDS cr01.sctg34.wum.21.1 and CDS cr01.sctg34.wum.22.1, both on Crem_Contig34), but here the two genes are next to each other (with what looks like a small pseudogene between them). It is conceivable that these two remanei genes could be merged, but I think they are distinct. What do you think?
Excluding the mRNA data, all of the alignment data indicate that the third exon should be extended in line with the genefinder prediction F58E6.gc2, and that the first three exons should be split off to form a separate gene.
A BLAST search of the protein product of F58E6.1b against the NCBI nr protein database shows (apart from self-matches and a match to the mRNA) a marked dichotomy: the start of the protein matches a set of proteins with RING-finger domains, while the region towards the end matches a set of signal transducer proteins.