I’ve just added an isoform to C42C1.10 by inserting a new exon just 5’ of the old gene structure and splicing into the start of C42C1.10. This isoform then ends after translating a short part of the old C42C1.10 first exon in a different frame.
The structure of this is rather speculative and I would appreciate any lab work that could confirm it.
Tiling array expression data and WABA coding potential both support this new first exon. The Hillier high-throughput transcriptome data show expression in this region and also indicate a transcription start site at the beginning of the new first exon.
There is no evidence of splicing from this exon to the start of the old C42C1.10 structure, so it is possible that this should not be an isoform but a separate gene in this operon, albeit one with unusually close spacing between it and C42C1.10.
If it is a separate gene, then the coding region would cover the whole of this ORF. However, the WABA coding potential indicates that only the part of the ORF curated in this first exon is coding. Also, a separate gene would have a coding region that overlaps C42C1.10 in a different frame, which is very unusual.
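To make the frame issue concrete, here is a minimal sketch of why entering an existing exon in a different frame tends to end translation quickly. The sequence `TOY_EXON`, the 40-base upstream length, and both function names are made up for illustration; they are not the real C42C1.10 sequence or coordinates. The codon that straddles the splice junction is ignored for simplicity.

```python
# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)

# Hypothetical stand-in for the start of an old first exon (not real data).
TOY_EXON = "ATGGCTGAACGTTTAGGA"


def downstream_offset(upstream_coding_len: int) -> int:
    """Index in the downstream exon where the first complete codon starts,
    given the number of coding bases contributed by the upstream exon."""
    leftover = upstream_coding_len % 3  # bases of an unfinished codon
    return (3 - leftover) % 3           # bases borrowed from downstream


def translate_until_stop(seq: str, offset: int = 0) -> str:
    """Translate seq from offset, stopping at the first stop codon."""
    peptide = []
    for i in range(offset, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


if __name__ == "__main__":
    # Frame used by the original gene model in this toy example.
    print(translate_until_stop(TOY_EXON, 0))
    # Frame entered after a hypothetical 40-base upstream coding exon.
    print(translate_until_stop(TOY_EXON, downstream_offset(40)))
```

In this toy case the original frame reads through the whole fragment, while the shifted frame reaches a stop codon almost immediately, which is the same kind of short out-of-frame tail described for the new isoform.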
This new first exon appears to be composed of a copy of the 5’ end of F47B8.10, so this may be an expressed pseudogenic fragment, but there is a mass-spec peptide at this position indicating that this is translated.
I am not happy with the current structure, because the second exon of this isoform is in a different frame from the one used by the old C42C1.10 first exon and because the new isoform appears to be a duplicate of the 5’ end of another gene. However, the evidence for expression in this region is compelling and there is plausible evidence for translation.
This new isoform, C42C1.10b, will be available in release WS201.