Anti-coding sense Transcript

The 454 transcriptome data has been useful recently in fixing various aspects of our gene models, but it sometimes throws up some surprising things.

Today’s surprise is a couple of spliced 454 transcripts (MM454_FPK17YK01B08N2, MM454_FPK17YK01BSONU) whose splice sites show that they align on the opposite strand to the CDS ZK822.1.

So there are two 454 transcripts and both align across regions that are on the other sense to exons of a well-characterised gene.
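As a rough illustration of how splice sites give away the strand of a spliced alignment, here is a minimal sketch. It assumes canonical GT...AG introns only, and the function names and coordinate convention (0-based, end-exclusive, on the forward strand) are made up for this example rather than taken from any pipeline we use. An intron of a reverse-strand transcript reads as CT...AC when viewed on the forward strand.

```python
def revcomp(seq):
    """Reverse-complement a DNA sequence."""
    comp = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(comp)[::-1]


def intron_strand(genome_plus, start, end):
    """Infer the transcript strand from the intron boundary dinucleotides.

    genome_plus: forward-strand genomic sequence (hypothetical input).
    start, end:  0-based, end-exclusive intron coordinates on that sequence.
    Returns '+' for GT...AG, '-' for CT...AC, or None if neither matches
    (non-canonical introns are not handled in this sketch).
    """
    intron = genome_plus[start:end].upper()
    if len(intron) < 4:
        return None
    donor, acceptor = intron[:2], intron[-2:]
    if donor == "GT" and acceptor == "AG":
        return "+"
    if donor == "CT" and acceptor == "AC":
        return "-"
    return None


# Toy sequence: 4 bp exon, 12 bp intron, 4 bp exon.
toy = "AAAA" + "GTAAGTTTTCAG" + "CCCC"
print(intron_strand(toy, 4, 16))           # forward-strand intron
print(intron_strand(revcomp(toy), 4, 16))  # same intron seen from the other strand
```

If the strand inferred this way disagrees with the strand of the annotated CDS under the alignment, the read is a candidate reverse-sense transcript.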

The second region of alignment, after the “intron”, starts with an SL1 site which is defined by a TEC-RED tag.

Looking at the Illumina RNASeq data from Hillier et al. on the UCSC browser, we see a very faint hint of possible expression in the aligned region that overlaps the last exon of ZK822.1. This is not a very convincing bit of expression. However, around the SL1 site there is a strong region of expression, starting with the characteristic asymmetric RNASeq signal for a transcription start site, in the exon and part of the intron corresponding to the region where the 454 reads align.

The RNASeq data shows the gene ZK822.1 is expressed strongly in the Early Embryo life-stage, but the anti-sense region starting at the SL1 site is expressed well in all the other life stages. This looks like a ‘Natural anti-sense’ mechanism of control of gene expression, see: http://en.wikipedia.org/wiki/Cis-natural_antisense_transcript

No very long convincing coding sequence can be formed from these 454 sequences.
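For anyone wanting to repeat the coding-potential check on their own sequences, a minimal sketch of a six-frame ORF scan follows. It is an assumption about one reasonable way to do it, not the method used here: it only counts ATG-to-stop ORFs with the standard stop codons, and what counts as "convincing" is left to the reader.

```python
STOPS = {"TAA", "TAG", "TGA"}


def revcomp(seq):
    """Reverse-complement a DNA sequence."""
    comp = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(comp)[::-1]


def longest_orf(seq):
    """Return (codons, strand, frame) for the longest ATG...stop ORF.

    codons excludes the stop codon. The strand is '+' or '-' relative to
    the input, and the frame is 0, 1 or 2 on that strand.
    Returns (0, None, None) if no complete ORF is found.
    """
    seq = seq.upper()
    best = (0, None, None)
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # open an ORF at the first ATG
                elif start is not None and codon in STOPS:
                    n = (i - start) // 3
                    if n > best[0]:
                        best = (n, strand, frame)
                    start = None  # close it and look for the next ATG
    return best


# Toy input only; substitute the consensus of the aligned 454 reads.
print(longest_orf("CCATGAAATTTGGGTAACC"))
```

A short longest ORF on both strands is consistent with a non-coding transcript, though it does not prove it.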

So we have 3 sources of evidence for this transcript: 454 reads, TEC-RED SL1 site, Illumina RNASeq reads.
Any ideas what it is?

[Edit]

I’ve just found another example.
The CDS Y57G11C.1 has 27 clustered 454 reads (MM454_contig03575) aligned to its first exon and part of the first intron.
The RNASeq shows a highly expressed region in the same place (if you also look at the RNASeq reads that align to other locations as well, then the region is extended at the start). This is only expressed in the ‘MALE6’ library.
No convincing coding sequence can be formed from these 454 sequences.

[Edit]

And another two:
F32F2.3 and F32F2.2 opposite the CDS F32F2.1a/b/c
F32F2.3 has an SL1 on its second exon.

This is another example: C44C10.13,
which is opposite the third to fifth exons of CDS C44C10.4 [Edit by gw3]

[Edit]

An example where the coding sequence is not overlapped: W10G6.5
The gene on the opposite strand (F52E10.5) is within the intron of W10G6.5

… and another one

Y48B6A.17 opposite the CDS Y48B6A.10
It is possible to make a CDS in the first half of Y48B6A.17, but having two coding exons on opposite strands sounds very unlikely.

Gary

This one is on the reverse sense of a Pseudogene and has supporting evidence from a regular EST, rather than a 454 read:
CDS Y105C5A.1272 on the reverse sense to Pseudogene Y105C5A.14

Gary

And another one…

This one (Y57G11C.1135) is on the opposite strand to the fifth exon of Y57G11C.5 and the RNASeq short read data shows that it is strongly expressed.
It appears to regulate the expression of Y57G11C.5 - when one is strongly expressed the other is not.
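One way to put a number on "when one is strongly expressed the other is not" would be to correlate the two transcripts' read coverage across the RNASeq libraries. The sketch below is only that: the function is a plain Pearson correlation, and the coverage values in the usage lines are invented placeholders, not measurements from the Hillier et al. data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Placeholder per-library coverage values (hypothetical, for illustration only).
sense_cov = [120.0, 15.0, 10.0, 8.0]
antisense_cov = [5.0, 90.0, 110.0, 95.0]
print(pearson(sense_cov, antisense_cov))
```

A strongly negative value across libraries would support the reciprocal pattern, but with only a handful of life-stage libraries it should be treated as a hint rather than a test.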

Evidence for this transcript comes mainly from the RNASeq data which also defines the SL1 and SL2 features at the 5’ end of this reverse-sense transcript: WBsf128390, WBsf129195, WBsf128391

This reverse-sense transcript is not spliced.

Y73F8A.20 is a fairly close paralog of Y57G11C.5, but it shows no evidence of a similar reverse-sense transcript being expressed.

The C. briggsae ortholog CBG13833 shows no evidence for or against having a similar reverse-sense transcript.

Gary Williams

… and of course there are some regions currently marked up as pseudogenes that could also be regarded as reverse-sense transcripts, like the pseudogene R10E8.7, which is on the opposite sense to the CDS R10E8.3.

Gary

…and another one: R102.15 is an ncRNA supported by OST transcript evidence and is on the opposite strand to the CDS R102.8.

Gary

… and another two:

F17C11.20 opposite the CDS F17C11.5a
F20G2.9 opposite the CDS F20G2.1

Gary

…and another: T13F3.10 opposite T13F3.7

… and another one: C14C10.9
opposite C14C10.5