Classic RNASeq data issue for gene prediction

R09B3.3 and R09B3.2 represent a tandem gene duplication event.

The RNASeq data on the whole looks pretty sensible over this region. However, a number of “genelets” have been built from the RNASeq data in which exon 1 of R09B3.3 is spliced to exon 2 of R09B3.2.

This looks to me like an alignment artefact: localised mis-alignment of a read spanning the exon 1 to exon 2 junction causes the genelet to be built from components of both copies.
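
As a rough way to screen for this kind of artefact, the sketch below flags splice junctions whose two ends land in exons of different genes. It is only an illustration: the `Exon` and `Junction` types, the function names, and all coordinates are made up for this example and are not the real R09B3 positions or part of any WormBase pipeline.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    gene: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


class Junction(NamedTuple):
    donor: int     # last exonic base before the intron
    acceptor: int  # first exonic base after the intron


def genes_at(position, exons):
    """Return the set of gene names with an exon covering the position."""
    return {e.gene for e in exons if e.start <= position <= e.end}


def cross_gene_junctions(junctions, exons):
    """Return junctions whose two ends fall in exons of different genes."""
    flagged = []
    for j in junctions:
        donor_genes = genes_at(j.donor, exons)
        acceptor_genes = genes_at(j.acceptor, exons)
        # Flag only when both ends are annotated and no gene is shared.
        if donor_genes and acceptor_genes and not (donor_genes & acceptor_genes):
            flagged.append(j)
    return flagged


# Hypothetical coordinates, for illustration only.
exons = [
    Exon("R09B3.3", 100, 200),
    Exon("R09B3.3", 300, 400),
    Exon("R09B3.2", 1100, 1200),
    Exon("R09B3.2", 1300, 1400),
]
junctions = [
    Junction(200, 300),    # within R09B3.3
    Junction(1200, 1300),  # within R09B3.2
    Junction(200, 1300),   # exon 1 of R09B3.3 to exon 2 of R09B3.2
]

for j in cross_gene_junctions(junctions, exons):
    print(j.donor, j.acceptor)
```

A flagged junction is not proof of mis-alignment, just a candidate to check by eye against the read alignments for the two copies.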

Thought this was worth documenting.

http://wormbase.sanger.ac.uk/wiki_pics/RNASeq_RB216.2_R09B3.png

P