Is there an accepted nomenclature for distinguishing ‘transcriptional’ and ‘translational’ gene expression reporter transgenes? All the nomenclature page on WormBase says is ‘No specific recommendations have been made for distinguishing between transcriptional and translational fusions.’ I’ve got strains with both types and would like to distinguish clearly between them in the text of a manuscript I’m writing. So, for example, I’ve thought of using Pabc-1[gfp] to describe plasmid/PCR-based systems for driving a simple, non-fused reporter from a ‘promoter’ sequence, rather than the more conventional Pabc-1::gfp. In my book, the double colon in the latter implies a gene product that fuses two peptides from different coding sequences, and that is not the case. Also, the P is often dropped, e.g., abc-1::gfp, which is even more confusing. For translational reporters I would then like to use the form Pabc-1[abc::gfp], Pabc-1[NLS::gfp], Pxyz-1[abc::gfp], etc.
Am I breaking too many rules or does this sound sensible?
I was under the impression that for a transcriptional reporter it would be Pabc-1::GFP, and for a translational fusion Pabc-1::abc-1::GFP. According to WormBase the double colon indicates a gene fusion, rather than a peptide fusion.
I don’t see any reason not to use Pabc-1::GFP or abc-1p::GFP. I know FlyBase does it differently, though:
Fusion genes are defined (by FlyBase) as the fusion of protein coding regions of distinct genes constructed by in vitro mutagenesis. They are named using the gene symbols of their component parts, separated by a double colon, e.g., Antp::Scr or Act88F::Scer\act1 .
I think you should stick with the most commonly used nomenclature.
This question has come up before on the forums, with a pretty good discussion of the issues. As I more or less said then, the main thing is to be extremely clear in the descriptions of any transgenes you publish or distribute, and to hope others are too, because a whole range of improvised nomenclatures is in use.
It might seem desirable to create a correct nomenclature going forward, especially for when multiple available reporters are being compared and when a single curator has compiled all those records - I’m thinking, obviously, of WormBase. On the other hand, this could mean a significant workload in trying to clean up all the old records, an effort I really don’t foresee anyone undertaking, and thus a mixed situation lacking backwards compatibility - leaving the interested party once more hoping the construct has been well described.
Still, the problem of pre-existing, non-systematic records will only get worse as time goes on, so I’d certainly welcome a good nomenclature. That previous discussion I linked was almost six years ago, and even then there were calls for rules to be written. So far as I know, they haven’t been.
If anyone is taking votes, I’d argue against the use of subscripts (or superscripts, not that they’ve been mentioned), as they can be a pain to do, especially in formats like email.
A standard nomenclature is challenging because many cases don’t fall cleanly into one category. There are numerous examples where the first intron contains regulatory elements and is included in the reporter construct to retain the proper expression pattern. Would that count as a translational reporter? If not, how much of the gene needs to be included to qualify?
And what about a full-length gene fusion that expresses properly but fails to rescue? Molecularly, it’s a translational fusion, but it’s only useful as a transcriptional reporter.
Don’t get me wrong, it would be very useful to have a standardized system to capture this information. But at some point the system becomes unwieldy. Personally, I’d find it more useful if the transgene sequences were readily available.
I think Ben’s suggestion of indicating the promoter and gene body separately makes sense (it would also accommodate cases where a translational fusion is driven from a heterologous promoter), and I completely agree with Hillel’s opposition to super/subscripts (difficult to parse).
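As a rough illustration of what keeping the promoter and the gene body separate could look like in practice, here is a minimal Python sketch. It renders names in the bracket form floated at the top of this thread; the class name `Reporter`, its fields, and the example gene names `abc-1` and `xyz-1` are all hypothetical and not part of any agreed standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reporter:
    """Hypothetical record that stores promoter and gene body separately."""
    promoter: str                    # gene whose promoter drives expression, e.g. "abc-1"
    reporter: str = "gfp"            # reporter coding sequence
    gene_body: Optional[str] = None  # fused coding sequence; None means no fusion

    @property
    def kind(self) -> str:
        # A fused gene body makes this a translational reporter.
        return "translational" if self.gene_body else "transcriptional"

    def bracket_name(self) -> str:
        # Bracket form from the original question: P<promoter>[<contents>]
        inner = f"{self.gene_body}::{self.reporter}" if self.gene_body else self.reporter
        return f"P{self.promoter}[{inner}]"


# A heterologous promoter (xyz-1) driving an abc-1 fusion needs no special case.
print(Reporter("abc-1").bracket_name())
print(Reporter("xyz-1", gene_body="abc-1").bracket_name())
```

Because the two parts are stored as separate fields, the reporter type falls out of the record rather than having to be guessed from the name.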
At one of the 2012 topic meetings, each session started with a slide outlining preferred nomenclature for strains and transgenes.
As this was different from what we thought was the standard for transcriptional reporters (“pgene_name::GFP”, with only “gene_name” italicised), I asked Jonathan Hodgkin, who was then WormBase Gene Name and Genetic Map Curator.
He replied:
"The format that you have used is generally deprecated, even though it can be found in many publications, and it has been specifically criticised by various researchers and curators (particularly now that we have four-letter gene names, which means that the initial p tends to get seen as part of the gene name). Hence, the general recommendation is for promoter fusions to be written as gene-1p::gfp, all italicized. This is what many labs have routinely used in the past. I did specifically ask Marty Chalfie whether he preferred gfp or GFP in this context, and he said gfp, so we are going with that version."
“Gene fusions incorporated in transgenes that consist of a C. elegans gene or part thereof fused to a reporter such as lacZ or GFP are indicated by the C. elegans gene name followed by two colons and the reporter, all italicized: pes-1::lacZ, mab-9::GFP. No specific recommendations have been made for distinguishing between transcriptional and translational fusions.”
I think this issue is irrelevant if the full construct is written out. For example, par-1p::par-1::gfp would probably be interpreted as the par-1 promoter in front of the par-1 gene, with GFP fused at the C terminus of PAR-1 (although people sometimes write this when GFP is at the N terminus, which is also confusing).
Authors want to save space in papers and figures, though, so some people might just prefer to write par-1::gfp in the text or on a figure. This is the situation where it isn’t clear whether they mean a fusion or a transcriptional reporter.
A simple, space-thrifty solution might be to use par-1::gfp for the transcriptional reporter and par-1=gfp for the translational reporter, the two solid lines of the equals sign being similar to two colons but “connected”, just like the proteins.
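To make the ambiguity of the short form concrete, here is a small Python sketch that sorts double-colon names into three bins. It assumes the gene-1p suffix convention for promoters and a simplified gene-name pattern (three or four lowercase letters, a hyphen, a number); the function name `classify` and both patterns are illustrative assumptions, not a published rule. Segments that are not gene names (a reporter, or a tag such as NLS) are not counted as a fused gene body.

```python
import re

# Simplified, assumed patterns: "par-1" is a gene, "par-1p" is its promoter.
GENE = re.compile(r"[a-z]{3,4}-\d+")
PROMOTER = re.compile(r"[a-z]{3,4}-\d+p")


def classify(name: str) -> str:
    """Return 'transcriptional', 'translational', or 'ambiguous'."""
    parts = name.split("::")
    # Without an explicit promoter segment the name cannot be resolved.
    if not PROMOTER.fullmatch(parts[0]):
        return "ambiguous"
    # A gene name after the promoter means a coding sequence is fused in.
    if any(GENE.fullmatch(p) for p in parts[1:]):
        return "translational"
    return "transcriptional"


for example in ("par-1p::gfp", "par-1p::par-1::gfp", "par-1::gfp"):
    print(example, "->", classify(example))
```

The fully written forms resolve cleanly, while par-1::gfp cannot be resolved from the name alone, which is why the construct description still has to carry the information.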
Given the ease of including links to supplementary figures, graphs, and text, I would have thought that lack of will rather than space was the problem here, and that it would be enough to:
write out the construct in full, along the lines already mentioned; and
describe the construct in sufficient detail that the description would be useful as an archive.
Shorthand is great, but sometimes one just has to make the effort and say what you mean.