As GBrowse is going to be retired in January 2023, is there any way to download GenBank format directly from JBrowse? GenBank files can be edited in SnapGene with their gene features intact. Manually adding features to sequences exported from JBrowse currently takes me too much time. It would be convenient if JBrowse could export GenBank format as well.
Hi, GenBank is a “janky” format and difficult to write correctly. I see that Snapgene supports GFF3 as well:
https://support.snapgene.com/hc/en-us/articles/10384245528212?input_string=gff3%3F
which JBrowse 1 will currently output and is a feature that is being worked on for JBrowse 2. Would that work?
Scott
That does not work. The coordinates are inconsistent.
When I tried changing the coordinates in Excel and saving the result as a small GFF file, SnapGene said my GFF does not contain features.
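For anyone wanting to try the same coordinate change without Excel, here is a minimal Python sketch. It assumes the inconsistency is that the GFF3 rows carry chromosome coordinates while the exported sequence starts at 1; the function name `shift_gff3` and the `region_start` parameter are made up for illustration. It keeps the tab-separated nine-column layout that GFF3 requires, which a spreadsheet round trip may not preserve.

```python
def shift_gff3(lines, region_start):
    """Shift GFF3 start/end columns so that region_start becomes position 1.

    lines: iterable of GFF3 lines (comments and blank lines pass through).
    region_start: 1-based chromosome coordinate of the first exported base
                  (an assumption about how the region was exported).
    """
    offset = region_start - 1
    shifted = []
    for line in lines:
        text = line.rstrip("\n")
        cols = text.split("\t")
        # Leave comments, blank lines and malformed rows untouched.
        if text.startswith("#") or len(cols) != 9:
            shifted.append(text)
            continue
        cols[3] = str(int(cols[3]) - offset)  # start column
        cols[4] = str(int(cols[4]) - offset)  # end column
        shifted.append("\t".join(cols))
    return shifted


row = "X\tsrc\tCDS\t13673\t13972\t.\t-\t0\tID=cds1"
print(shift_gff3([row], 13001)[0])
```

The sequence name in column 1 may also need to match whatever SnapGene expects; that part is not handled here.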
It is still not working. From the SnapGene link you provided, it seems that SnapGene does not support joined features in GFF3.
From my point of view, the GenBank format is irreplaceable.
Is it possible to output a minimal GenBank file like the one at the link below?
https://cloud.tsinghua.edu.cn/f/a88b5244ef5643ddb5a0/?dl=1
LOCUS Exported 1699 bp DNA linear UNA 04-FEB-2023
FEATURES Location/Qualifiers
source 1..1699
CDS join(428..659,794..1099)
CDS complement(join(1278..1411,1515..1650))
ORIGIN
1 ttaatttgaaatagtttccattttttgataataatgaaaagctgctgaaaaaatggtttggcagttagcaattccaggaattttttcgagataagccataaattttaaaattatggaaattgatttacgtgtgtttttttctaattctaaattttttggtgacgttttccacgttgatttatttatttttcgaacccccctttccctcaaccaaaatagtatttattcttcagtttcaatattgtcaaaaagctcgatgcccgagtattttgaatcttctgcgatttcaattagaagaaatgctgcaggaaacgacgttcaaaaggtaattgaaagcatttagaacatctcataaagatgatgtttcagaacaaagttcaaaattggcttcacagtgtgatcgagcgtctcaagtggtggagtcccggacgatgtcagcagctcttcgtcgagaatgagctcatcgagctatgctacagagctcgtgagcagttctggaaaaacaaagtgaagctagatgtacgtttagcgtatgagggattagcaattcattttctaataatttcagatcgaagctcctgtcaaaatctgtggagacattcacggacagttcgaggacttgatggctctgttcgagttgaatgggtggcctgaagagcataagtaagccgccaatttgaatttggattagtatatgttttcatttcagatatctctttcttggtgattatgttgaccgtggtccattctccattgaagtcatcacactcctcttcacctttcaaatattgatgcctgacaaagtcttccttcttcgaggaaaccacgaaagccgccccgtcaatatgcaatatggattttatctggaatgcaagaagcgctactcagtcgccttgtatgatgcatttcaacttgcattcaattgtatgccactgtgcgctgtcgtgagcaagaagatcatatgtatgcatggaggaatatctgaagatctgattgacttgacgtaagatctttttccaatttccttatgtacttcaacaaccaatttccagacaactcgaaaagattgatcgtccatttgatattccggacattggcgtcatctccgacttgacctgggctgatcccgacgagaaggtcttcggatatgccgattctccacgtggcgcgggacgttctttcggtccgaatgcggtcaagaagttccttcaaatgcacaacctggatctagtcgttcgtgcccatcaggtcgtcatggatggttatgaattctttgcggaccgccaacttgtcacagtcttctcggcaccatcatactgcggacaattcgacaatgctgctgccgtgatgaatgttgacgacaaattgctctgtactttcacaatcttccgcccggatttgaaagttggcgacttcaagaagaaggacaagtgatattttgatttatcgaaataaagcattttttgtaccgtcttgattttcaggttaggctcgaatcacgcgcgcctgcttctcgaccttaaaaatgcctccaggtacaccaggaggcgagcccgctaagcaagaattccagcgccttctcccttctctcccgcttcctgagaatattgatgacataatcggtattctttttgtgtgtgcctgtatccattattcacgcacacaagaacaccaacaagcatgctggttttcttatata
//
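A record this small can be written with a few lines of Python. The sketch below is only an illustration of the layout, not what JBrowse does: the function names are hypothetical, the LOCUS line is simplified (strict parsers may expect its fields in fixed columns), and the feature locations are passed in as ready-made strings. It follows the usual flat-file conventions of starting feature keys in column 6 and locations in column 22, and of writing the sequence 60 bases per line in groups of 10.

```python
def format_origin(seq):
    """Return ORIGIN lines: 60 bases per line in space-separated groups of 10."""
    lines = []
    for i in range(0, len(seq), 60):
        chunk = seq[i:i + 60].lower()
        groups = " ".join(chunk[j:j + 10] for j in range(0, len(chunk), 10))
        # Position of the first base on the line, right-justified in 9 columns.
        lines.append(f"{i + 1:>9} {groups}")
    return lines


def minimal_genbank(name, seq, features, date):
    """Build a minimal GenBank record.

    features: list of (key, location) tuples, e.g. ("CDS", "join(1..10,21..30)").
    date: date string in the style used in the thread, e.g. "04-FEB-2023".
    """
    out = [
        f"LOCUS       {name} {len(seq)} bp DNA linear UNA {date}",
        "FEATURES             Location/Qualifiers",
        f"     {'source':<16}1..{len(seq)}",
    ]
    for key, location in features:
        # 5 spaces + key padded to 16 characters puts the location in column 22.
        out.append(f"     {key:<16}{location}")
    out.append("ORIGIN")
    out.extend(format_origin(seq))
    out.append("//")
    return "\n".join(out) + "\n"


print(minimal_genbank("Exported", "acgt" * 20,
                      [("CDS", "join(1..10,21..30)")], "04-FEB-2023"))
```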
I am working on a plugin or other method of generating that format; I’ll let you know what I come up with.
I am thinking about multiple approaches to this problem; while what I want is a JBrowse plugin (for either JB1 or JB2) that downloads in the format you want, a quicker solution for me to implement would be one where you get downloads from the gene and reference sequence tracks (GFF and FASTA) and then process those with a script to get GenBank format. My question for you is this: would you have a preference for a BioPerl based solution or a NodeJS based solution? Both would require you to install external software but if you already use one, I would go down that path. (I could probably have something sooner using BioPerl but if I do it in NodeJS it would probably find wider adoption).
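To make the script idea concrete, here is a rough Python sketch of the core step: turning the CDS rows of a GFF3 download into GenBank location strings. It is neither the BioPerl nor the NodeJS solution discussed here, just an illustration under some assumptions: CDS segments are grouped by their `Parent` attribute, segments are written in ascending order inside `join(...)`, and minus-strand features are wrapped in `complement(...)`, matching the example record posted above. The function names are hypothetical. GFF3 and GenBank both use 1-based inclusive coordinates, so no off-by-one adjustment is needed.

```python
from collections import defaultdict


def genbank_location(segments, strand):
    """segments: list of (start, end), 1-based inclusive; strand: '+' or '-'."""
    parts = [f"{start}..{end}" for start, end in sorted(segments)]
    loc = parts[0] if len(parts) == 1 else "join(" + ",".join(parts) + ")"
    return f"complement({loc})" if strand == "-" else loc


def cds_locations(gff3_lines):
    """Group CDS rows by Parent and return {parent_id: location_string}."""
    segments = defaultdict(list)
    strands = {}
    for line in gff3_lines:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, blank lines and anything that is not a CDS row.
        if line.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent", attrs.get("ID", ""))
        segments[parent].append((int(cols[3]), int(cols[4])))
        strands[parent] = cols[6]
    return {p: genbank_location(s, strands[p]) for p, s in segments.items()}


rows = [
    "chr\tsrc\tCDS\t1515\t1650\t.\t-\t0\tParent=t1",
    "chr\tsrc\tCDS\t1278\t1411\t.\t-\t2\tParent=t1",
    "chr\tsrc\tCDS\t428\t659\t.\t+\t0\tParent=t2",
    "chr\tsrc\tCDS\t794\t1099\t.\t+\t2\tParent=t2",
]
for parent, location in cds_locations(rows).items():
    print(parent, location)
```

A full converter would also need to read the FASTA file and write the sequence block, which is left out here.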
Dear Scott, thank you very much for your help with this matter.
For me personally, both BioPerl and NodeJS are OK. However, everyone else in our lab has no background in bioinformatics, and all of them need this feature. It would be hard to set this up on more than 10 Windows computers and make it work for everyone. A plugin would be a better way.
Thanks again.
As I was thinking about cobbling together a solution for JBrowse 1, I was chatting with the JBrowse lead developer, who thought he could get something working for JBrowse 2, and there is already a working prototype. Please take a look at JBrowse and click on the three vertical dots in the track label to get the menu that has “save track data” as an option. Obviously, this is alpha software, and while I am reasonably sure it’s getting the coordinates right, checking the output is a good idea! Please let me know what you think.
Hi Scott, thank you and the JBrowse 2 developer so much for developing this prototype so quickly.
At first it did not work, and then I found that I had used the gene model historical track, which gave the wrong GenBank output.
The prototype worked very well with the Curated Genes track.
Thank you very much for your efforts.
A very small suggestion: all downloaded files are named jbrowse_track_data. To reduce filename conflicts and confusion, it would be better to use the coordinates in the filename, as GBrowse does.
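As an illustration of the kind of naming meant here, a small Python sketch follows. The pattern `<refname>_<start>-<end>.<ext>` and the function name are assumptions made up for this example; it is not GBrowse's actual scheme or anything JBrowse implements.

```python
import re


def export_filename(refname, start, end, ext="gbk"):
    """Build a download filename from the displayed region (hypothetical pattern)."""
    # Replace characters that are awkward in filenames on Windows and elsewhere.
    safe = re.sub(r"[^A-Za-z0-9_.-]", "_", refname)
    return f"{safe}_{start}-{end}.{ext}"


print(export_filename("Cnig_chr_X", 13673, 15502))  # Cnig_chr_X_13673-15502.gbk
```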
Thank you for the push–this was something that was on my radar for a while but having a user saying it’s important helps to make things happen!
I’ll pass on the suggestion to modify the file names; that’s a good one too!
mRNA complement(13673..15502)
/gene="gene:Cnig_chr_X.g24897"
/name=transcript:Cnig_chr_X.g24897
/id="transcript:Cnig_chr_X.g24897"
/info="method:InterPro accession:IPR013750 description:GHMP kinase, C-terminal domain
method:InterPro accession:IPR014721 description:Ribosomal protein S5 domain 2-type fold, subgroup
method:InterPro accession:IPR015192 description:Switch protein XOL-1, N-terminal
method:InterPro accession:IPR015193 description:Switch protein XOL-1, GHMP-like
method:InterPro accession:IPR020568 description:Ribosomal protein S5 domain 2-type fold"
/jbrowse_parent="gene:Cnig_chr_X.g24897"
/Name="Cnig_chr_X.g24897"
CDS complement(join(15426..15502,15288..15369,15060..15242,14642..14750,14435..14594,14020..14389,13673..13972))
/mRNA="transcript:Cnig_chr_X.g24897"
I found a bug. The CDS feature is not recognized in the GenBank output above. This error may originate from the long multi-line info qualifier in the mRNA feature.
Interesting, if you manually take out the carriage returns in the “info” does it then work? I’m trying to figure out what we need to do generally, since that info section can frequently be quite long.
mRNA complement(13673..15502)
/gene="gene:Cnig_chr_X.g24897"
/name=transcript:Cnig_chr_X.g24897
/id="transcript:Cnig_chr_X.g24897"
/info="method:InterPro accession:IPR013750 description:GHMP kinase, C-terminal domain
method:InterPro accession:IPR014721 description:Ribosomal protein S5 domain 2-type fold, subgroup
method:InterPro accession:IPR015192 description:Switch protein XOL-1, N-terminal
method:InterPro accession:IPR015193 description:Switch protein XOL-1, GHMP-like
method:InterPro accession:IPR020568 description:Ribosomal protein S5 domain 2-type fold"
/jbrowse_parent="gene:Cnig_chr_X.g24897"
/Name="Cnig_chr_X.g24897"
CDS complement(join(15426..15502,15288..15369,15060..15242,14642..14750,14435..14594,14020..14389,13673..13972))
/mRNA="transcript:Cnig_chr_X.g24897"
The above format worked.
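The manual fix of taking out the carriage returns can be sketched in Python as below. This is only an illustration, assuming that a qualifier line with an unclosed double quote continues until a later line ends with a closing quote; the function name is hypothetical and qualifier values containing embedded quotes are not handled. An alternative would be to indent the continuation lines to the qualifier column, which is how GenBank flat files normally wrap long values.

```python
def join_multiline_qualifiers(lines):
    """Collapse qualifier values that span several lines onto one line."""
    out = []
    pending = None  # qualifier text still waiting for its closing quote
    for line in lines:
        text = line.rstrip("\n")
        if pending is not None:
            pending += " " + text.strip()
            if text.rstrip().endswith('"'):
                out.append(pending)
                pending = None
            continue
        stripped = text.strip()
        # A qualifier with exactly one double quote has an unclosed value.
        if stripped.startswith("/") and stripped.count('"') == 1:
            pending = text
        else:
            out.append(text)
    if pending is not None:  # unterminated value: keep what was collected
        out.append(pending)
    return out


feature = [
    '/id="transcript:Cnig_chr_X.g24897"',
    '/info="method:InterPro accession:IPR013750 description:GHMP kinase, C-terminal domain',
    'method:InterPro accession:IPR020568 description:Ribosomal protein S5 domain 2-type fold"',
    '/jbrowse_parent="gene:Cnig_chr_X.g24897"',
]
for row in join_multiline_qualifiers(feature):
    print(row)
```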