Where can I download an excel (or any other readable) file of the names (or IDs) of all the genes in the C. elegans genome?
Hi,
You can try querying all C. elegans genes on the SimpleMine site (https://www.wormbase.org/tools/mine/simplemine.cgi).
I suggest you uncheck the boxes for any additional information you don’t need, to reduce the file size.
Good luck!
Amy
You can also do a lot with the WormBase FTP site (this is the release currently in use on WormBase, WS271), for example this CSV of gene IDs.
You can use WormMine to get the list:
http://intermine.wormbase.org/tools/wormmine/begin.do
In the QueryBuilder tab, click on “Import query from XML” and paste in the following XML query:
<query name="" model="genomic" view="Gene.primaryIdentifier Gene.secondaryIdentifier Gene.symbol Gene.organism.name" longDescription="Return all of the non-coding genes, and their transcripts, for a particular species" sortOrder="Gene.primaryIdentifier asc">
<constraint path="Gene.organism.name" op="=" value="Caenorhabditis elegans"/>
</query>
Click “Submit” and then click on “Show results”. You can then export your list as TSV. If you click on “Query” from the results list, you can go back to the QueryBuilder to see how the query was constructed and experiment with modifying it, including selecting a different species.
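If you want a comma-separated file that opens directly in Excel, the exported TSV can be converted with a few lines of Python. This is only a minimal sketch: the function name `tsv_to_csv` is made up for illustration, the sample rows are placeholders rather than real WormBase records, and it assumes the export keeps the columns in the order of the query’s `view` attribute.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    """Convert tab-separated text (e.g. a WormMine TSV export) to CSV text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if row:  # skip blank lines
            writer.writerow(row)
    return out.getvalue()


# Placeholder rows (NOT real WormBase records), assumed to follow the view order:
# primaryIdentifier, secondaryIdentifier, symbol, organism name.
sample_tsv = (
    "GENE_ID_1\tSEQ_NAME_1\tsymbol-1\tCaenorhabditis elegans\n"
    "GENE_ID_2\tSEQ_NAME_2\tsymbol-2\tCaenorhabditis elegans\n"
)

csv_text = tsv_to_csv(sample_tsv)
print(csv_text)
```

For a real export you would read the downloaded file into a string instead of using `sample_tsv`, then write `csv_text` to a `.csv` file.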
How can I filter for mitochondrial genes and rRNA for my single cell analysis?
I will need a CSV file with the mitochondrial genes and the rRNA genes.
First, just checking: are you sure you need the rRNA? Many scRNA-Seq methods use polyA enrichment, so you’re not really supposed to have much rRNA in this type of data (as opposed to protein-coding ribosomal subunits).
Getting both rRNA and MtDNA genes at the same time is a good use case for WormMine.
In the QueryBuilder, you can use this:
<query model="genomic" view="Gene.primaryIdentifier Gene.symbol Gene.secondaryIdentifier Gene.transcripts.symbol Gene.transcripts.method Gene.organism.name" sortOrder="Gene.primaryIdentifier ASC" constraintLogic="A and (B or C)" >
<constraint path="Gene.organism.name" op="=" value="Caenorhabditis elegans" code="A" />
<constraint path="Gene.transcripts.method" op="=" value="rRNA" code="B" />
<constraint path="Gene.chromosome" op="LOOKUP" value="MtDNA" extraValue="C. elegans" code="C" />
</query>
Or to build it yourself:
- open WormMine
- scroll down a bit to the template “Transcript Type, Species → Genes”, keep the species as C. elegans, and set the transcript “method” to rRNA. This creates the first constraint (rRNAs)
- select “Edit Query”, and add a constraint on “Chromosome”: it should be MtDNA
- make sure you select “A or B”, and not “A and B”
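To see why the “or” matters, here is a small Python sketch of the two ways of combining the constraints. It does not call WormMine; the `GeneRecord` class, the `matches` function, and the four records are hypothetical placeholders chosen only to illustrate the logic.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    gene_id: str
    organism: str
    method: str       # transcript method, e.g. "rRNA"
    chromosome: str   # e.g. "MtDNA"


def matches(gene: GeneRecord, combine_with_or: bool = True) -> bool:
    """Mimic the query logic: species AND (rRNA OR MtDNA), or all three ANDed."""
    is_species = gene.organism == "Caenorhabditis elegans"
    is_rrna = gene.method == "rRNA"
    is_mito = gene.chromosome == "MtDNA"
    if combine_with_or:
        return is_species and (is_rrna or is_mito)
    return is_species and is_rrna and is_mito


species = "Caenorhabditis elegans"
# Placeholder records, not real genes; "other_method" stands for any non-rRNA method.
records = [
    GeneRecord("nuclear_rRNA", species, "rRNA", "I"),
    GeneRecord("mito_protein_coding", species, "other_method", "MtDNA"),
    GeneRecord("mito_rRNA", species, "rRNA", "MtDNA"),
    GeneRecord("nuclear_protein_coding", species, "other_method", "II"),
]

with_or = [g.gene_id for g in records if matches(g)]
with_and = [g.gene_id for g in records if matches(g, combine_with_or=False)]
print(with_or)
print(with_and)
```

With “or”, every rRNA gene and every mitochondrial gene is kept; with “and”, only genes that are both rRNA and on MtDNA would be returned.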
Hi! Thanks so much, and indeed you are absolutely right. With polyA capture most of the rRNA won’t be there, but some is, and I need to filter these data out. Similarly for mtRNA: those transcripts are polyadenylated, but I need to recognise them and filter out cells with high mtRNA capture. A high number of UMIs from mt genes means that the cell has been damaged: it lost cytosolic mRNA while the mitochondrial RNA stayed intact.
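As a rough sketch of that filtering step, the fraction of mitochondrial UMIs per cell can be computed from a gene list like the one exported above. Everything here is an assumption for illustration: the function names, the gene and cell names, the toy counts, and the 0.2 cutoff, which is an arbitrary example value and not a recommended threshold.

```python
def mito_fraction(cell_counts: dict, mito_genes: set) -> float:
    """Fraction of a cell's UMIs that come from mitochondrial genes."""
    total = sum(cell_counts.values())
    if total == 0:
        return 0.0
    mito = sum(n for gene, n in cell_counts.items() if gene in mito_genes)
    return mito / total


def filter_cells(cells: dict, mito_genes: set, max_fraction: float = 0.2) -> dict:
    """Keep only cells whose mitochondrial UMI fraction is at most max_fraction."""
    return {
        cell_id: counts
        for cell_id, counts in cells.items()
        if mito_fraction(counts, mito_genes) <= max_fraction
    }


# Placeholder gene list; in practice, load it from the exported CSV/TSV.
mito_genes = {"mt_gene_1", "mt_gene_2"}

# Toy UMI counts per cell (made-up numbers).
cells = {
    "cell_a": {"mt_gene_1": 5, "nuc_gene_1": 95},
    "cell_b": {"mt_gene_1": 40, "mt_gene_2": 20, "nuc_gene_1": 40},
}

kept = filter_cells(cells, mito_genes)
print(list(kept))
```

The same gene list could be used to drop the rRNA and mitochondrial genes themselves from the count matrix, if that is what the downstream analysis needs.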