I am interested in the gene descriptions for Homo sapiens provided by the Alliance of Genome Resources.
Is it possible to retrieve the sources / references for the Gene Descriptions?
For example, the description for TP53 says that it is “Implicated in several diseases, including Li-Fraumeni syndrome (multiple); carcinoma (multiple); gastrointestinal system cancer (multiple)[…]”
These are certainly well-established facts, but I feel each statement needs supporting references, such as PubMed IDs, the way UniProt provides them. Can I retrieve them in bulk for each gene summary?
(I wanted to post links to an example but I am not allowed to post links as a new user yet).
These descriptions are generated automatically based on gene and disease ontologies, with additional data from the Alliance Model Organism Databases. The underlying publications are listed with the gene’s functional and disease annotations, further down on the gene page.
I meant: how can I access this information programmatically (i.e. systematically, in batch)? I want to write a script that retrieves the references for each statement. I am not good at writing web scrapers. Are the data available in a systematic form?
That section in the original paper only explains that the descriptions are generated automatically, not how to access or parse them programmatically.
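One possible route that avoids scraping, sketched here under stated assumptions: if the disease annotations behind the descriptions can be obtained as a tab-separated file, the references can be grouped per gene and disease term with the standard library alone. The column names used below (`DBObjectSymbol`, `DOtermName`, `Reference`), the `#` comment lines, and the `|` separator between multiple references are assumptions about the file layout, not a confirmed format, so they are exposed as parameters. The helper `references_by_gene` is a hypothetical name, and the sample rows and PMIDs are made-up placeholders.

```python
import csv
import io
from collections import defaultdict


def references_by_gene(handle, gene_col="DBObjectSymbol",
                       term_col="DOtermName", ref_col="Reference"):
    """Map gene symbol -> disease term -> set of reference IDs.

    `handle` is any iterable of text lines (an open file, for example).
    All column names are assumptions; adjust them to the actual header.
    """
    # Skip lines starting with '#' (assumed to be comments or a banner).
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    refs = defaultdict(lambda: defaultdict(set))
    for row in reader:
        gene, term = row[gene_col], row[term_col]
        # Assumption: one cell may hold several IDs separated by '|'.
        for ref in (row[ref_col] or "").split("|"):
            if ref.strip():
                refs[gene][term].add(ref.strip())
    return refs


# Made-up sample standing in for a downloaded file; PMIDs are placeholders.
sample = io.StringIO(
    "# made-up example data\n"
    "DBObjectSymbol\tDOtermName\tReference\n"
    "TP53\tLi-Fraumeni syndrome\tPMID:0000001|PMID:0000002\n"
    "TP53\tcarcinoma\tPMID:0000003\n"
)
result = references_by_gene(sample)
for term, pmids in sorted(result["TP53"].items()):
    print(term, sorted(pmids))
```

Note that this yields references per disease annotation, not per sentence of the generated description; matching the terms back to the wording of each description would still need to be done separately.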