MGI SNP data increased 4-fold

MGI has increased its SNP records 4-fold, to over 66 million. These data include the Sanger Mouse Genomes Project (MGP) v5 SNP set. Use the SNP Query to search for SNPs by gene symbol or chromosomal region across 101 mouse strains. Search results are limited to 100,000 SNPs, whether displayed or downloaded, and filters are available to narrow searches that exceed this limit.

To query this large data set efficiently, we implemented the search with Elasticsearch, which allows large data sets to be queried in parallel and improves search speed. We plan to use the same search technology for other large data sets in MGI/GXD.
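As an illustration of what a region search could look like in Elasticsearch's query DSL, here is a minimal sketch in Python that only builds the request body. The field names (`chromosome`, `coordinate`, `strains`) and the helper `build_region_query` are hypothetical and are not taken from MGI's actual index or code.

```python
import json


def build_region_query(chromosome, start, end, strains=None):
    """Build an Elasticsearch query body for SNPs in a chromosomal region.

    All field names here are assumptions made for illustration only.
    """
    if start > end:
        raise ValueError("start must not exceed end")

    # Filter clauses match without scoring, which suits exact lookups.
    filters = [
        {"term": {"chromosome": chromosome}},
        {"range": {"coordinate": {"gte": start, "lte": end}}},
    ]
    if strains:
        # Keep only SNPs that have a call in at least one listed strain.
        filters.append({"terms": {"strains": list(strains)}})

    return {"query": {"bool": {"filter": filters}}}


query = build_region_query("11", 1_000_000, 2_000_000, ["C57BL/6J", "DBA/2J"])
print(json.dumps(query, indent=2))
```

A body like this would be sent to the search endpoint of whichever index holds the SNP documents. Elasticsearch distributes a query across an index's shards, which is what makes parallel evaluation of a large data set possible.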

To update the SNP coordinates to GRCm39, we had to temporarily drop the dbSNP function classes, such as whether a SNP lies in an intron or a coding region. As a result, the Function Class filter in a SNP summary report is currently limited to “within coordinates of.” A future release will provide more function classes. In the meantime, the function classes for an individual SNP can be found by searching the Alliance of Genome Resources with its RefSNP ID, one ID at a time. For queries that specify one or more reference strains, MGI’s Allele Agreement filters restrict results to SNPs whose strain allele calls agree with or differ from the reference strain alleles.
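The effect of an allele agreement filter can be sketched in a few lines of Python. This is an illustration under stated assumptions, not MGI's implementation: the record layout, the SNP identifiers, and the helper `filter_by_agreement` are hypothetical, and the sketch assumes that a SNP is skipped when the reference strains disagree with each other or when any selected strain lacks an allele call.

```python
def filter_by_agreement(snps, reference_strains, comparison_strains, mode):
    """Return SNPs whose comparison-strain calls agree with or differ from
    the reference allele. mode is "agree" or "differ"."""
    if mode not in ("agree", "differ"):
        raise ValueError("mode must be 'agree' or 'differ'")

    kept = []
    for snp in snps:
        calls = snp["alleles"]
        selected = list(reference_strains) + list(comparison_strains)
        # Assumption: skip SNPs with a missing call in any selected strain.
        if any(strain not in calls for strain in selected):
            continue
        ref_alleles = {calls[s] for s in reference_strains}
        # Assumption: reference strains must share a single allele.
        if len(ref_alleles) != 1:
            continue
        ref = next(iter(ref_alleles))
        if mode == "agree":
            keep = all(calls[s] == ref for s in comparison_strains)
        else:
            keep = all(calls[s] != ref for s in comparison_strains)
        if keep:
            kept.append(snp)
    return kept


# Hypothetical records; the IDs are placeholders, not real RefSNP IDs.
snps = [
    {"id": "example_1", "alleles": {"C57BL/6J": "A", "DBA/2J": "G", "129S1/SvImJ": "G"}},
    {"id": "example_2", "alleles": {"C57BL/6J": "C", "DBA/2J": "C", "129S1/SvImJ": "C"}},
    {"id": "example_3", "alleles": {"C57BL/6J": "T", "DBA/2J": "T", "129S1/SvImJ": "C"}},
]
reference = ["C57BL/6J"]
comparison = ["DBA/2J", "129S1/SvImJ"]

differing = filter_by_agreement(snps, reference, comparison, "differ")
agreeing = filter_by_agreement(snps, reference, comparison, "agree")
print([s["id"] for s in differing], [s["id"] for s in agreeing])
```

In this sketch a SNP such as `example_3`, where only some comparison strains differ from the reference, is returned by neither mode.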

Please send questions and comments to mgi-help@jax.org.