This repository was archived by the owner on Aug 11, 2021. It is now read-only.

Genotype associated Phenotype calls REST API

gkos-ebi edited this page Mar 14, 2014 · 6 revisions

Genotype associated phenotype calls

There are many ways to get information about the MP (Mammalian Phenotype) terms associated with the different knockout (KO) genes. You can select data by:

  • phenotyping center (UCD, Wellcome Trust Sanger Institute, JAX, etc.)
  • phenotyping program (legacy MGP, EUMODIC, etc.)
  • phenotyping resource (EuroPhenome, MGP, IMPC)
  • phenotyping pipeline (EUMODIC1, EUMODIC2, MGP, IMPC adult, IMPC embryonic, etc.)
  • phenotyping procedure or parameter
  • allele name or MGI allele ID
  • strain name or MGI strain ID
  • gene symbol or MGI gene ID
  • or a combination of all these fields

Retrieve all genotype-phenotype associations

This basic request queries all the results from the Solr service and returns the first 10 as JSON:

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=*:*&rows=10&wt=json'

A bit of explanation:

  • genotype-phenotype is the name of the Solr core service to query
  • select is the method used to query the Solr REST interface
  • q=*:* means querying everything without any filtering on any field
  • rows limits the number of results returned
  • wt=json is the response format
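
The parameters above can also be passed to curl one by one, so that curl takes care of the URL encoding. The following is a minimal sketch, not part of the original service documentation: the names GP_BASE and gp_select are hypothetical, and it assumes a curl version that supports --get and --data-urlencode.

```shell
# Hypothetical helper (an assumption, not an official IMPC tool):
# send one query to the genotype-phenotype core and let curl
# percent-encode each parameter.
GP_BASE='http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select'

gp_select() {
  # $1 = Solr query string, e.g. '*:*' or 'marker_symbol:Akt2'
  # $2 = maximum number of rows to return (defaults to 10)
  curl --silent --get \
    --data-urlencode "q=$1" \
    --data-urlencode "rows=${2:-10}" \
    --data-urlencode 'wt=json' \
    "$GP_BASE"
}

# Usage (requires network access to the service):
#   gp_select '*:*' 10
#   gp_select 'marker_symbol:Akt2'
```

With --get, curl appends the encoded parameters to the URL as a query string instead of sending them in a request body, so the request stays a plain GET.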

Retrieve all genotype-phenotype associations for a specific marker

We will constrain the results by adding a condition to the q (query) parameter using the specific marker_symbol field. For Akt2, simply specify q=marker_symbol:Akt2

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=marker_symbol:Akt2&wt=json'

Retrieve all genotype-phenotype associations for a specific MP term

We will constrain the results by adding a condition to the q (query) parameter using the specific mp_term_name field. To retrieve the genotypes associated with "decreased total body fat amount", specify q=mp_term_name:"decreased total body fat amount". On the command line, the spaces and double quotes in the URL must be percent-encoded (%20 and %22):

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=mp_term_name:%22decreased%20total%20body%20fat%20amount%22&wt=json'

This also works with mp_term_id, the corresponding MP term identifier. In this case, specify q=mp_term_id:"MP:0010025"

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=mp_term_id:"MP:0010025"&wt=json'

Retrieve all genotype-phenotype associations for a top level MP term

We will constrain the results by adding a condition to the q (query) parameter using the specific top_level_mp_term_name field. The same works with top_level_mp_term_id if you pass an identifier instead of the MP term name. To retrieve the genotypes associated with "nervous system phenotype", specify q=top_level_mp_term_name:"nervous system phenotype". On the command line, the spaces and double quotes in the URL must be percent-encoded (%20 and %22):

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=top_level_mp_term_name:%22nervous%20system%20phenotype%22&wt=json'

Retrieve all genotype-phenotype associations with a p-value cut-off

In this example, we will apply a cut-off to the previous query by adding a condition to the q (query) parameter. In Solr, you can specify a range to retrieve results. For instance, if you want p-values at or below 0.0001, you can add the condition p_value:[0 TO 0.0001]. Here, we will retrieve the genotypes associated with a nervous system phenotype with a p-value cut-off of 0.00005. Because curl treats square brackets as a URL range and does not encode spaces, the brackets, spaces and double quotes are percent-encoded in the URL (%5B, %5D, %20 and %22):

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=top_level_mp_term_name:%22nervous%20system%20phenotype%22%20AND%20p_value:%5B0%20TO%200.00005%5D&wt=json'
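
Combining a field match with a range condition is easy to get wrong by hand. As a rough sketch, the query string can be assembled first and then handed to curl for encoding. The function name gp_pvalue_query is hypothetical and not part of the service.

```shell
# Hypothetical helper (an assumption for illustration): build a Solr
# query that matches one field and applies an upper bound on p_value.
gp_pvalue_query() {
  # $1 = field name, $2 = field value, $3 = p-value cut-off
  printf '%s:"%s" AND p_value:[0 TO %s]' "$1" "$2" "$3"
}

q=$(gp_pvalue_query top_level_mp_term_name 'nervous system phenotype' 0.00005)
printf '%s\n' "$q"

# The string can then be sent with curl doing the percent-encoding
# (requires network access to the service):
#   curl --silent --get --data-urlencode "q=$q" --data-urlencode 'wt=json' \
#     'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select'
```

Keeping the query in a shell variable and passing it through --data-urlencode avoids hand-encoding the brackets, quotes and spaces.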

Retrieve all genotype-phenotype associations for a specific phenotyping center

We will constrain the results by adding a condition to the q (query) parameter using the specific phenotyping_center field. To retrieve all MP associations for the "WTSI" (Wellcome Trust Sanger Institute) phenotyping center, specify q=phenotyping_center:"WTSI"

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=phenotyping_center:"WTSI"&wt=json'

Get the phenotyping resource names

We will start with a simple request to get the different phenotyping resource names (EuroPhenome, MGP, IMPC). This will be the basis for filtering on historical phenotyping resources like EuroPhenome or active resources like the IMPC project.

There are two basic things you should know about Solr queries: the field list and facets. Faceting retrieves the distinct values of a specific field, together with their counts. In this example, we want to retrieve the distinct phenotyping resource names. The field list lets us select the specific fields we want to retrieve instead of all the fields of a Solr document. In this example, the fields we are interested in are 'resource_name' and 'resource_fullname'.

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select/?q=*:*&version=2.2&start=0&rows=0&indent=on&wt=json&fl=resource_name&fl=resource_fullname&facet=on&facet.field=resource_fullname&facet.field=resource_name'

If you look carefully at the request:

  • parameter fl means 'field list': we want to keep only the resource_fullname and resource_name fields in the returned documents
  • parameter facet=on means we want to have faceted results
  • parameter facet.field names a field to facet on; it is given twice here, so we get the distinct values of resource_name and of resource_fullname
  • parameter q is the query parameter. q=*:* means we are not doing any text matching and want to get all the resource name / fullname results.

We will look into more advanced examples and how to use the query parameter 'q'.

Retrieve all the phenotyping projects

This follows the same principle. Only the selected field changes. In this case, use the field 'project_name' and/or 'project_fullname'.

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select/?q=*:*&version=2.2&start=0&rows=0&indent=on&wt=json&fl=project_name&facet=on&facet.field=project_name'

Retrieve all pipelines from a specific project

To retrieve all the phenotyping pipelines from EUMODIC, we'll use the fq (filter query) parameter to filter the query on project_name:EUMODIC. We are only interested in the distinct pipeline names, so we'll use the facet.field parameter to facet on 'pipeline_name'.

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=*:*&fq=project_name:EUMODIC&rows=0&fl=project_name,pipeline_name&facet=on&facet.field=pipeline_name&facet.mincount=1&wt=json'

Retrieve all procedures from a specific pipeline

Again, we'll use the fq parameter to filter the query on pipeline_name, with the value in double quotes because it contains spaces, and facet on the procedure_name field. On the command line, the spaces and double quotes in the URL must be percent-encoded (%20 and %22):

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=*:*&fq=pipeline_name:%22EUMODIC%20Pipeline%201%22&rows=0&fl=procedure_name,pipeline_name&facet=on&facet.field=procedure_name&facet.mincount=1&wt=json'

Retrieve all parameters from a specific procedure

curl \
--basic  \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select?q=*:*&fq=pipeline_name:%22EUMODIC%20Pipeline%201%22&fq=procedure_name:%22Non-Invasive%20blood%20pressure%22&rows=0&fl=procedure_name,parameter_name&facet=on&facet.field=parameter_name&facet.mincount=1&wt=json'
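
The last three requests share the same shape: facet on one field, with zero or more fq filters narrowing the documents. A minimal sketch of that pattern follows. The names GP_BASE and gp_facet are hypothetical, and the helper assumes a curl version that supports --get and --data-urlencode.

```shell
# Hypothetical helper (an assumption, not an official IMPC tool):
# list the distinct values of one field, optionally narrowed by filters.
GP_BASE='http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select'

gp_facet() {
  # $1 = field to facet on; every further argument is one fq filter
  facet_field=$1
  shift
  # Rewrite each remaining argument into a "--data-urlencode fq=..." pair.
  for filter in "$@"; do
    set -- "$@" --data-urlencode "fq=$filter"
    shift
  done
  curl --silent --get \
    --data-urlencode 'q=*:*' \
    --data-urlencode 'rows=0' \
    --data-urlencode 'wt=json' \
    --data-urlencode 'facet=on' \
    --data-urlencode "facet.field=$facet_field" \
    --data-urlencode 'facet.mincount=1' \
    "$@" \
    "$GP_BASE"
}

# Usage (requires network access to the service):
#   gp_facet pipeline_name  'project_name:EUMODIC'
#   gp_facet procedure_name 'pipeline_name:"EUMODIC Pipeline 1"'
#   gp_facet parameter_name 'pipeline_name:"EUMODIC Pipeline 1"' \
#                           'procedure_name:"Non-Invasive blood pressure"'
```

The fl parameter is left out of this sketch because rows=0 returns no documents, so only the facet values are of interest.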

Retrieve all MP calls grouped by top level MP terms first and then by resources (MGP, EuroPhenome)

curl \
--basic \
-X GET \
'http://www.ebi.ac.uk/mi/impc/solr/genotype-phenotype/select/?q=*:*&version=2.2&start=0&rows=0&indent=on&wt=json&fq=-resource_name:%22IMPC%22&fl=top_level_mp_term_name&facet=on&facet.pivot=top_level_mp_term_name,resource_name'