Monday, March 30, 2015

bpy19. Using epost EUtil in Biopython

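We can use Bio.Entrez.epost to upload a list of IDs to NCBI, which stores them and returns a WebEnv and QueryKey that later E-utility calls can refer to. This example is similar to the last one, except that we no longer use the usehistory='y' keyword; epost creates the stored ID set instead.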


After getting a list of IDs with esearch, using the same search term as in the last example, we select a random subset of 500 IDs and post them.
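The random selection can also be done with the standard library alone. The following is a minimal sketch, not the script used in this post: `pick_random_ids` is a hypothetical helper, and `fake_ids` is a made-up stand-in for `values['IdList']` so the snippet runs without contacting NCBI.

```python
import random


def pick_random_ids(id_list, k, seed=None):
    """Return k distinct IDs chosen at random from id_list (hypothetical helper)."""
    rng = random.Random(seed)  # a seed makes the selection repeatable
    return rng.sample(list(id_list), k)


# Made-up stand-in for values['IdList']; real IDs would come from esearch.
fake_ids = [str(n) for n in range(1000, 2223)]

subset = pick_random_ids(fake_ids, 500, seed=0)

# epost takes the IDs as one comma-separated string.
id_string = ",".join(subset)

print(len(subset), len(set(subset)))
```

Sampling without replacement matters here: posting the same ID twice would not add a new record to the stored set.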


The posted IDs are then passed to Bio.Entrez.esummary through the WebEnv and QueryKey values that epost returns.

# bpy19.py
from __future__ import print_function
from Bio import Entrez
import numpy as np

Entrez.email = "A.N.Other@example.com"
term = ('("mitochondria"[MeSH Terms] OR "mitochondria"[All Fields])'
        ' AND (Review[ptyp] AND "loattrfree full text"[sb] AND'
        ' ("2012/01/01"[PDAT] : "2014/12/31"[PDAT]))')
handle = Entrez.esearch("pubmed", retmax = 1500, term = term)
values = Entrez.read(handle)
print("Count:",values['Count'])
print("len(IdList)",len(values['IdList']))
randIDs = np.random.choice(values['IdList'],500,replace=False)
print("len(randIDs)",len(randIDs))
handle1 = Entrez.epost("pubmed",id=','.join(randIDs))
values1 = Entrez.read(handle1)
print("QueryKey:", values1['QueryKey'])
print("len(WebEnv):", len(values1['WebEnv']))
summary = Entrez.esummary(db="pubmed", retmax=500,
                          webenv=values1["WebEnv"],
                          query_key=values1["QueryKey"])
records = Entrez.read(summary)
# Print the title of every 100th record, followed by a blank line.
for i, record in enumerate(records):
    if i % 100 != 0:
        continue
    print('%d. %s' % (i + 1, record['Title']))
    print('')

#Count: 1223
#len(IdList) 1223
#len(randIDs) 500
#QueryKey: 1
#len(WebEnv): 77
#1. Mitochondrial poly(A) polymerase and polyadenylation.
#
#101. The role of mitochondrial dysfunction in sepsis-induced
# multi-organ failure.
#
#201. The involvement of the sigma-1 receptor in neurodegeneration
# and neurorestoration.
#
#301. Provitamin A metabolism and functions in mammalian biology.
#
#401. Peroxisome proliferator activated receptor α ligands as
# anticancer drugs targeting mitochondrial metabolism.
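
Because the posted IDs stay on NCBI's side, the same WebEnv and QueryKey can be reused across several requests. The sketch below shows one way to pull records in batches. It is an illustration under assumptions, not part of bpy19.py: `batch_windows` and `fetch_in_batches` are hypothetical helpers, and efetch (with its zero-based `retstart`) is swapped in for esummary. The network call sits inside a function that is never invoked here, so the snippet runs offline.

```python
def batch_windows(total, batch_size):
    """Yield (retstart, retmax) pairs that together cover `total` records."""
    for start in range(0, total, batch_size):
        yield start, min(batch_size, total - start)


def fetch_in_batches(webenv, query_key, total, batch_size=100):
    """Fetch MEDLINE text for a posted ID set, one window at a time.

    Hypothetical helper: needs Biopython, network access, and Entrez.email set.
    """
    from Bio import Entrez  # imported here so batch_windows works without Biopython

    chunks = []
    for retstart, retmax in batch_windows(total, batch_size):
        handle = Entrez.efetch(db="pubmed", rettype="medline", retmode="text",
                               retstart=retstart, retmax=retmax,
                               webenv=webenv, query_key=query_key)
        chunks.append(handle.read())
        handle.close()
    return chunks


# Offline check of the windowing logic only.
print(list(batch_windows(500, 200)))
# prints [(0, 200), (200, 200), (400, 100)]
```

With the values from the script above, the call would look like `fetch_in_batches(values1["WebEnv"], values1["QueryKey"], total=500)`.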
