Friday, March 27, 2015

bpy17. Using efetch EUtil in Biopython

We can use Bio.Entrez.efetch to get records from a database, given an ID or a set of IDs. We use the set of IDs that we used earlier in bpy14.
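
The efetch call in this post passes its IDs as one comma-separated string. When the IDs start out as a Python list, that string has to be built first. Here is a minimal sketch of one way to do it; the helper name make_id_arg is made up for this example and is not part of Biopython.

```python
def make_id_arg(pmids):
    """Join PubMed IDs into the comma-separated string passed as id=..."""
    seen = set()
    cleaned = []
    for pmid in pmids:
        text = str(pmid).strip()       # accept ints or strings
        if text and text not in seen:  # skip blanks and repeats
            seen.add(text)
            cleaned.append(text)
    return ",".join(cleaned)

# The two IDs used in this post, with one deliberate repeat
id_arg = make_id_arg(["25741283", 25798216, " 25741283 "])
print(id_arg)  # prints 25741283,25798216
```

The resulting string can be given to Entrez.efetch as the id argument in place of the literal string.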


There are two programs here. The first program saves the records to a text file. The rettype and retmode keyword arguments select the format of the returned data; the valid combinations are listed on the NCBI Entrez site. The variable out holds the returned text as a string. This particular combination (rettype 'medline', retmode 'text') returns plain text in Medline format.


The second program uses Bio.Medline.parse to read the Medline file. For each record we print the title and the first 200 characters of the abstract.

# bpy17a.py
from __future__ import print_function
from Bio import Entrez

# NCBI asks for a contact address with every request
Entrez.email = "Your.Name.Here@example.org"
# Fetch two PubMed records as plain text in Medline format
handle = Entrez.efetch(db="pubmed", id="25741283,25798216",
                       rettype="medline", retmode="text")
out = handle.read()
handle.close()
# Save the Medline text for bpy17b.py
with open("bpy17.txt", "w") as fout:
    fout.write(out)

# bpy17b.py
from __future__ import print_function
from Bio import Medline

# Parse the Medline text saved by bpy17a.py; the with block closes the file
with open('bpy17.txt') as fin:
    for record in Medline.parse(fin):
        print('Title:', record['TI'])
        # Show only the first 200 characters of the abstract
        print('Abstract\n', record['AB'][:200], end='')
        print('...\n')

#Title: Nitric oxide and mitochondria in metabolic syndrome.
#Abstract
# Metabolic syndrome (MS) is a cluster of metabolic disorders
# that collectively increase the risk of cardiovascular disease.
# Nitric oxide (NO) plays a crucial role in the pathogeneses of
# MS components a...
#
#Title: Maternal ancestry and population history from whole
# mitochondrial genomes.
#Abstract
# MtDNA has been a widely used tool in human evolutionary and
# population genetic studies over the past three decades. Its
# maternal inheritance and lack of recombination have offered
# the opportunity to e...
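
The second program indexes record['AB'] directly, which raises a KeyError for a PubMed record that has no abstract. Below is a minimal sketch of a more forgiving version, assuming each record behaves like a dict (Bio.Medline records do). The function name summarize and the sample data are made up for illustration; plain dicts stand in for parsed records so the snippet runs without a network call.

```python
def summarize(record, width=200):
    """Return (title, abstract snippet) for a dict-like Medline record."""
    title = record.get("TI", "(no title)")
    abstract = record.get("AB", "")
    if len(abstract) > width:
        # Mark the cut only when something was actually cut
        snippet = abstract[:width] + "..."
    else:
        snippet = abstract or "(no abstract)"
    return title, snippet

# Made-up sample data standing in for records from Medline.parse
sample = [
    {"TI": "A record with a long abstract.", "AB": "x" * 250},
    {"TI": "A record without an abstract."},
]
for rec in sample:
    title, snippet = summarize(rec)
    print("Title:", title)
    print("Abstract\n", snippet)
    print()
```

In bpy17b.py the same function could be called on each record yielded by Medline.parse in place of the two direct lookups.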
