In the last example, we saved the insulin proteins to the 'data' subfolder.
The function below takes one parameter, the subfolder where the *.txt files are stored, each of which holds one UniProtKB text record.
Had we saved the files with a different extension, for example .dat or .swiss, that extension would have to replace '.txt' in the list comprehension. If the subfolder contains only UniProtKB records, no filter term is needed, and the line reduces to fils = os.listdir(fol).
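The extension filter described above can be pulled out into a small helper. The following is only a sketch: the name list_record_files and its extensions parameter are hypothetical and are not part of the original script. It relies on the fact that str.endswith accepts a tuple of suffixes.

```python
import os

def list_record_files(fol, extensions=None):
    """Return the sorted file names in fol, optionally filtered by extension.

    extensions is a hypothetical parameter: an iterable such as
    ['.txt'] or ('.dat', '.swiss'). With None, every name is returned,
    which matches fils = os.listdir(fol) apart from the sorting.
    """
    names = sorted(os.listdir(fol))
    if extensions is None:
        return names
    # str.endswith accepts a tuple, so several extensions can be tested at once
    return [name for name in names if name.endswith(tuple(extensions))]
```

Calling list_record_files('data', ['.txt']) would give the same names as the list comprehension in the script, in sorted order. Note that os.listdir also returns subdirectory names, so the unfiltered form is only safe when the folder holds nothing but record files.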
Several attributes of each record are then printed.
# bpy26.py
from __future__ import print_function, division
import os
from Bio import SwissProt

def createRecords(fol):
    """Parse every *.txt UniProtKB record in fol and return the records."""
    records = []
    fils = [fil for fil in os.listdir(fol) if fil.endswith('.txt')]
    for fil in fils:
        # SwissProt.read expects exactly one record per handle;
        # the with block closes the file even if parsing fails
        with open(os.path.join(fol, fil)) as handle:
            records.append(SwissProt.read(handle))
    return records

if __name__ == '__main__':
    records = createRecords('data')
    for record in records:
        print('Entry Name:', record.entry_name)
        print('Organism:', record.organism)
        print('Length:', record.sequence_length)
        # each cross reference is a tuple; its first field is the database name
        first_crossref = record.cross_references[0]
        print('First cross ref:')
        for i in first_crossref:
            print('\t', i)
        print()
#Entry Name: IGF1R_HUMAN
#Organism: Homo sapiens (Human).
#Length: 1367
#First cross ref:
#     EMBL
#     X04434
#     CAA28030.1
#     -
#     mRNA
#
#Entry Name: IGF1R_MOUSE
#Organism: Mus musculus (Mouse).
#Length: 1373
#First cross ref:
#     EMBL
#     AF056187
#     AAC12782.1
#     -
#     mRNA
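As the output shows, each cross reference is a tuple whose first field is the database name (EMBL here). A natural next step is to group all of a record's cross references by database. The following is a minimal sketch under stated assumptions: group_crossrefs is a hypothetical helper, not part of Biopython, and the sample list simply reuses the two tuples printed above in place of a real record.cross_references list, so the snippet runs without any data files.

```python
from collections import defaultdict

def group_crossrefs(cross_references):
    """Group cross-reference tuples by database name (their first field)."""
    grouped = defaultdict(list)
    for ref in cross_references:
        db, fields = ref[0], tuple(ref[1:])
        grouped[db].append(fields)
    return dict(grouped)

# Tuples copied from the output above, standing in for record.cross_references
sample = [
    ('EMBL', 'X04434', 'CAA28030.1', '-', 'mRNA'),
    ('EMBL', 'AF056187', 'AAC12782.1', '-', 'mRNA'),
]

for db, entries in group_crossrefs(sample).items():
    print(db, len(entries))
```

With a parsed record, the call would be group_crossrefs(record.cross_references), which makes it easy to look up, say, all EMBL entries without scanning the full list each time.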