Wednesday, March 18, 2015

bpy7. Writing multi-record sequences using Biopython

We write three random DNA sequences of different lengths to a FASTA file. The function rand_dna(n) returns a Seq object of length n.


The function Bio.SeqIO.write writes the sequences to a file and returns the number of records written. We pass a list here, but any iterable of SeqRecord objects, such as a generator, works too, and so does a single SeqRecord.
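As a rough illustration of the generator and single-record cases, here is a minimal sketch. The helper rand_record and the ids are made up for this example, the standard random module stands in for NumPy, and an in-memory StringIO handle is used instead of a real file so that nothing is written to disk.

```python
import random
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


def rand_record(n, name):
    """Hypothetical helper: a random DNA SeqRecord of length n with the given id."""
    bases = "".join(random.choice("ACGT") for _ in range(n))
    return SeqRecord(Seq(bases), id=name)


# A generator expression: each record is only created when SeqIO.write asks for it.
lengths = [5, 8, 12]
record_gen = (rand_record(n, "seq%d" % i) for i, n in enumerate(lengths, start=1))

gen_handle = StringIO()  # in-memory stand-in for a file opened in text mode
gen_count = SeqIO.write(record_gen, gen_handle, "fasta")
print("%d records written from a generator." % gen_count)

# A single SeqRecord can be passed directly, without wrapping it in a list.
single_handle = StringIO()
single_count = SeqIO.write(rand_record(10, "solo"), single_handle, "fasta")
print("%d record written on its own." % single_count)
```

A generator is mostly useful when the records are too many to hold in memory at once; for three short sequences a list is just as good.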


If you open the created file, bpy7.fna, you will see "<unknown description>" in the header line of all three records, since we did not specify a description for any of them.

# bpy7.py
from __future__ import print_function
from Bio.SeqRecord import SeqRecord
from Bio.Seq import Seq
from Bio import SeqIO
import numpy as np

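def rand_dna(n):
    """Return a random DNA sequence of length n as a Seq object."""
    dna = ['A', 'C', 'T', 'G']
    seq = np.random.choice(dna, n)
    return Seq("".join(seq))

# Build three records of different lengths; only the id is set.
record1 = SeqRecord(rand_dna(5), id="seq1")
record2 = SeqRecord(rand_dna(8), id="seq2")
record3 = SeqRecord(rand_dna(12), id="seq3")
records = [record1, record2, record3]

# SeqIO.write returns the number of records written.
count = SeqIO.write(records, 'bpy7.fna', 'fasta')
print("%d records have been written." % count)
for record in records:
    print('id = %s' % record.id)
    print('seq = %s' % record.seq)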

#3 records have been written.
#id = seq1
#seq = GTGGC
#id = seq2
#seq = ATTAGTTA
#id = seq3
#seq = AAAGCCAGGTCC
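
To avoid the unknown description in the header lines, a description can be passed when the record is created. The sketch below is only an illustration: the description text is made up, the sequences are fixed rather than random so the result is reproducible, and the records are written to an in-memory StringIO handle instead of bpy7.fna.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# One record with an explicit (made-up) description, one with an empty description.
described = SeqRecord(Seq("GTGGC"), id="seq1", description="random DNA, length 5")
bare = SeqRecord(Seq("ATTAGTTA"), id="seq2", description="")

handle = StringIO()  # in-memory stand-in for the output file
SeqIO.write([described, bare], handle, "fasta")

# The FASTA text: each header line is ">" + id, followed by the description if any.
fasta_text = handle.getvalue()
print(fasta_text)
```

With these records the first header line should read ">seq1 random DNA, length 5" and the second just ">seq2", since an empty description leaves only the id.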
