Calculate protein average molecular weight using multiple fasta headers in one txt file
This is what I have so far:
AminoDict={
    'A':89.09,
    'R':174.20,
    'N':132.12,
    'D':133.10,
    'C':121.15,
    'Q':146.15,
    'E':147.13,
    'G':75.07,
    'H':155.16,
    'I':131.17,
    'L':131.17,
    'K':146.19,
    'M':149.21,
    'F':165.19,
    'P':115.13,
    'S':105.09,
    'T':119.12,
    'W':204.23,
    'Y':181.19,
    'V':117.15,
    'X':0.0,
    '-':0.0,
    '*':0.0 }
import re
sequence = open(raw_input('Enter amino acid sequence: '), 'r')
headers = 0
seq = []
aa_seq = []
AveMolWeight=0
for line in sequence:
    if line.startswith('>'):
        headers = headers + 1
    else:
        seq.append(headers)
        line=line.strip('\r\n')
        aa_seq.append(seq)
cA = aa_seq.count('A')
cR = aa_seq.count('R')
cN = aa_seq.count('N')
cD = aa_seq.count('D')
cC = aa_seq.count('C')
cQ = aa_seq.count('Q')
cE = aa_seq.count('E')
cG = aa_seq.count('G')
cH = aa_seq.count('H')
cI = aa_seq.count('I')
cL = aa_seq.count('L')
cK = aa_seq.count('K')
cM = aa_seq.count('M')
cF = aa_seq.count('F')
cP = aa_seq.count('P')
cS = aa_seq.count('S')
cT = aa_seq.count('T')
cW = aa_seq.count('W')
cV = aa_seq.count('V')
cY = aa_seq.count('Y')
total = cA + cR + cN + cD + cC + cQ + cE + cG + cH + cI + cL + cK + cM + cF + cP + cS + cT + cW + cV + cY
for rando in aa_seq:
    AveMolWeight = AveMolWeight + float(total*110)
print headers
print "Molecular Weight by Average: %.1f" % (AveMolWeight)
It outputs this:
`4242
Molecular Weight by Average: 0.0`
The problem is that the input file has multiple FASTA headers. I can write code that handles a single protein sequence with one header, but I can't figure out how to process each protein sequence individually and output the header and molecular weight of each.
python beginner
asked 9 mins ago
GregSmith
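One possible approach, offered as a minimal sketch under stated assumptions rather than a definitive implementation: group the lines by header first, then compute a weight for each record. The 0.0 in the output above most likely comes from calling `count('A')` on a list whose elements are other lists rather than residue letters, so every count is zero. The names `read_fasta`, `weight_by_average`, `weight_by_composition`, `report`, `AMINO_MASS` and `WATER` are hypothetical and chosen only for illustration. The sketch assumes that characters missing from the table (such as `X`, `-` and `*`) should be ignored, that the 110 Da per residue figure from the question is the intended "average" estimate, and that the table values are free amino acid masses, so one water (about 18.015 Da) is subtracted per peptide bond in the composition-based variant.

```python
# Average mass of one water molecule (Da); assumed lost per peptide bond.
WATER = 18.015

# Free amino acid masses copied from AminoDict in the question.
# 'X', '-' and '*' are left out on purpose so they are simply skipped.
AMINO_MASS = {
    'A': 89.09, 'R': 174.20, 'N': 132.12, 'D': 133.10, 'C': 121.15,
    'Q': 146.15, 'E': 147.13, 'G': 75.07, 'H': 155.16, 'I': 131.17,
    'L': 131.17, 'K': 146.19, 'M': 149.21, 'F': 165.19, 'P': 115.13,
    'S': 105.09, 'T': 119.12, 'W': 204.23, 'Y': 181.19, 'V': 117.15,
}


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, ''.join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    # Emit the last record once the input is exhausted.
    if header is not None:
        yield header, ''.join(chunks)


def weight_by_average(seq, per_residue=110.0):
    """Estimate the weight as (number of known residues) * per_residue."""
    n = sum(1 for aa in seq if aa in AMINO_MASS)
    return n * per_residue


def weight_by_composition(seq):
    """Sum the table masses and subtract one water per peptide bond."""
    masses = [AMINO_MASS[aa] for aa in seq if aa in AMINO_MASS]
    if not masses:
        return 0.0
    return sum(masses) - (len(masses) - 1) * WATER


def report(lines):
    """Return a list of (header, average_estimate, composition_weight)."""
    return [(h, weight_by_average(s), weight_by_composition(s))
            for h, s in read_fasta(lines)]


# Small in-memory example; a file handle can be passed the same way.
example = [">seq1 test", "GA", "G", ">seq2", "MK*"]
for header, avg, comp in report(example):
    print("%s\t%.1f\t%.1f" % (header, avg, comp))
```

To run it on a real file, something like `with open(path) as handle: rows = report(handle)` should work, where `path` is whatever the user enters. The single-argument `print(...)` form is used so the sketch runs under both Python 2, which the question uses, and Python 3.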