Calculate protein average molecular weight using multiple fasta headers in one txt file























This is what I have so far:



    AminoDict = {
        'A': 89.09,
        'R': 174.20,
        'N': 132.12,
        'D': 133.10,
        'C': 121.15,
        'Q': 146.15,
        'E': 147.13,
        'G': 75.07,
        'H': 155.16,
        'I': 131.17,
        'L': 131.17,
        'K': 146.19,
        'M': 149.21,
        'F': 165.19,
        'P': 115.13,
        'S': 105.09,
        'T': 119.12,
        'W': 204.23,
        'Y': 181.19,
        'V': 117.15,
        'X': 0.0,
        '-': 0.0,
        '*': 0.0}
    import re
    sequence = open(raw_input('Enter amino acid sequence: '), 'r')
    headers = 0
    seq = []
    aa_seq = []
    AveMolWeight = 0
    for line in sequence:
        if line.startswith('>'):
            headers = headers + 1
        else:
            seq.append(headers)
            line = line.strip('\r\n')
            aa_seq.append(seq)
    cA = aa_seq.count('A')
    cR = aa_seq.count('R')
    cN = aa_seq.count('N')
    cD = aa_seq.count('D')
    cC = aa_seq.count('C')
    cQ = aa_seq.count('Q')
    cE = aa_seq.count('E')
    cG = aa_seq.count('G')
    cH = aa_seq.count('H')
    cI = aa_seq.count('I')
    cL = aa_seq.count('L')
    cK = aa_seq.count('K')
    cM = aa_seq.count('M')
    cF = aa_seq.count('F')
    cP = aa_seq.count('P')
    cS = aa_seq.count('S')
    cT = aa_seq.count('T')
    cW = aa_seq.count('W')
    cV = aa_seq.count('V')
    cY = aa_seq.count('Y')
    total = cA + cR + cN + cD + cC + cQ + cE + cG + cH + cI + cL + cK + cM + cF + cP + cS + cT + cW + cV + cY
    for rando in aa_seq:
        AveMolWeight = AveMolWeight + float(total*110)
    print headers
    print "Molecular Weight by Average: %.1f" % (AveMolWeight)


It outputs this:



`4242
Molecular Weight by Average: 0.0`


The problem is that the input file contains multiple FASTA headers. I can write code that handles a single protein sequence with one header, but I can't figure out how to process each protein sequence individually and output the header and molecular weight of each one.
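To make the goal concrete, here is a minimal sketch of the per-record behaviour I am after, written for Python 3 rather than the Python 2 used above. The names `read_fasta`, `average_weight` and `summed_weight` are hypothetical helpers made up for illustration. The 110 per residue figure and the mass table come from my code above; the water correction of 18.015 per peptide bond in `summed_weight` is my assumption about how the table of free amino acid masses would have to be used, and may not be the intended method.

```python
import io

# Free amino acid masses copied from AminoDict in the question.
AMINO_MASS = {
    'A': 89.09, 'R': 174.20, 'N': 132.12, 'D': 133.10, 'C': 121.15,
    'Q': 146.15, 'E': 147.13, 'G': 75.07, 'H': 155.16, 'I': 131.17,
    'L': 131.17, 'K': 146.19, 'M': 149.21, 'F': 165.19, 'P': 115.13,
    'S': 105.09, 'T': 119.12, 'W': 204.23, 'Y': 181.19, 'V': 117.15,
    'X': 0.0, '-': 0.0, '*': 0.0,
}
AVERAGE_RESIDUE_MASS = 110.0  # rough per-residue figure used in the question
WATER = 18.015                # assumption: one water lost per peptide bond


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, ''.join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        yield header, ''.join(chunks)


def count_residues(seq):
    """Count letters that have a non-zero mass in AMINO_MASS."""
    return sum(1 for aa in seq if AMINO_MASS.get(aa, 0.0) > 0)


def average_weight(seq):
    """Estimate: number of residues times 110."""
    return count_residues(seq) * AVERAGE_RESIDUE_MASS


def summed_weight(seq):
    """Sum of table masses minus one water per peptide bond (assumption)."""
    masses = [AMINO_MASS[aa] for aa in seq if AMINO_MASS.get(aa, 0.0) > 0]
    if not masses:
        return 0.0
    return sum(masses) - WATER * (len(masses) - 1)


if __name__ == '__main__':
    # Tiny made-up sample standing in for the real input file.
    sample = io.StringIO(">protein_1 example\nAG\nA\n>protein_2\nGG*\n")
    for header, seq in read_fasta(sample):
        print("%s\tMolecular Weight by Average: %.1f" % (header, average_weight(seq)))
```

With a real file, the `io.StringIO` sample would be replaced by `open(path)` inside a `with` block. Each sequence is collected under its own header as a single string, so the counting works per protein instead of over the whole file.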









      python beginner





      asked 9 mins ago









      GregSmith
