Run length encoding in Python












4















I'm trying to write a simple Python algorithm to solve this problem. Can you please help me figure out why my code is not working?



Problem:




If any character is repeated more than 4 times, the entire set of
repeated characters should be replaced with a slash '/', followed by a
2-digit number which is the length of this run of repeated characters,
and the character. For example, "aaaaa" would be encoded as "/05a".
Runs of 4 or less characters should not be replaced since performing
the encoding would not decrease the length of the string.




My Code:



def runLengthEncode (plainText):
    res=''
    a=''
    for i in plainText:
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>4:
                res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
                a=i
    return(res)









migrated from cs.stackexchange.com Sep 22 '13 at 20:21


This question came from our site for students, researchers and practitioners of computer science.














  • 4





    Aside: You may want to look at collections.Counter and itertools.groupby. Both are very effective ways to count repeats in sequences.

    – kojiro
    Sep 22 '13 at 20:26











  • What should happen if it repeats more than 99 times?

    – Samy Bencherif
    Sep 22 '13 at 20:46













  • By the way, this code will fail if the input is something like /05c because it has no concept of escaping input correctly. You either need to handle / specially, or compress every occurrence, not just runs of length > 4.

    – Konrad Rudolph
    Sep 22 '13 at 20:46
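
The problem statement does not say what to do about the two concerns raised in the comments above (runs longer than 99 characters and a literal / in the input). What follows is a minimal sketch of one possible policy, which is an assumption and not part of the original assignment: long runs are split into chunks of at most 99, and every run of / is encoded, even a short one, so that a decoder can treat each slash as the start of a token. The names encode_safe, decode_safe and MAX_RUN are hypothetical.

```python
from itertools import groupby

MAX_RUN = 99  # assumption: largest count that fits in two digits


def encode_safe(text):
    """Encode runs as /NNc; always encode '/' so decoding is unambiguous."""
    out = []
    for char, group in groupby(text):
        remaining = sum(1 for _ in group)  # length of this run
        while remaining > 0:
            chunk = min(remaining, MAX_RUN)
            if chunk > 4 or char == "/":
                out.append("/{:02d}{}".format(chunk, char))
            else:
                out.append(char * chunk)
            remaining -= chunk
    return "".join(out)


def decode_safe(encoded):
    """Invert encode_safe: every '/' starts a 4-character /NNc token."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == "/":
            count = int(encoded[i + 1:i + 3])
            out.append(encoded[i + 3] * count)
            i += 4
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)
```

With this policy the input /05c is encoded as /01/05c, which decodes back to /05c. The cost is that a lone slash grows from one character to four.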


















python string






edited Sep 22 '13 at 20:23 – Gilles

asked Sep 22 '13 at 20:20 – user10281











8 Answers
4














Aside from setting a=i after encoding a sequence and setting a width for your int when it is printed into the string, you could also do the following, which takes advantage of Python's groupby. It's also a good idea to use format when constructing strings.



from itertools import groupby

def runLengthEncode (plainText):
    # Collect the output pieces in a list and join them once at the end.
    res = []

    for k,i in groupby(plainText):
        run = list(i)
        if(len(run) > 4):
            res.append("/{:02}{}".format(len(run), k))
        else:
            res.extend(run)

    return "".join(res)





  • 2





    +1, but if the plainText is long you're adding up a string over and over, which is costly because strings are immutable. You could nest a generator to make it faster and still very easy to read. But since I can't put formatted code in a comment, here's a one-liner: ''.join('/{:02}{}'.format(n,c) if n>4 else c*n for c,n in [(c,len(list(g))) for c,g in groupby(t)])

    – kojiro
    Sep 22 '13 at 20:44











  • @kojiro Good idea, you are right. I implemented what you suggested in multiline format.

    – Serdalis
    Sep 22 '13 at 20:56
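
The one-liner from the comment above is easier to follow when spread over several lines. This is a sketch only, assuming the same /NNc output format as the question; the name run_length_encode is illustrative and not taken from the thread.

```python
from itertools import groupby


def run_length_encode(plain_text):
    # Lazily produce one (character, run length) pair per run.
    runs = ((char, sum(1 for _ in group)) for char, group in groupby(plain_text))
    # Build the result with a single join instead of repeated string +=.
    return "".join(
        "/{:02d}{}".format(n, char) if n > 4 else char * n
        for char, n in runs
    )
```

Because both stages are generators, no intermediate list of runs is built before the final join.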



















2














Just observe the behaviour:



>>> runLengthEncode("abcd")
'abc'


Last character is ignored. You have to append what you've collected.



>>> runLengthEncode("abbbbbcd")
'a/5b/5b'


Oops, problem after encoding. You should set a=i even if you found a long enough sequence.
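
Applying both fixes described above to the code from the question could look like the sketch below. The helper flush is a hypothetical name introduced here so that the same encoding step serves both the loop and the final run. The zero-padded two-digit count follows the problem statement.

```python
def runLengthEncode(plainText):
    res = ''
    a = ''  # the run collected so far

    def flush(run):
        # Encode a finished run, zero-padding the count to two digits.
        if len(run) > 4:
            return "/{:02d}{}".format(len(run), run[0])
        return run

    for i in plainText:
        if a and a[0] == i:
            a += i
        else:
            res += flush(a)
            a = i  # always start a new run, even after encoding one
    res += flush(a)  # append the final run, which the original code dropped
    return res
```

With these changes runLengthEncode("abcd") returns 'abcd' and runLengthEncode("abbbbbcd") returns 'a/05bcd'.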






    2














Rosetta Code has a lot of implementations that should be easily adaptable to your use case.

Here is Python code with regular expressions:

from re import sub

def encode(text):
    '''
    Doctest:
        >>> encode('WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW')
        '12W1B12W3B24W1B14W'
    '''
    # (.)\1* matches one character followed by any repeats of it.
    return sub(r'(.)\1*', lambda m: str(len(m.group(0))) + m.group(1),
               text)

def decode(text):
    '''
    Doctest:
        >>> decode('12W1B12W3B24W1B14W')
        'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW'
    '''
    # (\d+)(\D) matches a count followed by a single non-digit character.
    return sub(r'(\d+)(\D)', lambda m: m.group(2) * int(m.group(1)),
               text)

textin = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW"
assert decode(encode(textin)) == textin





    • 1





      Could you please explain how it works?

      – Johnny Well
      Nov 5 '16 at 21:59






    • 1





      (.)\1* matches a character and then 0 or more of the same. That is, it matches all the runs, so the substitute function can encode them. For decoding, (\d+)(\D) matches a number followed by something that is not a digit, e.g. 12W.

      – Thomas Ahle
      Nov 5 '16 at 22:13











    • Nice, I did a solution based on yours, but changed the lambda function a little bit by adding an if to output things like B instead of 1B.

      – Johnny Well
      Nov 7 '16 at 12:44
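
The same regular-expression idea can be adapted to the format in the question, where only runs longer than 4 are replaced. This is a sketch under that assumption; the name encode_runs is hypothetical, and runs longer than 99 are not handled.

```python
import re


def encode_runs(text):
    # (.)\1{4,} matches a character followed by at least four repeats,
    # i.e. a run of five or more; shorter runs are left untouched.
    return re.sub(
        r'(.)\1{4,}',
        lambda m: '/{:02d}{}'.format(len(m.group(0)), m.group(1)),
        text,
        flags=re.DOTALL,  # let '.' match newline characters as well
    )
```

Since short runs never match the pattern, no extra if is needed inside the lambda.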





















    1














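You can use the groupby() function combined with a generator expression:

from itertools import groupby

# s is the input string. Short runs are written out in full as x * reps.
# itertools.imap exists only in Python 2, so a nested generator expression is used here.
''.join(x * reps if reps <= 4 else "/%02d%s" % (reps, x) for x, reps in ((k, len(list(g))) for k, g in groupby(s)))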





    • >>> import this

      – sedot
      Oct 9 '18 at 14:08



















    0














    I know this is not the most efficient solution, but we haven't studied functions like groupby() yet so here's what I did:



def runLengthEncode (plainText):
    res=''
    a=''
    count = 0
    for i in plainText:
        count+=1
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>4:
                if len(a)<10:
                    res+="/0"+str(len(a))+a[0][:1]
                else:
                    res+="/" + str(len(a)) + a[0][:1]
                a=i
            else:
                res+=a
                a=i
        if count == len(plainText):
            if len(a)>4:
                if len(a)<10:
                    res+="/0"+str(len(a))+a[0][:1]
                else:
                    res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
    return(res)





      0














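An easy solution to run-length encoding which I can think of:

For encoding into a string like "a4b5c6d7...":

def encode(s):
    # Note: s.count(c) counts every occurrence of c in s, not just one
    # consecutive run, so this assumes each character forms a single run.
    counts = {}
    for c in s:
        if counts.get(c) is None:
            counts[c] = s.count(c)
    return "".join(k+str(v) for k,v in counts.items())

For decoding back into a string like "aaaaaabbbdddddccccc....":

def decode(s):
    # Reads (character, count) pairs, so each count must be a single digit.
    return "".join((map(lambda tup: tup[0] * int(tup[1]), zip(s[0:len(s):2], s[1:len(s):2]))))

Fairly easy to read and simple.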






        0














text=input("Please enter the string to encode")
encoded=[]
index=0
amount=1
while index<=(len(text)-1):
    # A run ends at the last character or when the next character differs.
    if index==(len(text)-1) or text[index]!=text[(index+1)]:
        encoded.append((text[index],amount))
        amount=1
    else:
        amount=amount+1
    index=index+1
print(encoded)
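
The loop above prints a list of (character, count) pairs rather than the string the question asks for. As a sketch, the same index-based scan can be wrapped in a function and followed by a formatting step that applies the question's rule. The names run_lengths and format_runs are hypothetical.

```python
def run_lengths(text):
    """Return a list of (character, run length) pairs, scanning by index."""
    pairs = []
    amount = 1
    for index in range(len(text)):
        # A run ends at the last character or when the next one differs.
        if index == len(text) - 1 or text[index] != text[index + 1]:
            pairs.append((text[index], amount))
            amount = 1
        else:
            amount += 1
    return pairs


def format_runs(pairs):
    # Apply the question's rule: only runs longer than 4 are replaced.
    return "".join(
        "/{:02d}{}".format(n, char) if n > 4 else char * n
        for char, n in pairs
    )
```

Keeping the scan and the formatting separate means the pair list can still be printed or inspected on its own.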





          0














Split=(list(input("Enter string: ")))
Split.append("")
a = 0
for i in range(len(Split)):
    try:
        if (Split[i] in Split) >0:
            a = a + 1
        if Split[i] != Split[i+1]:
            print(Split[i],a)
            a = 0
    except IndexError:
        print()


This is much easier and works every time.






edited Sep 22 '13 at 20:54

answered Sep 22 '13 at 20:36 – Serdalis








answered Sep 22 '13 at 20:30 – Karoly Horvath























answered Feb 4 '14 at 21:19 – Thomas Ahle








                    • 1





                      Could you please explain how it works?

                      – Johnny Well
                      Nov 5 '16 at 21:59






                    • (.)\1* matches a character followed by zero or more copies of the same character. That is, it matches every run, so the substitution function can encode each one. For decoding, (\d+)(\D) matches a number followed by a non-digit character, e.g. 12w.

                      – Thomas Ahle
                      Nov 5 '16 at 22:13
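To make the explanation above concrete, here is a small sketch that lists what the two patterns match. The helper name show_runs and the sample strings are illustrative assumptions; only re.finditer and re.findall from the standard library are used, with the backslashes written out in full.

```python
import re

def show_runs(text):
    # (.)\1* : one character captured as group 1, then zero or more repeats of it
    return [(m.group(1), len(m.group(0))) for m in re.finditer(r'(.)\1*', text)]

print(show_runs('WWWBWW'))
# [('W', 3), ('B', 1), ('W', 2)]

# (\d+)(\D) : a run length (one or more digits) followed by a non-digit character
print(re.findall(r'(\d+)(\D)', '12W1B3W'))
# [('12', 'W'), ('1', 'B'), ('3', 'W')]
```

Each tuple in the first list is one run, which is exactly what the substitution lambda in encode turns into a count followed by the character.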











                    • Nice, I did a solution based on yours, but changed the lambda function a little bit by adding an if to output things like B instead of 1B.

                      – Johnny Well
                      Nov 7 '16 at 12:44
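The modified lambda mentioned in this comment was not posted, so the following is only a guess at what such a variant might look like. The names encode_compact and decode_compact are hypothetical, and the decoder is an added assumption: once single characters carry no count, the count has to become optional when decoding. As with the original answer, this assumes the input text contains no digits.

```python
from re import sub

def encode_compact(text):
    # Emit the count only when the run is longer than one character
    return sub(r'(.)\1*',
               lambda m: (str(len(m.group(0))) if len(m.group(0)) > 1 else '') + m.group(1),
               text)

def decode_compact(text):
    # The count is optional here; a missing count means a run of length 1
    return sub(r'(\d*)(\D)', lambda m: m.group(2) * int(m.group(1) or 1), text)

print(encode_compact('WWWBWW'))
# 3WB2W
print(decode_compact('3WB2W'))
# WWWBWW
```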
















                    You can use the groupby() function combined with a list/generator comprehension:



                    from itertools import groupby

                    # s is the input string; runs of 4 or fewer are copied through as x * reps
                    ''.join(x * reps if reps <= 4 else "/%02d%s" % (reps, x)
                            for x, reps in ((k, len(list(g))) for k, g in groupby(s)))
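For readers who find the one-liner dense, here is the same idea spelled out as a sketch. The function name rle_slash and the sample input are illustrative assumptions; groupby is the real itertools function, which yields one (key, group iterator) pair per run of consecutive equal items.

```python
from itertools import groupby

def rle_slash(s):
    parts = []
    for ch, group in groupby(s):
        reps = len(list(group))          # consume the group iterator to count the run
        if reps <= 4:
            parts.append(ch * reps)      # short runs are copied through unchanged
        else:
            parts.append("/%02d%s" % (reps, ch))  # slash, 2-digit length, character
    return ''.join(parts)

print(rle_slash("a" * 5 + "bbb" + "c" * 12))
# /05abbb/12c
```

Note that "%02d" pads to at least two digits, so a run longer than 99 would produce three digits; the problem statement does not say what should happen in that case, and this sketch does not try to handle it.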






                    answered Sep 22 '13 at 21:18









                    pkacprzak













                    • >>> import this

                      – sedot
                      Oct 9 '18 at 14:08



















                        I know this is not the most efficient solution, but we haven't studied functions like groupby() yet so here's what I did:



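                        def runLengthEncode (plainText):
                            res=''
                            a=''          # the current run of identical characters
                            count = 0     # number of characters processed so far
                            for i in plainText:
                                count+=1
                                if a.count(i)>0:
                                    a+=i
                                else:
                                    # the run ended: emit it, then start a new run with i
                                    if len(a)>4:
                                        if len(a)<10:
                                            res+="/0"+str(len(a))+a[0][:1]
                                        else:
                                            res+="/" + str(len(a)) + a[0][:1]
                                        a=i
                                    else:
                                        res+=a
                                        a=i
                                # after the last character, emit the final run as well
                                if count == len(plainText):
                                    if len(a)>4:
                                        if len(a)<10:
                                            res+="/0"+str(len(a))+a[0][:1]
                                        else:
                                            res+="/" + str(len(a)) + a[0][:1]
                                    else:
                                        res+=a
                            return(res)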
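The run-emitting logic appears twice in the function above. As a sketch of one way to avoid that without groupby(), the emit step can be moved into a helper that is called when a run ends and once more after the loop. The names flush and run_length_encode are hypothetical, and runs longer than 99 are assumed not to occur, since the problem statement only defines a 2-digit length.

```python
def flush(run):
    """Encode one run of identical characters per the problem statement."""
    if len(run) > 4:
        return "/%02d%s" % (len(run), run[0])
    return run  # runs of 4 or fewer (including the empty run) are left as-is

def run_length_encode(plain_text):
    res = ''
    run = ''
    for ch in plain_text:
        if run and ch != run[0]:
            res += flush(run)   # the previous run just ended
            run = ''
        run += ch
    return res + flush(run)     # emit the final run

print(run_length_encode("a" * 5 + "bbb" + "c" * 12))
# /05abbb/12c
```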






                        answered Sep 23 '13 at 1:13







                        user786033






























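                                An easy solution to run-length encoding which I can think of:

                                To encode a string into the form "a4b5c6d7...":

                                def encode(s):
                                    # Caveat: s.count(c) counts every occurrence of c in s, not the
                                    # length of one consecutive run, so this only round-trips when
                                    # each character appears in a single run.
                                    counts = {}
                                    for c in s:
                                        if counts.get(c) is None:
                                            counts[c] = s.count(c)
                                    return "".join(k+str(v) for k,v in counts.items())

                                To decode it back into a string like "aaaaaabbbdddddccccc....":

                                def decode(s):
                                    # Reads fixed (character, digit) pairs, so every count must be a single digit.
                                    return "".join((map(lambda tup: tup[0] * int(tup[1]), zip(s[0:len(s):2], s[1:len(s):2]))))

                                Fairly easy to read and simple.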
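Because encode() above uses s.count(c), an input such as "aabaa" becomes "a4b1" and decodes to "aaaab". The following is a sketch of a run-based variant in the same character-then-count format; it is not the answer author's code. The names encode_runs and decode_runs are hypothetical, and the input is assumed to contain no digits, since digits would be indistinguishable from counts.

```python
from itertools import groupby
import re

def encode_runs(s):
    # One "character + run length" pair per consecutive run
    return "".join(ch + str(len(list(group))) for ch, group in groupby(s))

def decode_runs(encoded):
    # A non-digit character followed by one or more digits (multi-digit counts allowed)
    return "".join(ch * int(n) for ch, n in re.findall(r"(\D)(\d+)", encoded))

print(encode_runs("aabaa"))
# a2b1a2
print(decode_runs("a2b1a2"))
# aabaa
```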







                                answered Nov 2 '18 at 18:22









                                Ashwin Joshi























                                        text=input("Please enter the string to encode")
                                        encoded=[]   # list of (character, run length) pairs
                                        index=0
                                        amount=1
                                        while index<=(len(text)-1):
                                            # a run ends at the last character or when the next character differs
                                            if index==(len(text)-1) or text[index]!=text[(index+1)]:
                                                encoded.append((text[index],amount))
                                                amount=1
                                            else:
                                                amount=amount+1
                                            index=index+1
                                        print(encoded)






                                        edited Nov 24 '18 at 21:08

























                                        answered Nov 24 '18 at 20:45









                                        Christopher McGeough























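                                                Split=(list(input("Enter string: ")))
                                                Split.append("")   # sentinel so the last run is followed by a different item
                                                a = 0
                                                for i in range(len(Split)):
                                                    try:
                                                        # always true, since Split[i] is taken from Split
                                                        if (Split[i] in Split) >0:
                                                            a = a + 1
                                                        if Split[i] != Split[i+1]:
                                                            print(Split[i],a)
                                                            a = 0
                                                    except IndexError:
                                                        # reached the sentinel; there is no Split[i+1]
                                                        print()


                                                This is much easier and works every time.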







                                                answered Jan 24 at 17:20









                                                Tom Smith





























