Run length encoding in Python

I'm trying to write a simple Python algorithm to solve this problem. Can you please help me figure out why my code is not working?

Problem:

If any character is repeated more than 4 times, the entire set of
repeated characters should be replaced with a slash '/', followed by a
2-digit number which is the length of this run of repeated characters,
and the character. For example, "aaaaa" would be encoded as "/05a".
Runs of 4 or less characters should not be replaced since performing
the encoding would not decrease the length of the string.

My Code:

def runLengthEncode (plainText):
    res=''
    a=''
    for i in plainText:
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>4:
                res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
                a=i
    return(res)

edited Sep 22 '13 at 20:23 by Gilles

asked Sep 22 '13 at 20:20 by user10281

migrated from cs.stackexchange.com Sep 22 '13 at 20:21

This question came from our site for students, researchers and practitioners of computer science.

4

Aside: You may want to look at collections.Counter and itertools.groupby. Both are very effective ways to count repeats in sequences.

– kojiro
Sep 22 '13 at 20:26

What should happen if it repeats more than 99 times?

– Samy Bencherif
Sep 22 '13 at 20:46

By the way, this code will fail if the input is something like /05c because it has no concept of escaping input correctly. You either need to handle / specially, or compress every occurrence, not just runs of length > 4.

– Konrad Rudolph
Sep 22 '13 at 20:46





python string


8 Answers

Aside from setting a=i after encoding a sequence and setting a width for your int when it is formatted into the string, you could also do the following, which takes advantage of Python's groupby. It's also a good idea to use format when constructing strings.

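from itertools import groupby

def runLengthEncode (plainText):
    res = []
    for k,i in groupby(plainText):
        run = list(i)
        if(len(run) > 4):
            # runs longer than 4 become "/NNc" with a zero-padded two-digit count
            res.append("/{:02}{}".format(len(run), k))
        else:
            # shorter runs are copied through unchanged
            res.extend(run)
    return "".join(res)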
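To check an encoder like the one above, a matching decoder is handy. The following is only a minimal sketch, not part of the original answer: `run_length_decode` is a hypothetical name, and it assumes the original text never contains a literal "/" (the escaping problem raised in the comments on the question) and that every count is exactly two digits.

```python
import re

def run_length_decode(encoded):
    # Expand every "/NNc" token (slash, two digits, one character) into a run.
    # Assumption: the original text contained no literal "/".
    return re.sub(r"/(\d{2})(.)",
                  lambda m: m.group(2) * int(m.group(1)),
                  encoded,
                  flags=re.DOTALL)

print(run_length_decode("a/05bcd"))  # abbbbbcd
```

Round-tripping a few strings through the encoder and this decoder is a quick way to catch mistakes such as a dropped final run.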

edited Sep 22 '13 at 20:54

answered Sep 22 '13 at 20:36 by Serdalis

2

+1, but if the plainText is long you're adding up a string over and over, which is costly because strings are immutable. You could nest a generator to make it faster and still very easy to read. But since I can't put formatted code in a comment, here's a one-liner: ''.join('/{:02}{}'.format(n,c) if n>4 else c*n for c,n in [(c,len(list(g))) for c,g in groupby(t)])

– kojiro
Sep 22 '13 at 20:44

@kojiro Good idea, you are right. I implemented what you suggested in multiline format.

– Serdalis
Sep 22 '13 at 20:56


Just observe the behaviour:

>>> runLengthEncode("abcd")
'abc'

Last character is ignored. You have to append what you've collected.

>>> runLengthEncode("abbbbbcd")
'a/5b/5b'

Oops, problem after encoding. You should set a=i even if you found a long enough sequence.
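Putting both observations together, a corrected loop might look like the sketch below. It is one possible fix rather than the asker's final code: the names `encode_run` and `run_length_encode` are made up for illustration, and the zero-padded two-digit count comes from the problem statement ("/05a").

```python
def encode_run(run):
    # Runs longer than 4 become "/NNc"; shorter runs are copied unchanged.
    if len(run) > 4:
        return "/{:02}{}".format(len(run), run[0])
    return run

def run_length_encode(plain_text):
    res = ''
    run = ''  # the run of identical characters collected so far
    for ch in plain_text:
        if run and run[0] == ch:
            run += ch
        else:
            # Emit the finished run, then always start a new one with ch.
            res += encode_run(run)
            run = ch
    # Do not forget the run that is still pending when the loop ends.
    return res + encode_run(run)

print(run_length_encode("abcd"))      # abcd
print(run_length_encode("abbbbbcd"))  # a/05bcd
```

Moving the encoding of a single run into its own helper means the same code handles both the mid-string case and the final pending run.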

answered Sep 22 '13 at 20:30 by Karoly Horvath

Rosetta Code has a lot of implementations that should be easy to adapt to your use case.

Here is Python code with regular expressions:

from re import sub

def encode(text):
    '''
    Doctest:
        >>> encode('WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW')
        '12W1B12W3B24W1B14W'
    '''
    # (.)\1* matches one character followed by any further repeats of it, i.e. one run
    return sub(r'(.)\1*', lambda m: str(len(m.group(0))) + m.group(1),
               text)

def decode(text):
    '''
    Doctest:
        >>> decode('12W1B12W3B24W1B14W')
        'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW'
    '''
    # (\d+)(\D) matches a count followed by the single non-digit character it applies to
    return sub(r'(\d+)(\D)', lambda m: m.group(2) * int(m.group(1)),
               text)

textin = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW"
assert decode(encode(textin)) == textin
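The same backreference idea can be adapted to the format asked for in the question, where only runs longer than 4 are replaced. This is a hedged sketch rather than the Rosetta Code version: `encode_long_runs` is a hypothetical name, and the "/NNc" output format is taken from the problem statement.

```python
import re

def encode_long_runs(text):
    def replace_run(m):
        run = m.group(0)  # the whole run, e.g. "WWWWW"
        if len(run) > 4:
            # slash, zero-padded two-digit length, then the repeated character
            return "/{:02}{}".format(len(run), m.group(1))
        return run        # short runs stay as they are
    # (.)\1* matches a character followed by zero or more copies of itself.
    return re.sub(r"(.)\1*", replace_run, text, flags=re.DOTALL)

print(encode_long_runs("WWWWWWWWWWWWBWWW"))  # /12WBWWW
```

Using a named function instead of a lambda leaves room for the length check without making the substitution call hard to read.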

answered Feb 4 '14 at 21:19 by Thomas Ahle

1

Could you please explain how it works?

– Johnny Well
Nov 5 '16 at 21:59

1

(.)\1* matches a character and then 0 or more of the same. That is, it matches all the runs, so the substitute function can encode them. For decoding, (\d+)(\D) matches a number and something not a number. E.g. 12w.

– Thomas Ahle
Nov 5 '16 at 22:13

Nice, I did a solution based on yours, but changed the lambda function a little bit by adding an if to output things like B instead of 1B.

– Johnny Well
Nov 7 '16 at 12:44


You can use the groupby() function combined with a list/generator comprehension:

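from itertools import groupby

# s is the string to encode. In Python 3 the built-in map is lazy, so itertools.imap (Python 2 only) is not needed.
# Short runs are emitted as x * reps so that no characters are dropped.
''.join(x * reps if reps <= 4 else "/%02d%s" % (reps, x) for x, reps in map(lambda x: (x[0], len(list(x[1]))), groupby(s)))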
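The two-digit count used here cannot represent a run longer than 99 characters, which one of the comments on the question asks about. The question does not say what should happen in that case, so the following is only one possible convention and purely an assumption: split a long run into chunks of at most 99. `MAX_RUN` and `encode_chunked` are hypothetical names.

```python
from itertools import groupby

MAX_RUN = 99  # largest count that fits in two digits

def encode_chunked(text):
    parts = []
    for ch, group in groupby(text):
        n = sum(1 for _ in group)  # length of this run
        while n > 0:
            chunk = min(n, MAX_RUN)
            if chunk > 4:
                parts.append("/{:02}{}".format(chunk, ch))
            else:
                parts.append(ch * chunk)  # leftovers of 4 or fewer stay literal
            n -= chunk
    return "".join(parts)

print(encode_chunked("a" * 104))  # /99a/05a
```

A decoder that expands each "/NNc" token independently would still reproduce the original string, because adjacent chunks of the same character simply concatenate.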

answered Sep 22 '13 at 21:18 by pkacprzak

>>> import this

– sedot
Oct 9 '18 at 14:08


I know this is not the most efficient solution, but we haven't studied functions like groupby() yet so here's what I did:

def runLengthEncode (plainText):
    res=''
    a=''
    count = 0
    for i in plainText:
        count+=1
        if a.count(i)>0:
            a+=i
        else:
            if len(a)>4:
                if len(a)<10:
                    res+="/0"+str(len(a))+a[0][:1]
                else:
                    res+="/" + str(len(a)) + a[0][:1]
                a=i
            else:
                res+=a
                a=i
        if count == len(plainText):
            if len(a)>4:
                if len(a)<10:
                    res+="/0"+str(len(a))+a[0][:1]
                else:
                    res+="/" + str(len(a)) + a[0][:1]
            else:
                res+=a
    return(res)

answered Sep 23 '13 at 1:13 by user786033

An easy solution to run-length encoding which I can think of:

Encoding, which produces a string like "a4b5c6d7...":

def encode(s):
    counts = {}
    for c in s:
        if counts.get(c) is None:
            counts[c] = s.count(c)
    return "".join(k+str(v) for k,v in counts.items())

Decoding, which produces a string like "aaaaaabbbdddddccccc....":

def decode(s):
    return "".join((map(lambda tup:  tup[0] * int(tup[1]), zip(s[0:len(s):2], s[1:len(s):2]))))

Fairly easy to read and simple.
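Note that encode above counts every occurrence of a character in the whole string rather than each consecutive run, and decode reads exactly one digit per count, so the pair only round-trips when each character appears in a single run of at most 9. A run-based variant of the same "character then count" format could be sketched as follows; `encode_runs` and `decode_runs` are hypothetical names, and the sketch assumes the text itself contains no digits.

```python
import re
from itertools import groupby

def encode_runs(s):
    # Count each consecutive run separately, e.g. "aabaa" -> "a2b1a2".
    return "".join("{}{}".format(ch, sum(1 for _ in g)) for ch, g in groupby(s))

def decode_runs(s):
    # One non-digit character followed by its (possibly multi-digit) count.
    return "".join(ch * int(n) for ch, n in re.findall(r"(\D)(\d+)", s))

print(encode_runs("aabaa"))   # a2b1a2
print(decode_runs("a2b1a2"))  # aabaa
```

Because groupby only groups adjacent equal items, the order and position of every run is preserved, which a per-character total cannot do.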

answered Nov 2 '18 at 18:22 by Ashwin Joshi

text=input("Please enter the string to encode")
encoded=[]
index=0
amount=1
while index<=(len(text)-1):
  # a run ends at the last character or when the next character differs
  if index==(len(text)-1) or text[index]!=text[(index+1)]:
    encoded.append((text[index],amount))
    amount=1
  else:
    amount=amount+1
  index=index+1
# encoded is a list of (character, run length) tuples
print(encoded)

edited Nov 24 '18 at 21:08

answered Nov 24 '18 at 20:45 by Christopher McGeough

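Split=(list(input("Enter string: ")))
# the empty string is a sentinel so the last run is compared against something
Split.append("")
a = 0
for i in range(len(Split)):
    try:
        # this test is always true, so a counts every character of the current run
        if (Split[i] in Split) >0:
            a = a + 1
        if Split[i] != Split[i+1]:
            print(Split[i],a)
            a = 0
    except IndexError:
        # reached the sentinel: there is no next element to compare with
        print()

This is much easier and works every time.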

answered Jan 24 at 17:20 by Tom Smith







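A simple run-length scheme that writes each run as the character followed by its count:

Encoding a string like "aaaabbbbbcccccc" into "a4b5c6":

def encode(s):
    runs = []  # [char, run_length] pairs, in order of appearance
    for c in s:
        if runs and runs[-1][0] == c:
            runs[-1][1] += 1       # same character: extend the current run
        else:
            runs.append([c, 1])    # different character: start a new run
    return "".join(k + str(v) for k, v in runs)

Decoding a string like "a4b5c6" back into "aaaabbbbbcccccc":

import re

def decode(s):
    # Each pair is one non-digit character followed by its count, so
    # multi-digit counts work; the text itself must not contain digits.
    return "".join(ch * int(n) for ch, n in re.findall(r"(\D)(\d+)", s))

Note that this always emits a count, so it does not follow the "/05a" format from the question.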

answered Nov 2 '18 at 18:22

Ashwin Joshi
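
One thing to watch with any counting-based encoder: str.count returns the number of occurrences in the whole string, whereas run-length encoding needs the length of each consecutive run. A small illustration on a made-up input, using itertools.groupby for the per-run view:

```python
from itertools import groupby

text = "aabaa"  # hypothetical input with the same letter in two separate runs

# Whole-string totals: 'a' is counted across both of its runs.
total_counts = {c: text.count(c) for c in text}

# Per-run lengths: each consecutive group is counted on its own.
run_counts = [(c, len(list(g))) for c, g in groupby(text)]

print(total_counts)  # {'a': 4, 'b': 1}
print(run_counts)    # [('a', 2), ('b', 1), ('a', 2)]
```

Only the per-run view keeps enough information to rebuild "aabaa"; the totals would decode to "aaaab".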


text=input("Please enter the string to encode")
encoded=[]  # list of (character, run length) tuples
index=0
amount=1
while index<=(len(text)-1):
  # A run ends at the last character or when the next character differs.
  if index==(len(text)-1) or text[index]!=text[(index+1)]:
    encoded.append((text[index],amount))
    amount=1
  else:
    amount=amount+1
  index=index+1
print(encoded)

edited Nov 24 '18 at 21:08

answered Nov 24 '18 at 20:45

Christopher McGeough
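
A list of (character, count) tuples like the one printed above still has to be turned into the string format the question asks for. A minimal sketch of that last step, assuming counts never exceed 99; the helper name format_runs is hypothetical:

```python
def format_runs(pairs):
    """Turn (character, count) pairs into the question's '/NNc' format."""
    out = []
    for char, amount in pairs:
        if amount > 4:
            # Long run: slash, zero-padded 2-digit count, then the character.
            out.append("/%02d%s" % (amount, char))
        else:
            # Runs of 4 or fewer are copied unchanged.
            out.append(char * amount)
    return "".join(out)

print(format_runs([("a", 5), ("b", 2)]))  # /05abb
```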


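Split=(list(input("Enter string: ")))
Split.append("")  # sentinel so the last real character has a neighbour to compare with
a = 0
for i in range(len(Split) - 1):
    a = a + 1  # count the current character
    if Split[i] != Split[i+1]:
        # the run ends here: print the character and its length
        print(Split[i],a)
        a = 0

This is simpler and also handles the last run, thanks to the empty-string sentinel.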

answered Jan 24 at 17:20

Tom Smith
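
The sentinel idea is easier to check when it is wrapped in a function that returns the runs instead of printing them. A sketch under that assumption; the name run_lengths is made up for illustration, and None is used as the sentinel because it never equals a real character:

```python
def run_lengths(text):
    """Return a list of (character, run length) tuples for text."""
    chars = list(text) + [None]  # sentinel that never equals a real character
    runs = []
    count = 0
    for i in range(len(text)):
        count += 1
        if chars[i] != chars[i + 1]:
            # The next element differs, so the current run ends here.
            runs.append((chars[i], count))
            count = 0
    return runs

print(run_lengths("aaabcc"))  # [('a', 3), ('b', 1), ('c', 2)]
```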

