Run length encoding in Python
I'm trying to write a simple Python algorithm to solve this problem. Can you please help me figure out why my code is not working?
Problem:
If any character is repeated more than 4 times, the entire set of
repeated characters should be replaced with a slash '/', followed by a
2-digit number which is the length of this run of repeated characters,
and the character. For example, "aaaaa" would be encoded as "/05a".
Runs of 4 or less characters should not be replaced since performing
the encoding would not decrease the length of the string.
My Code:
    def runLengthEncode (plainText):
        res=''
        a=''
        for i in plainText:
            if a.count(i)>0:
                a+=i
            else:
                if len(a)>4:
                    res+="/" + str(len(a)) + a[0][:1]
                else:
                    res+=a
                    a=i
        return(res)
python string
migrated from cs.stackexchange.com Sep 22 '13 at 20:21
This question came from our site for students, researchers and practitioners of computer science.
Aside: You may want to look at collections.Counter and itertools.groupby. Both are very effective ways to count repeats in sequences.
– kojiro
Sep 22 '13 at 20:26
What should happen if it repeats more than 99 times?
– Samy Bencherif
Sep 22 '13 at 20:46
By the way, this code will fail if the input is something like /05c because it has no concept of escaping input correctly. You either need to handle / specially, or compress every occurrence, not just runs of length > 4.
– Konrad Rudolph
Sep 22 '13 at 20:46
edited Sep 22 '13 at 20:23 by Gilles
asked Sep 22 '13 at 20:20 by user10281
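The two edge cases raised in the comments (runs longer than 99, and input that already contains '/') are not covered by the problem statement. One possible policy, sketched here purely as an assumption and not as the required behaviour, is to split long runs into chunks of at most 99 and to always encode '/' characters so that a literal slash never appears unescaped. The name `encode_safe` and the `max_run` parameter are hypothetical.

```python
from itertools import groupby

def encode_safe(text, max_run=99):
    """Hypothetical variant: chunk long runs and always escape '/'."""
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        while n > 0:
            chunk = min(n, max_run)
            # Encode long runs, and every run of '/' so the marker stays unambiguous.
            if chunk > 4 or ch == "/":
                out.append("/{:02}{}".format(chunk, ch))
            else:
                out.append(ch * chunk)
            n -= chunk
    return "".join(out)
```

With this policy, a run of 100 'a' characters becomes "/99a" followed by a plain "a", and the input "/05c" becomes "/01/05c".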
8 Answers
Aside from setting a=i after encoding a sequence and setting a width for your int when it is printed into the string, you could also do the following, which takes advantage of Python's groupby. It's also a good idea to use format when constructing strings.
    from itertools import groupby

    def runLengthEncode(plainText):
        res = []
        for k, i in groupby(plainText):
            run = list(i)
            if len(run) > 4:
                res.append("/{:02}{}".format(len(run), k))
            else:
                res.extend(run)
        return "".join(res)
+1, but if the plainText is long you're adding up a string over and over, which is costly because strings are immutable. You could nest a generator to make it faster and still very easy to read. But since I can't put formatted code in a comment, here's a one-liner: ''.join('/{:02}{}'.format(n,c) if n>4 else c*n for c,n in [(c,len(list(g))) for c,g in groupby(t)])
– kojiro
Sep 22 '13 at 20:44
@kojiro Good idea, you are right. I implemented what you suggested in multiline format.
– Serdalis
Sep 22 '13 at 20:56
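To check an encoder for this format, a matching decoder is handy. The question does not ask for one, so the following is only a sketch: `runLengthDecode` is a hypothetical name, and it assumes the text contained no literal '/' before it was encoded.

```python
import re

def runLengthDecode(encoded):
    """Expand each '/NNc' token back into NN copies of c."""
    # '/' + exactly two digits + one character (DOTALL so c may be a newline).
    return re.sub(
        r"/(\d{2})(.)",
        lambda m: m.group(2) * int(m.group(1)),
        encoded,
        flags=re.DOTALL,
    )
```

Round-tripping a few strings through the encoder and this decoder is a quick way to catch mistakes such as a dropped final run.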
Just observe the behaviour:
    >>> runLengthEncode("abcd")
    'abc'
The last character is ignored. You have to append what you've collected.
    >>> runLengthEncode("abbbbbcd")
    'a/5b/5b'
Oops, problem after encoding. You should set a=i even if you found a long enough sequence.
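Applying both observations to the code from the question, and zero-padding the count to two digits as the problem statement asks, gives something like the sketch below. It keeps the original variable names; `flush` is a hypothetical helper introduced here, and runs longer than 99 are not handled.

```python
def runLengthEncode(plainText):
    res = ''
    a = ''  # the run collected so far

    def flush(run):
        # Encode runs longer than 4 as '/NNc'; pass shorter runs through unchanged.
        if len(run) > 4:
            return "/{:02d}{}".format(len(run), run[0])
        return run

    for i in plainText:
        if a and a[0] == i:
            a += i           # same character: extend the current run
        else:
            res += flush(a)  # different character: emit the finished run
            a = i            # always start a new run, even after a long one
    res += flush(a)          # append what was collected for the last run
    return res
```

With these changes the two inputs above come out as 'abcd' and 'a/05bcd'.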
Rosetta Code has a lot of implementations that should be easily adaptable to your use case.
Here is Python code with regular expressions:
    from re import sub

    def encode(text):
        '''
        Doctest:
            >>> encode('WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW')
            '12W1B12W3B24W1B14W'
        '''
        return sub(r'(.)\1*', lambda m: str(len(m.group(0))) + m.group(1),
                   text)

    def decode(text):
        '''
        Doctest:
            >>> decode('12W1B12W3B24W1B14W')
            'WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW'
        '''
        return sub(r'(\d+)(\D)', lambda m: m.group(2) * int(m.group(1)),
                   text)

    textin = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW"
    assert decode(encode(textin)) == textin
Could you please explain how it works?
– Johnny Well
Nov 5 '16 at 21:59
(.)\1* matches a character and then 0 or more of the same. That is, it matches all the runs, so the substitute function can encode them. For decoding, (\d+)(\D) matches a number and something not a number. E.g. 12w.
– Thomas Ahle
Nov 5 '16 at 22:13
Nice, I did a solution based on yours, but changed the lambda function a little bit by adding an if to output things like B instead of 1B.
– Johnny Well
Nov 7 '16 at 12:44
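The variation described in the last comment (emit B instead of 1B) could look roughly like this. It is a guess at what the commenter did, not their actual code; `encode_compact` and `decode_compact` are hypothetical names. Decoding such output needs a pattern that accepts a missing count, and both directions assume the original text contains no digits.

```python
from re import sub

def encode_compact(text):
    """Like encode(), but a run of length 1 is written without its count."""
    def repl(m):
        n = len(m.group(0))
        return m.group(1) if n == 1 else str(n) + m.group(1)
    # (.)\1* matches one character followed by zero or more copies of it.
    return sub(r'(.)\1*', repl, text)

def decode_compact(text):
    """Inverse of encode_compact(); assumes the original text had no digits."""
    # (\d*) allows an absent count, which is treated as 1.
    return sub(r'(\d*)(\D)', lambda m: m.group(2) * int(m.group(1) or 1), text)
```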
You can use the groupby() function combined with a list/generator comprehension, where s is the input string:
    from itertools import groupby, imap
    ''.join(x * reps if reps <= 4 else "/%02d%s" % (reps, x) for x, reps in imap(lambda x: (x[0], len(list(x[1]))), groupby(s)))
>>> import this
– sedot
Oct 9 '18 at 14:08
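itertools.imap exists only in Python 2. On Python 3 the same idea can be written with a plain generator expression, for example as below. The wrapper name `rle` is hypothetical, and `s` is the input string as in the snippet above.

```python
from itertools import groupby

def rle(s):
    # Pair each character with the length of its run.
    runs = ((ch, sum(1 for _ in grp)) for ch, grp in groupby(s))
    # Short runs are emitted in full (ch * reps); long runs become '/NNc'.
    return ''.join(
        ch * reps if reps <= 4 else "/%02d%s" % (reps, ch)
        for ch, reps in runs
    )
```

Counting with `sum(1 for _ in grp)` avoids building a temporary list for every run, which is a small design choice rather than a requirement.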
I know this is not the most efficient solution, but we haven't studied functions like groupby() yet, so here's what I did:
    def runLengthEncode (plainText):
        res=''
        a=''
        count = 0
        for i in plainText:
            count+=1
            if a.count(i)>0:
                a+=i
            else:
                if len(a)>4:
                    if len(a)<10:
                        res+="/0"+str(len(a))+a[0][:1]
                    else:
                        res+="/" + str(len(a)) + a[0][:1]
                    a=i
                else:
                    res+=a
                    a=i
            if count == len(plainText):
                if len(a)>4:
                    if len(a)<10:
                        res+="/0"+str(len(a))+a[0][:1]
                    else:
                        res+="/" + str(len(a)) + a[0][:1]
                else:
                    res+=a
        return(res)
An easy solution to run-length encoding which I can think of:
For encoding into a string like "a4b5c6d7...":
    def encode(s):
        counts = {}
        for c in s:
            if counts.get(c) is None:
                counts[c] = s.count(c)
        return "".join(k+str(v) for k,v in counts.items())
For decoding back into a string like "aaaaaabbbdddddccccc....":
    def decode(s):
        return "".join((map(lambda tup: tup[0] * int(tup[1]), zip(s[0:len(s):2], s[1:len(s):2]))))
Fairly easy to read and simple.
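Note that s.count(c) counts every occurrence of c in the whole string, so the encoder above only round-trips when each character forms a single run, and the decoder assumes one-digit counts. A run-based sketch of the same "a4b5..." format might look like this; `encode_runs` and `decode_runs` are hypothetical names, and the text itself is assumed to contain no digits.

```python
import re
from itertools import groupby

def encode_runs(s):
    # One "<char><count>" token per consecutive run, in order of appearance.
    return "".join(ch + str(len(list(grp))) for ch, grp in groupby(s))

def decode_runs(encoded):
    # Each token is one non-digit character followed by its decimal count.
    return "".join(ch * int(n) for ch, n in re.findall(r"(\D)(\d+)", encoded))
```

Because each run gets its own token, a character that appears in two separate runs (as in "aaaabbbbbaa") is encoded twice instead of being merged.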
    text=input("Please enter the string to encode")
    encoded=[]
    index=0
    amount=1
    while index<=(len(text)-1):
        if index==(len(text)-1) or text[index]!=text[(index+1)]:
            encoded.append((text[index],amount))
            amount=1
        else:
            amount=amount+1
        index=index+1
    print(encoded)
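The loop above produces a list of (character, count) pairs rather than an encoded string. Turning such a list into the format from the question could be done along these lines; `pairs_to_string` is a hypothetical helper, not part of the answer.

```python
def pairs_to_string(pairs):
    """Render (char, count) pairs in the '/NNc' format from the question."""
    parts = []
    for ch, amount in pairs:
        if amount > 4:
            parts.append("/{:02d}{}".format(amount, ch))
        else:
            parts.append(ch * amount)  # short runs stay as plain text
    return "".join(parts)
```

For the input "abbbbbcd" the loop yields [('a', 1), ('b', 5), ('c', 1), ('d', 1)], which this helper renders as "a/05bcd".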
    Split=(list(input("Enter string: ")))
    Split.append("")
    a = 0
    for i in range(len(Split)):
        try:
            if (Split[i] in Split) >0:
                a = a + 1
            if Split[i] != Split[i+1]:
                print(Split[i],a)
                a = 0
        except IndexError:
            print()
This is much easier and works every time.
answered Sep 22 '13 at 20:36 by Serdalis, edited Sep 22 '13 at 20:54
answered Sep 22 '13 at 20:30 by Karoly Horvath
answered Feb 4 '14 at 21:19 by Thomas Ahle
answered Sep 22 '13 at 21:18 by pkacprzak
answered Sep 23 '13 at 1:13 by user786033
answered Nov 2 '18 at 18:22 by Ashwin Joshi
answered Nov 24 '18 at 20:45 by Christopher McGeough, edited Nov 24 '18 at 21:08
answered Jan 24 at 17:20 by Tom Smith