How to apply a function to multiple columns to create multiple columns in Pandas?
I am trying to apply a function to multiple columns and, in turn, create multiple new columns that count the number of elements in each entry.
I have 5 columns at positions 5, 7, 9, 13 and 15, and each entry in those columns is a string of the form 'WrappedArray(|2008-11-12, |2008-11-12)'.
In my function I try to strip the WrappedArray part, split the values and return (length - 1), using the following:
    def updates(row, num_col):
        strp = row[num_col].strip('WrappedAway')
        lis = list(strp.split(','))
        return len(lis) - 1
where num_col is the index of the column and can take the values 5, 7, 9, 13 and 15.
I have done this, but only for one column:
    fn = lambda row: updates(row, 5)
    col = df.apply(fn, axis=1)
    df = df.assign(**{'count1': col.values})
I want to apply this function to ALL of the columns at the positions mentioned (not just 5 as above) and create a separate count column for each of columns 5, 7, 9, 13 and 15, in short code rather than repeating the steps for each value.
I hope that makes sense.
python pandas
asked Nov 26 '18 at 10:43 by Tanya Gupta (edited Nov 26 '18 at 10:54)
Could you create an example of your input data with desired output?
– zipa
Nov 26 '18 at 10:53
3 Answers
add a comment |
Your Answer
StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53479377%2fhow-to-apply-a-function-to-multiple-columns-to-create-multiple-columns-in-pandas%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
To count the number of elements in each list, it looks like you could simply use str.count() to count the ',' characters in the strings. To apply a function to a set of columns, you could do something like:
    cols = [5, 7, 9, 13, 15]
    for col in cols:
        col_counts = {'{}_count'.format(col): df.iloc[:, col].apply(lambda x: x.count(','))}
        df = df.assign(**col_counts)
Alternatively, you can also use strip('WrappedAway').split(',') as you were doing:
    def count_elements(x):
        return len(x.strip('WrappedAway').split(',')) - 1
    for col in cols:
        col_counts = {'{}_count'.format(col):
                      df.iloc[:, col].apply(count_elements)}
        df = df.assign(**col_counts)
So, for example, with the following dataframe:
    import pandas as pd
    df = pd.DataFrame({'A': ['WrappedArray(|2008-11-12, |2008-11-12, |2008-10-11)', 'WrappedArray(|2008-11-12, |2008-11-12)'],
                       'B': ['WrappedArray(|2008-11-12,|2008-11-12)', 'WrappedArray(|2008-11-12, |2008-11-12)'],
                       'C': ['WrappedArray(|2008-11-12|2008-11-12)', 'WrappedArray(|2008-11-12|2008-11-12)']})
and redefining the set of columns whose elements we want to count:
    for col in [0, 1, 2]:
        col_counts = {'{}_count'.format(col):
                      df.iloc[:, col].apply(count_elements)}
        df = df.assign(**col_counts)
this would yield:
                                                       A
    0  WrappedArray(|2008-11-12, |2008-11-12, |2008-1...
    1             WrappedArray(|2008-11-12, |2008-11-12)
                                            B
    0   WrappedArray(|2008-11-12,|2008-11-12)
    1  WrappedArray(|2008-11-12, |2008-11-12)
                                          C  0_count  1_count  2_count
    0  WrappedArray(|2008-11-12|2008-11-12)        2        1        0
    1  WrappedArray(|2008-11-12|2008-11-12)        1        1        0
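As a minimal sketch (not part of the original answer), the loop can also be collapsed into a single assign call by building every new column in one dict comprehension. The two-column frame and the positions list here are stand-ins for the asker's real data, where the positions would be [5, 7, 9, 13, 15]. Note that str.strip treats its argument as a set of characters, which is why the misspelled 'WrappedAway' still removes the 'WrappedArray' prefix.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame.
df = pd.DataFrame({
    'A': ['WrappedArray(|2008-11-12, |2008-11-12, |2008-10-11)',
          'WrappedArray(|2008-11-12, |2008-11-12)'],
    'B': ['WrappedArray(|2008-11-12,|2008-11-12)',
          'WrappedArray(|2008-11-12, |2008-11-12)'],
})

def count_elements(x):
    # strip() removes leading/trailing characters from the given set,
    # then the comma-separated pieces are counted, minus one.
    return len(x.strip('WrappedAway').split(',')) - 1

positions = [0, 1]  # assumption: replace with [5, 7, 9, 13, 15] on the real data

# Build all '<position>_count' columns at once, then attach them in one call.
new_cols = {'{}_count'.format(pos): df.iloc[:, pos].apply(count_elements)
            for pos in positions}
result = df.assign(**new_cols)
print(result[['0_count', '1_count']])
```

Building the dict first keeps the original frame untouched until the final assign, which can be easier to debug than reassigning df inside a loop.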
answered Nov 26 '18 at 11:09 by yatu (edited Nov 26 '18 at 12:20)
Yay, that worked! But I had to take the + 1 out because it gave me wrong numbers, and taking it out gave me exactly what I was looking for. Can you please explain the line that defines col_counts a little bit more?
– Tanya Gupta
Nov 26 '18 at 11:46
'{}_count'.format(col) is a string formatter, very useful when it comes to building strings in a for loop. And with df.iloc[:,col].apply(lambda x: x.count(',')) you are applying the count of ',' to the columns of interest. Glad it helped :-)
– yatu
Nov 26 '18 at 12:05
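To make the explanation above concrete, here is a small sketch of what each piece of the col_counts line evaluates to. The toy frame and the variable names name and counts are made up for illustration only.

```python
import pandas as pd

# Toy data, invented for this illustration.
df = pd.DataFrame({'A': ['a,b,c', 'a,b'], 'B': ['x', 'x,y']})

col = 1
name = '{}_count'.format(col)  # the placeholder {} is filled with col
# Select the column at position `col` and count commas in each string.
counts = df.iloc[:, col].apply(lambda x: x.count(','))
# ** unpacks the dict so that its key becomes the new column's name.
df = df.assign(**{name: counts})
print(name)
print(df)
```

The dict is needed because a name such as 1_count is not a valid Python keyword argument, so it cannot be written directly as df.assign(1_count=...).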
Btw, I added an edit to also do it with strip('WrappedAway').split(',') as you were doing.
– yatu
Nov 26 '18 at 12:21
If you have more questions, find me on Twitter; my user name is in my profile. Hope that helped.
– yatu
Nov 26 '18 at 12:30
ah perfect. Thank you!
– Tanya Gupta
Nov 26 '18 at 14:51
You are confusing row-wise and column-wise operations by trying to do both in one function. Choose one or the other. Column-wise operations are usually more efficient, and you can use the Pandas str methods.
Setup
    import pandas as pd
    df = pd.DataFrame({'A': ['WrappedArray(|2008-11-12, |2008-11-12, |2008-10-11)', 'WrappedArray(|2008-11-12, |2008-11-12)'],
                       'B': ['WrappedArray(|2008-11-12,|2008-11-12)', 'WrappedArray(|2008-11-12|2008-11-12)']})
Logic
    # perform operations on the strings in a series
    def calc_length(series):
        return series.str.strip('WrappedAway').str.split(',').str.len() - 1
    # apply to each column and join to the original dataframe
    df = df.join(df.apply(calc_length).add_suffix('_Length'))
Result
    print(df)
                                                       A
    0  WrappedArray(|2008-11-12, |2008-11-12, |2008-1...
    1             WrappedArray(|2008-11-12, |2008-11-12)
                                           B  A_Length  B_Length
    0  WrappedArray(|2008-11-12,|2008-11-12)         2         1
    1   WrappedArray(|2008-11-12|2008-11-12)         1         0
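The example above applies calc_length to every column. As a hedged sketch of how it might be restricted to a chosen subset, the frame can first be sliced by position with iloc. The extra id column and the positions list are assumptions made for illustration; on the asker's data the positions would be [5, 7, 9, 13, 15].

```python
import pandas as pd

# Hypothetical frame with one column ('id') that should NOT be processed.
df = pd.DataFrame({
    'id': [1, 2],
    'A': ['WrappedArray(|2008-11-12, |2008-11-12, |2008-10-11)',
          'WrappedArray(|2008-11-12, |2008-11-12)'],
    'B': ['WrappedArray(|2008-11-12,|2008-11-12)',
          'WrappedArray(|2008-11-12|2008-11-12)'],
})

def calc_length(series):
    # Vectorised string operations on a whole column at once.
    return series.str.strip('WrappedAway').str.split(',').str.len() - 1

positions = [1, 2]  # assumption: [5, 7, 9, 13, 15] on the real data

# Slice the wanted columns by position, compute lengths, join the results back.
lengths = df.iloc[:, positions].apply(calc_length).add_suffix('_Length')
df = df.join(lengths)
print(df[['id', 'A_Length', 'B_Length']])
```

Slicing before apply also avoids calling the .str accessor on non-string columns such as id, which would raise an error.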
answered Nov 26 '18 at 10:53 by jpp
I think we can use the pandas str.count() method, which counts the commas in each string of a column:
    import pandas as pd
    df = pd.DataFrame({
        "col1": ['WrappedArray(|2008-11-12, |2008-11-12)',
                 'WrappedArray(|2018-11-12, |2017-11-12, |2018-11-12)'],
        "col2": ['WrappedArray(|2008-11-12, |2008-11-12,|2008-11-12,|2008-11-12)',
                 'WrappedArray(|2018-11-12, |2017-11-12, |2018-11-12)']})
    df["col1"].str.count(',')
answered Nov 26 '18 at 11:00 by erncyp