Pandas conditional filter
I have a dataframe
       A      B      C
0   True   True   True
1   True  False  False
2  False  False  False
I would like to add a column D with the following conditions:
D is true, if A, B and C are true. Else, D is false.
I tried
df['D'] = df.loc[(df['A'] == True) & df['B'] == True & df['C'] == True]
I get
TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]
Then I tried to follow this example and wrote a similar function as suggested in the link:
def all_true(row):
    if row['A'] == True:
        if row['B'] == True:
            if row['C'] == True:
                val = True
            else:
                val = 0
    return val
df['D'] = df.apply(all_true(df), axis=1)
In which case I get
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I'd appreciate suggestions. Thanks!
python pandas
asked Nov 23 '18 at 6:43 by Meeep
3 Answers
A simpler option is DataFrame.all along axis=1:
df['D'] = df.all(axis=1)
print(df) then shows:
       A      B      C      D
0   True   True   True   True
1   True  False  False  False
2  False  False  False  False
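As a self-contained illustration of the row-wise reduction above, here is a minimal sketch that rebuilds the frame from the question. Selecting the columns explicitly is my own choice, not part of the answer; it keeps the result the same if the line is re-run after D already exists.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "A": [True, True, False],
    "B": [True, False, False],
    "C": [True, False, False],
})

# Reduce across the columns (axis=1): True only where every value in the row is True
df["D"] = df[["A", "B", "C"]].all(axis=1)

print(df)
```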
Did the trick. Thanks
– Meeep
Nov 23 '18 at 12:07
@Meeep Happy to help, :-), 😊😊😊
– U9-Forward
Nov 24 '18 at 23:45
Comparing with True is not necessary; just chain the boolean masks with &:
df['D'] = df['A'] & df['B'] & df['C']
If performance is important, work on the underlying NumPy arrays:
df['D'] = df['A'].values & df['B'].values & df['C'].values
Or use DataFrame.all to check that all values in each row are True:
df['D'] = df[['A','B','C']].all(axis=1)
# NumPy alternative:
# df['D'] = np.all(df.values, axis=1)
print(df)
       A      B      C      D
0   True   True   True   True
1   True  False  False  False
2  False  False  False  False
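For context on why the two attempts in the question fail, here is a hedged sketch of minimally corrected versions. In Python, & binds more tightly than ==, so each comparison needs its own parentheses; df.loc[mask] returns the matching rows rather than a boolean column; and apply expects the function itself, not the result of calling it on the whole frame. The simplified all_true body is my own rewrite, not the asker's code.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "A": [True, True, False],
    "B": [True, False, False],
    "C": [True, False, False],
})

# 1) Parenthesize every comparison so that & combines three boolean Series.
#    (The "== True" is redundant for boolean columns; kept to mirror the question.)
mask = (df["A"] == True) & (df["B"] == True) & (df["C"] == True)

# 2) Pass the function object to apply; pandas then calls it once per row.
def all_true(row):
    # row is a Series holding a single row, so each lookup is a scalar
    return bool(row["A"] and row["B"] and row["C"])

by_apply = df.apply(all_true, axis=1)

# Assign the mask itself, not df.loc[mask]
df["D"] = mask
print(df)
```

The row-wise apply is mainly useful for seeing what went wrong; the vectorized mask is the form the answers recommend.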
Performance:
import numpy as np
import pandas as pd
import perfplot  # third-party benchmarking helper: github.com/nschloe/perfplot

np.random.seed(125)

def all1(df):
    df['D'] = df.all(axis=1)
    return df

def all1_numpy(df):
    df['D'] = np.all(df.values, axis=1)
    return df

def eval1(df):
    df['D'] = df.eval('A & B & C')
    return df

def chained(df):
    df['D'] = df['A'] & df['B'] & df['C']
    return df

def chained_numpy(df):
    df['D'] = df['A'].values & df['B'].values & df['C'].values
    return df

def make_df(n):
    # n rows of random booleans in columns A, B and C
    df = pd.DataFrame({'A': np.random.choice([True, False], size=n),
                       'B': np.random.choice([True, False], size=n),
                       'C': np.random.choice([True, False], size=n)})
    return df

perfplot.show(
    setup=make_df,
    kernels=[all1, all1_numpy, eval1, chained, chained_numpy],
    n_range=[2**k for k in range(2, 25)],
    logx=True,
    logy=True,
    equality_check=False,
    xlabel='len(df)')
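If perfplot is not installed, a rough comparison can be sketched with the standard-library timeit module instead. The frame size, repeat count, and dictionary labels below are arbitrary choices of mine, and the timings depend on the machine and the pandas version, so no numbers are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(125)
n = 100_000  # arbitrary size chosen for illustration
df = pd.DataFrame(rng.integers(0, 2, size=(n, 3)).astype(bool), columns=list("ABC"))

# Each candidate returns the would-be D column without mutating df
candidates = {
    "DataFrame.all": lambda: df[["A", "B", "C"]].all(axis=1),
    "np.all": lambda: np.all(df[["A", "B", "C"]].to_numpy(), axis=1),
    "eval": lambda: df.eval("A & B & C"),
    "chained &": lambda: df["A"] & df["B"] & df["C"],
    "chained & (numpy)": lambda: df["A"].to_numpy() & df["B"].to_numpy() & df["C"].to_numpy(),
}

reference = candidates["chained &"]().to_numpy()
repeats = 20
for name, func in candidates.items():
    # Confirm every approach agrees before timing it
    assert np.array_equal(np.asarray(func()), reference)
    seconds = timeit.timeit(func, number=repeats) / repeats
    print(f"{name:>18}: {seconds * 1e3:.3f} ms per call")
```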
@jezrael, what is perfplot? Is it a matplotlib import? I'm learning this, and it's a good example.
– pygo
Nov 23 '18 at 8:31
No, it is a custom module (I learned about it from unutbu): github.com/nschloe/perfplot. It does use matplotlib, though.
– jezrael
Nov 23 '18 at 8:33
Using pandas eval:
df['D'] = df.eval('A & B & C')
Or, with the assignment inside the expression:
df = df.eval('D = A & B & C')
# in-place alternative: df.eval('D = A & B & C', inplace=True)
Or with NumPy:
df['D'] = np.all(df.values, axis=1)
print(df)
       A      B      C      D
0   True   True   True   True
1   True  False  False  False
2  False  False  False  False
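To make the difference between the two eval forms concrete, here is a small sketch on the question's frame. The variable names with_d and d are hypothetical, chosen only for this illustration.

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({
    "A": [True, True, False],
    "B": [True, False, False],
    "C": [True, False, False],
})

# Expression form: returns a boolean Series that you assign yourself
d = df.eval("A & B & C")

# Assignment form: returns a new DataFrame that includes D; df itself is unchanged
with_d = df.eval("D = A & B & C")

print(d.tolist())
print(with_d)
```

Because the assignment form returns a copy unless inplace=True is passed, the result has to be bound to a name, as the answer does with df = df.eval(...).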
Good Try +1 :-)
– pygo
Nov 23 '18 at 12:45
First answer: answered Nov 23 '18 at 6:45 by U9-Forward
Second answer: answered Nov 23 '18 at 6:44 by jezrael, edited Nov 23 '18 at 7:11
Third answer: answered Nov 23 '18 at 6:46 by Sandeep Kadapa, edited Nov 23 '18 at 7:05