Concatenating select columns of a pandas DataFrame, while ignoring blanks in columns
I have a data frame which looks like this.
key A1 A2 A3 BX CX DX
1 X1 Y1 B1 C1 D1
2 X2 Z2 B2 C2 D2
3 X3 B3 C3 D3
4 X4 B4 C4 D4
5 B5 C5 D5
I am trying to form a new column 'NC' by concatenating columns A1, A2 and A3. If there is no entry in a certain column, the next column's value needs to step forward. The separator can be a "," or a "_".
The final df looks like:
key A1 A2 A3 BX CX DX NC
1 X1 Y1 B1 C1 D1 X1_Y1
2 X2 Z2 B2 C2 D2 X2_Z2
3 X3 B3 C3 D3 X3
4 X4 B4 C4 D4 X4
5 B5 C5 D5
If there are no entries in A1-A3, then the entry in NC remains blank. I have looked at other posts on SO and have tried other ways, but I can't seem to get it right. The entries in the A1-A3 columns are floats, which sometimes have a trailing .0 after the number (e.g. X2.0). I also want to drop the decimal point and the 0. Hoping someone more knowledgeable can show me the way.
Edit: I changed the question to show the data type in the data frame.
key A1 A2 A3 BX CX DX
1 1.0 2.0 B1 C1 D1
2 3 4 B2 C2 D2
3 7.0 B3 C3 D3
4 5 6.0 7.0 B4 C4 D4
5 B5 C5 D5
The new df should look like:
key A1 A2 A3 BX CX DX NC
1 1.0 2.0 B1 C1 D1 1_2
2 3 4 B2 C2 D2 3_4
3 7.0 B3 C3 D3 7
4 5 6.0 7.0 B4 C4 D4 5_6_7
5 B5 C5 D5
python pandas
asked Nov 25 '18 at 1:01 by Acinonyx, edited Nov 25 '18 at 1:24
1 Answer
You can use filter to filter your columns, and agg to join:
# Extract columns
v = df.filter(like='A')
# Convert blanks to NaNs so we can call Series.dropna later.
df['NC'] = v[v.astype(bool)].agg(lambda x: '_'.join(x.dropna()), axis=1)
# Or,
# df['NC'] = v[v.astype(bool)].agg(
# lambda x: x.dropna().str.cat(sep='_'), axis=1)
print(df)
key A1 A2 A3 BX CX DX NC
0 1 X1 Y1 B1 C1 D1 X1_Y1
1 2 X2 Z2 B2 C2 D2 X2_Z2
2 3 X3 B3 C3 D3 X3
3 4 X4 B4 C4 D4 X4
4 5 B5 C5 D5
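For readers who want to reproduce this end to end, here is a minimal sketch of the same idea written with an explicit helper instead of the boolean mask. It is not the code from the answer above: the frame is a hypothetical reconstruction of the question's data (blanks are assumed to be empty strings, and which of A1-A3 holds each value is a guess), and `join_non_blank` is an illustrative name. It also selects the columns by an explicit list, since `filter(like='A')` would pick up any column whose name contains "A".

```python
import pandas as pd

# Hypothetical reconstruction of the question's frame. Blanks are assumed to
# be empty strings, and the column each value sits in is a guess.
df = pd.DataFrame({
    "key": [1, 2, 3, 4, 5],
    "A1": ["X1", "X2", "", "", ""],
    "A2": ["Y1", "", "X3", "", ""],
    "A3": ["", "Z2", "", "X4", ""],
    "BX": ["B1", "B2", "B3", "B4", "B5"],
    "CX": ["C1", "C2", "C3", "C4", "C5"],
    "DX": ["D1", "D2", "D3", "D4", "D5"],
})


def join_non_blank(row, sep="_"):
    """Join the entries of a row, skipping NaN and empty strings."""
    parts = [str(x) for x in row if pd.notna(x) and str(x) != ""]
    return sep.join(parts)


# Explicit column list (an assumption; adjust to the real column names).
cols = ["A1", "A2", "A3"]
df["NC"] = df[cols].apply(join_non_blank, axis=1)

print(df[["key", "NC"]])
```

A row with no entries in A1-A3 yields an empty string, which matches the "NC remains blank" requirement in the question.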
Can you take a look at the Q again. Changed the data type. thx
– Acinonyx
Nov 25 '18 at 1:45
@Acinonyx use '_'.join(x.dropna().astype(int).astype(str)) inside the lambda function (keeping everything else the same) instead? If that doesn't work, try something simpler: '_'.join(x.dropna().astype(str))
– coldspeed
Nov 25 '18 at 2:11
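The suggestion in the comment above can be sketched as follows for the edited, numeric version of the question. This assumes the A1-A3 columns really are float columns with NaN for the blanks (the question does not show the dtypes, and the placement of values across columns is a guess); `join_as_ints` is an illustrative name, not something from the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the edited frame: floats, with NaN for blanks.
df = pd.DataFrame({
    "key": [1, 2, 3, 4, 5],
    "A1": [1.0, 3.0, np.nan, 5.0, np.nan],
    "A2": [2.0, np.nan, 7.0, 6.0, np.nan],
    "A3": [np.nan, 4.0, np.nan, 7.0, np.nan],
})


def join_as_ints(row, sep="_"):
    """Drop NaNs, cast the remaining floats to int, then join as strings."""
    present = row.dropna()
    # astype(int) truncates, so 2.0 becomes 2; a value such as 2.5 would
    # silently become 2 as well.
    return sep.join(present.astype(int).astype(str))


df["NC"] = df[["A1", "A2", "A3"]].apply(join_as_ints, axis=1)

print(df)
```

If the columns hold numbers stored as strings rather than floats, the cast would need a conversion step first (for example `pd.to_numeric`), which is not shown here.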
answered Nov 25 '18 at 1:06 by coldspeed