pandas get dummies for column with list
Input:
empNo  name
1234   [ AB, DE ]
5678   [ FG, IJ ]
Command:
dataFrame = dataFrame.join(dataFrame.name.str.join('|').str.get_dummies().add_prefix('dummy_name_'))
The above command creates a dummy column for each character in the name column instead of one per list element.
Output:
empNo  name        dummy_name_A  dummy_name_B  dummy_name_D  dummy_name_E  dummy_name_F  dummy_name_G  dummy_name_I  dummy_name_J
1234   [ AB, DE ]  1             1             1             1             0             0             0             0
5678   [ FG, IJ ]  0             0             0             0             1             1             1             1
Expected:
empNo  name        dummy_name_AB  dummy_name_DE  dummy_name_FG  dummy_name_IJ
1234   [ AB, DE ]  1              1              0              0
5678   [ FG, IJ ]  0              0              1              1
python pandas dataframe
Strange. Can you share your dataframe like this: print(dataFrame.to_dict()) and post the result here?
– Anton vBR
Nov 25 '18 at 19:03
If it is huge, then use print(dataFrame.head(2).to_dict()) to limit it to two rows.
– Anton vBR
Nov 25 '18 at 19:35
I found the issue, Anton. I had accidentally converted the column's datatype with .astype(str). I removed that and my earlier command works fine, as you correctly pointed out. Thanks for the help.
– Jenny
Nov 25 '18 at 19:46
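To illustrate the diagnosis reached in these comments, here is a minimal sketch that runs the question's command on a column of real lists and on the same column after .astype(str). The sample frame, the helper name dummies_from_lists, and the variable names are assumptions made for illustration. The exact per-character columns depend on how the strings look, so they may differ from the output posted in the question.

```python
import pandas as pd

# Hypothetical sample data mirroring the question; "name" holds real Python lists.
df = pd.DataFrame({"empNo": [1234, 5678],
                   "name": [["AB", "DE"], ["FG", "IJ"]]})

def dummies_from_lists(frame, col="name", prefix="dummy_name_"):
    """Join each cell with '|' and let str.get_dummies split on that separator."""
    return frame[col].str.join("|").str.get_dummies().add_prefix(prefix)

# Real lists: str.join glues the list elements, so there is one dummy per element.
good = dummies_from_lists(df)
good_columns = list(good.columns)

# After astype(str) every cell is a plain string such as "['AB', 'DE']".
as_text = df.assign(name=df["name"].astype(str))
cell_types = {type(value).__name__ for value in as_text["name"]}

# On a string, str.join puts '|' between the characters, so the dummies are per character.
bad = dummies_from_lists(as_text)
bad_columns = list(bad.columns)

print(good_columns)
print(cell_types)
print(bad_columns)
```

Checking the element types first, as done with cell_types here, is a quick way to tell which of the two cases a dataframe is in.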
asked Nov 25 '18 at 18:58 by Jenny
edited Nov 25 '18 at 19:00 by Mayank Porwal
1 Answer
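I think the column does not hold actual lists but strings, so use ast to convert the string column back to lists:
import ast
import pandas as pd
# Parse each string such as "['AB', 'DE']" back into a Python list
# (skip this step if the cells are already lists; literal_eval only accepts strings)
df['name'] = df['name'].apply(ast.literal_eval)
Then use str.get_dummies:
# Spread the list elements into rows, one-hot encode them, then sum back per original row
s = df['name'].apply(pd.Series).stack().str.get_dummies().groupby(level=0).sum().add_prefix('dummy_name_')
s
   dummy_name_AB  dummy_name_DE  dummy_name_FG  dummy_name_IJ
0              1              1              0              0
1              0              0              1              1
Then combine the result with the empNo column:
pd.concat([df[['empNo']], s], axis=1)
The data input (here name holds actual lists):
df.to_dict()
{'empNo': {0: 1234, 1: 5678}, 'name': {0: ['AB', 'DE'], 1: ['FG', 'IJ']}}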
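As an alternative to the apply(pd.Series).stack() chain above, the same table can be sketched with Series.explode and pd.get_dummies. This is not the answer's own method; it assumes a pandas version that provides explode (0.25 or later) and a name column that already holds real lists. The sample frame and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data matching the to_dict() output shown in the answer.
df = pd.DataFrame({"empNo": [1234, 5678],
                   "name": [["AB", "DE"], ["FG", "IJ"]]})

# explode() yields one row per list element and repeats the original index label.
exploded = df["name"].explode()

# One-hot encode each element, then collapse back to one row per original index.
# max() keeps the values at 0/1 even if an element occurs twice in one list.
dummies = (pd.get_dummies(exploded, prefix="dummy_name", dtype=int)
             .groupby(level=0).max())

# Attach the dummy columns to the identifier column.
result = df[["empNo"]].join(dummies)
print(result)
```

Using max() rather than sum() is a design choice for indicator columns; sum() would instead count repeated elements.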
Thanks W-B. This command produces the columns dummy_name_['AB'], dummy_name_['DE'], dummy_name_['AB','DE'], dummy_name_['FG'], dummy_name_['IJ'], dummy_name_['FG','IJ'], plus one additional column, dummy_name_, for an empty list. Can you please help me achieve the expected result given above?
– Jenny
Nov 25 '18 at 19:30
@Jenny please check the data input above to see the difference between mine and yours.
– Wen-Ben
Nov 25 '18 at 19:32
I found the issue, W-B. I had accidentally converted the column's datatype with .astype(str). I removed that and my earlier command works fine, as you correctly pointed out. Thanks for the help.
– Jenny
Nov 25 '18 at 19:53
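The mismatch discussed in these comments comes down to whether each cell is a string or a list. Below is a minimal sketch of a guarded conversion that parses only the string cells. The helper name to_list and the sample Series are hypothetical, and it assumes stringified cells look like the output of .astype(str), for example "['FG', 'IJ']". A string without quotes around the items, such as "[ AB, DE ]", would not parse with ast.literal_eval.

```python
import ast
import pandas as pd

def to_list(cell):
    """Return a list: parse string cells, pass real lists through unchanged."""
    if isinstance(cell, str):
        # literal_eval safely evaluates a Python literal such as "['FG', 'IJ']".
        return ast.literal_eval(cell)
    return cell

# Hypothetical column mixing a real list with a stringified list.
mixed = pd.Series([["AB", "DE"], "['FG', 'IJ']"])

restored = mixed.map(to_list)
restored_list = restored.tolist()
print(restored_list)
```

Applying such a helper before the get_dummies step makes the pipeline indifferent to an accidental string conversion earlier on.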
answered Nov 25 '18 at 19:23 by Wen-Ben
edited Nov 25 '18 at 19:32