Python: Generate dictionary from pandas dataframe with rows as keys and columns as values












I have a dataframe that looks like this:



  Curricula Course1 Course2 Course3 ... CourseN
0        q1      c1      c2     NaN ...     NaN
1        q2     c14     c21      c1 ...     NaN
2        q3      c2     c14     NaN ...     NaN
...
M        qm      c7      c9     c21 ...


Where the number of Courses per Curricula is different.



What I need is a dictionary from this dataframe looking like this:



{'q1': 'c1', 'q1': 'c2', 'q2': 'c14', 'q2': 'c21', 'q2': 'c1', ... }


Where the row names are my keys and for each row, the dictionary is filled with all the 'Curricula': 'Course' information that is given, excluding 'NaN' values.



What I tried so far was setting the index to the 'Curricula' column, transposing the dataframe, and calling to_dict('records'), but this produced the following output:



in:



df.set_index('Curricula')
df_transposed = df.transpose()
Dic = df_transposed.to_dict('records')


out:



[{0: 'q1', 1: 'q2', 2: 'q3', ...}, {0: 'c1', 1: 'c14', 2: 'c2', ...}, ..., {0: NaN, 1: 'c1', 2: NaN}]


So the integer column labels are used as keys instead of the 'Curricula' values I want, and the NaN values are not excluded.



Does anyone have an idea how to fix that?



Best regards,
Jan










    How do you expect to have duplicate keys in your dictionary? that's not possible

    – user3483203
    Nov 25 '18 at 16:58
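
To illustrate the point of this comment, here is a minimal sketch in plain Python. The variable names (`courses`, `grouped`) and the sample pairs are made up for illustration; they are not from the question's data. It shows that a repeated key silently keeps only its last value, and that collecting values in a list per key is one way to keep them all.

```python
from collections import defaultdict

# Repeating a key in a dict literal is legal syntax,
# but each repeat overwrites the previous value for that key.
courses = {'q1': 'c1', 'q1': 'c2', 'q2': 'c14'}

# Collecting the values in a list per key keeps every course.
grouped = defaultdict(list)
for curriculum, course in [('q1', 'c1'), ('q1', 'c2'), ('q2', 'c14')]:
    grouped[curriculum].append(course)

print(courses)        # only one entry is left for 'q1'
print(dict(grouped))  # every course is kept, grouped by curriculum
```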
















python pandas dictionary dataframe todictionary






asked Nov 25 '18 at 16:51









JanB

1 Answer
Setup



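import numpy as np
import pandas as pd

# Sample frame: one row per curriculum, NaN where a course slot is unused
df = pd.DataFrame({'Curricula': {0: 'q1', 1: 'q2', 2: 'q3'},
                   'Course1': {0: 'c1', 1: 'c14', 2: 'c2'},
                   'Course2': {0: 'c2', 1: 'c21', 2: 'c14'},
                   'Course3': {0: np.nan, 1: 'c1', 2: np.nan}})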

print(df)

  Curricula Course1 Course2 Course3
0        q1      c1      c2     NaN
1        q2     c14     c21      c1
2        q3      c2     c14     NaN




A dictionary can't have duplicate keys, so the output you describe is not possible. You can, however, combine set_index, stack, and a groupby with agg(list) to collect the courses of each curriculum into a list under a single key:



df.set_index('Curricula').stack().groupby(level=0).agg(list).to_dict()




{'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}   
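
The one-liner above relies on stack discarding the NaN entries, which may depend on the pandas version in use. As a hedged alternative, the sketch below removes the NaN values explicitly with dropna. The names `courses_by_curriculum` and `pairs` are illustrative and not part of the original answer, and the sketch assumes the 'Curricula' values are unique.

```python
import numpy as np
import pandas as pd

# Same sample data as in the Setup above
df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# One list of courses per curriculum; dropna() removes the empty slots explicitly.
# Assumes each curriculum appears in exactly one row.
courses_by_curriculum = {
    curriculum: row.dropna().tolist()
    for curriculum, row in df.set_index('Curricula').iterrows()
}

# Flat list of (curriculum, course) pairs: a legal stand-in for the
# duplicate-key dictionary sketched in the question.
pairs = [
    (curriculum, course)
    for curriculum, courses in courses_by_curriculum.items()
    for course in courses
]

print(courses_by_curriculum)
print(pairs)
```

A list of pairs keeps the one-entry-per-course shape the question asked for, while the dictionary of lists is easier to query by curriculum.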





  • Total newbie = Totally forgot about duplicate keys... xD I will need some time to figure out how to work around my duplicate keys problem. Thank you very much for your answer @user3483203 ! I will go forward working with your answer :)

    – JanB
    Nov 25 '18 at 17:11











  • @JanB also, I should have mentioned: in your code, the call to set_index has no effect, because set_index does not operate in place by default, so its result is being lost. You can assign the result back to a variable, or call df.set_index('foo', inplace=True).

    – user3483203
    Nov 25 '18 at 17:13
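
The behavior described in this comment can be checked with a small sketch. The two-row frame and the names `index_unchanged`, `indexed`, and `records_by_curriculum` are assumptions made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Curricula': ['q1', 'q2'], 'Course1': ['c1', 'c14']})

# set_index returns a new DataFrame; without an assignment, df is left unchanged.
df.set_index('Curricula')
index_unchanged = list(df.index)  # still the default integer index

# Assigning the result keeps the new index.
indexed = df.set_index('Curricula')

# With the index set, orient='index' keys the result by curriculum.
records_by_curriculum = indexed.to_dict('index')

print(index_unchanged)
print(records_by_curriculum)
```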











answered Nov 25 '18 at 16:56









user3483203
