Python: Generate dictionary from pandas dataframe with rows as keys and columns as values

I have a dataframe that looks like this:

  Curricula Course1 Course2 Course3 ... CourseN
0        q1      c1      c2     NaN ...     NaN
1        q2     c14     c21      c1 ...     NaN
2        q3      c2     c14     NaN ...     NaN
...
M        qm      c7      c9     c21

The number of courses differs from one curriculum to the next.

What I need is a dictionary from this dataframe looking like this:

{'q1': 'c1', 'q1': 'c2', 'q2': 'c14', 'q2': 'c21', 'q2': 'c1', ... }

The values of the 'Curricula' column should be the keys, and for each row the dictionary should contain every 'Curricula': 'Course' pair that is present, excluding NaN values.

What I have tried so far is setting the index to the 'Curricula' column, transposing the dataframe, and calling to_dict('records'), but this resulted in the following output:

in:

df.set_index('Curricula')
df_transposed = df.transpose()
Dic = df_transposed.to_dict('records')

out:

[{0: 'q1', 1: 'q2', 2: 'q3', ... }, {0: 'c1', 1: 'c14', 2: 'c2', ...}, ... {0: NaN, 1: 'c1', 2: NaN}]

So here the integer column labels are used as keys instead of the 'Curricula' column values I wanted, and the NaN values are not excluded.

Does anyone have an idea how to fix that?

Best regards,
Jan

asked Nov 25 '18 at 16:51 by JanB

How do you expect to have duplicate keys in your dictionary? That's not possible.

– user3483203
Nov 25 '18 at 16:58
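
To illustrate the point made in this comment, here is a minimal sketch of what Python does with a repeated key in a dict literal: it raises no error and simply keeps the last value. The variable name `courses` is only illustrative and does not come from the question.

```python
# A dict literal with a repeated key keeps only the last value for that key.
courses = {'q1': 'c1', 'q1': 'c2', 'q2': 'c14'}

print(courses)       # {'q1': 'c2', 'q2': 'c14'}
print(len(courses))  # 2
```

This is why the requested output cannot exist as written, and why a list of courses per key is the usual workaround.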

python pandas dictionary dataframe todictionary


1 Answer

Setup

import numpy as np
import pandas as pd

df = pd.DataFrame({'Curricula': {0: 'q1', 1: 'q2', 2: 'q3'},
                   'Course1': {0: 'c1', 1: 'c14', 2: 'c2'},
                   'Course2': {0: 'c2', 1: 'c21', 2: 'c14'},
                   'Course3': {0: np.nan, 1: 'c1', 2: np.nan}})

print(df)

  Curricula Course1 Course2 Course3
0        q1      c1      c2     NaN
1        q2     c14     c21      c1
2        q3      c2     c14     NaN

You can't have duplicate keys in a dictionary. However, you can use set_index and stack, followed by groupby and agg, to build a list of courses for each unique key:

df.set_index('Curricula').stack().groupby(level=0).agg(list).to_dict()

{'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}
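
For readers who want to see what each part of the chained call contributes, the following is a rough step-by-step sketch of the same idea. The intermediate names (`indexed`, `long_form`, `result`) are made up for illustration, and the explicit `dropna()` is an addition that is not in the one-liner: it is there on the assumption that whether `stack()` itself discards NaN cells can depend on the installed pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# Step 1: make the curriculum names the row labels.
indexed = df.set_index('Curricula')

# Step 2: reshape from wide to long, giving one entry per (curriculum, column) cell.
# The explicit dropna() is an added safeguard (an assumption, not part of the
# original one-liner) so that NaN cells never reach the final lists.
long_form = indexed.stack().dropna()

# Step 3: group by the first index level (the curriculum) and collect
# the remaining course values of each group into a list.
result = long_form.groupby(level=0).agg(list).to_dict()

print(result)
# {'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}
```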

answered Nov 25 '18 at 16:56 by user3483203

Total newbie = Totally forgot about duplicate keys... xD I will need some time to figure out how to work around my duplicate keys problem. Thank you very much for your answer @user3483203 ! I will go forward working with your answer :)

– JanB
Nov 25 '18 at 17:11

@JanB Also, I should have mentioned: in your code, the call to set_index has no effect because set_index does not operate in place by default, so its result is lost. You can call df.set_index('foo', inplace=True) or assign the returned dataframe to a variable.

– user3483203
Nov 25 '18 at 17:13
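
A small sketch of the behaviour described in this comment, using the answer's sample data. The names `indexed` and `by_curriculum` are illustrative only, and the row-by-row dict comprehension at the end is one possible alternative to the stack-based approach, not something taken from the thread.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# set_index returns a new DataFrame; without an assignment the result is discarded
# and df keeps its default integer index.
df.set_index('Curricula')
print(df.index.tolist())       # [0, 1, 2]

# Assigning the returned DataFrame keeps the new index.
indexed = df.set_index('Curricula')
print(indexed.index.tolist())  # ['q1', 'q2', 'q3']

# With the index in place, each row can be turned into a list without NaN values.
by_curriculum = {key: row.dropna().tolist() for key, row in indexed.iterrows()}
print(by_curriculum)
# {'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}
```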
