Python: Generate dictionary from pandas dataframe with rows as keys and columns as values
I have a dataframe that looks like this:
   Curricula Course1 Course2 Course3 ... CourseN
0  q1        c1      c2      NaN         NaN
1  q2        c14     c21     c1          NaN
2  q3        c2      c14     NaN         NaN
...
M  qm        c7      c9      c21
The number of courses differs from one curriculum to the next.
What I need is a dictionary from this dataframe looking like this:
{'q1': 'c1', 'q1': 'c2', 'q2': 'c14', 'q2': 'c21', 'q2': 'c1' ... }
Where the row names are my keys and for each row, the dictionary is filled with all the 'Curricula': 'Course' information that is given, excluding 'NaN' values.
What I tried so far was setting the index to the 'Curricula' column, transposing the dataframe, and calling to_dict('records'), but this resulted in the following output:
in:
df.set_index('Curricula')
df_transposed = df.transpose()
Dic = df_transposed.to_dict('records')
out:
[{0: 'q1', 1: 'q2', 2: 'q3', ... }, {0: 'c1', 1: 'c14', 2: 'c2' ...} ... {0: NaN, 1: 'c1', 2: NaN}]
So here the integer row labels are used as keys instead of the 'Curricula' column values I wanted, and additionally the NaN values are not excluded.
Does anyone have an idea how to fix that?
Best regards,
Jan
python pandas dictionary dataframe todictionary
2
How do you expect to have duplicate keys in your dictionary? That's not possible.
– user3483203
Nov 25 '18 at 16:58
asked Nov 25 '18 at 16:51
JanB
1 Answer
Setup
import numpy as np
import pandas as pd

df = pd.DataFrame({'Curricula': {0: 'q1', 1: 'q2', 2: 'q3'},
                   'Course1': {0: 'c1', 1: 'c14', 2: 'c2'},
                   'Course2': {0: 'c2', 1: 'c21', 2: 'c14'},
                   'Course3': {0: np.nan, 1: 'c1', 2: np.nan}})
print(df)
  Curricula Course1 Course2 Course3
0        q1      c1      c2     NaN
1        q2     c14     c21      c1
2        q3      c2     c14     NaN
You can't have duplicate keys in a dictionary; however, you can use agg along with set_index and stack to create a list for each unique key:
df.set_index('Curricula').stack().groupby(level=0).agg(list).to_dict()
{'q1': ['c1', 'c2'], 'q2': ['c14', 'c21', 'c1'], 'q3': ['c2', 'c14']}
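To make the one-liner above easier to follow, here is a sketch that splits the chain into named steps. The intermediate names (indexed, stacked, courses, result) are mine, not part of the answer, and the explicit dropna() is an added safeguard: if I recall correctly, newer pandas versions change how stack treats missing values, so dropping them explicitly keeps the result the same either way.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# 1. Make 'Curricula' the row index so it can later serve as the dict key.
indexed = df.set_index('Curricula')

# 2. Reshape wide -> long: one entry per (curriculum, course column).
#    The explicit dropna() is a safeguard in case stack() keeps missing values.
stacked = indexed.stack().dropna()

# 3. Collect the courses of each curriculum (index level 0) into a list.
courses = stacked.groupby(level=0).agg(list)

# 4. Convert the resulting Series into a plain dict.
result = courses.to_dict()
print(result)
```

Each step can be printed on its own to see how the frame changes shape before it becomes a dictionary.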
Total newbie = Totally forgot about duplicate keys... xD I will need some time to figure out how to work around my duplicate keys problem. Thank you very much for your answer @user3483203 ! I will go forward working with your answer :)
– JanB
Nov 25 '18 at 17:11
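If the flat 'Curricula': 'Course' pairs from the question are really what is needed, one possible workaround for the duplicate-key problem is a list of tuples instead of a dict. This is only a sketch under that assumption; the names stacked and pairs are illustrative and do not come from the thread.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Curricula': ['q1', 'q2', 'q3'],
    'Course1': ['c1', 'c14', 'c2'],
    'Course2': ['c2', 'c21', 'c14'],
    'Course3': [np.nan, 'c1', np.nan],
})

# Long format: the index is (curriculum, course column), the value is the course.
stacked = df.set_index('Curricula').stack().dropna()

# Keep only the curriculum label and the course; the column name is discarded.
pairs = [(curriculum, course) for (curriculum, _), course in stacked.items()]
print(pairs)
```

Unlike a dict, the list can hold several entries for the same curriculum, at the cost of losing direct lookup by key.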
@JanB also, I should have mentioned, in your code when you call set_index, it doesn't do anything because set_index is not in place by default. You can call df.set_index('foo', inplace=True), but as it stands your result is being lost.
– user3483203
Nov 25 '18 at 17:13
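A small sketch of the point made in this comment, using a cut-down two-row frame of my own: set_index returns a new DataFrame, so the result has to be assigned (or inplace=True passed) for the change to stick.

```python
import pandas as pd

df = pd.DataFrame({'Curricula': ['q1', 'q2'], 'Course1': ['c1', 'c14']})

# The returned frame is discarded here, so df itself is unchanged.
df.set_index('Curricula')
unchanged = list(df.columns)

# Assigning the return value keeps the re-indexed frame.
indexed = df.set_index('Curricula')

print(unchanged)
print(indexed.index.name)
```

Assigning the result is usually the easier form to read and to chain with further calls such as stack.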
answered Nov 25 '18 at 16:56
user3483203