How to unfold a dictionary of dictionaries into a pandas DataFrame for larger dictionaries?

Consider the following dictionary of dictionaries in Python 3.x:

dict1 = {4: {4:25, 5:39, 3:42}, 5:{24:94, 252:49, 25:4, 55:923}}

I would like to unfold this into a pandas DataFrame. There appear to be two options:

df1 = pd.DataFrame.from_dict(dict1, orient='columns')
print(df1)

        4      5
3    42.0    NaN
4    25.0    NaN
5    39.0    NaN
24    NaN   94.0
25    NaN    4.0
55    NaN  923.0
252   NaN   49.0

Here the columns are the outer dictionary keys 4 and 5, the row index holds the inner dictionary keys, and the cells hold the inner dictionary values.

The other option is

df2 = pd.DataFrame.from_dict(dict1, orient='index')
print(df2)

      4     5     3    24   252   25     55
4  25.0  39.0  42.0   NaN   NaN  NaN    NaN
5   NaN   NaN   NaN  94.0  49.0  4.0  923.0

Here the columns are the inner dictionary keys, the row index holds the outer dictionary keys, and the cells hold the inner dictionary values.

Is there a standard approach that unfolds the Python dictionary into the following long format, with one row per (key, inner key) pair?

key  inner_key  values
4    3          42
4    4          25
4    5          39
5    24         94
5    25         4
5    55         923
5    252        49

Ideally I would avoid reshaping the DataFrame after calling from_dict(), since for much larger Python dictionaries this could become quite memory intensive.

asked Nov 22 '18 at 1:58

ShanZhengYang

python python-3.x pandas dictionary dataframe


2 Answers

List comprehension

A list comprehension should be fairly efficient:

import pandas as pd

dict1 = {4: {4:25, 5:39, 3:42}, 5: {24:94, 252:49, 25:4, 55:923}}
cols = ['key', 'inner_key', 'values']

# One row per (outer key, inner key, value) triple, sorted by all three columns
df = pd.DataFrame([[k1, k2, v2] for k1, v1 in dict1.items() for k2, v2 in v1.items()],
                  columns=cols).sort_values(cols)
print(df)

   key  inner_key  values
2    4          3      42
0    4          4      25
1    4          5      39
3    5         24      94
5    5         25       4
6    5         55     923
4    5        252      49
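
The frame above keeps the pre-sort row labels (2, 0, 1, ...). As a minimal sketch, assuming a clean 0..n-1 index is wanted, the same comprehension can be wrapped in a small helper. The name `unfold_nested_dict` and its `cols` parameter are made up for illustration and are not part of pandas:

```python
import pandas as pd


def unfold_nested_dict(nested, cols=('key', 'inner_key', 'values')):
    """Flatten {outer: {inner: value}} into one row per (outer, inner, value).

    Hypothetical helper, not a pandas function.
    """
    cols = list(cols)
    # Same nested comprehension as above, producing one tuple per row
    rows = [(k1, k2, v2) for k1, inner in nested.items() for k2, v2 in inner.items()]
    # Sort by all columns, then discard the pre-sort row labels
    return pd.DataFrame(rows, columns=cols).sort_values(cols).reset_index(drop=True)


dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}
df = unfold_nested_dict(dict1)
print(df)
```

The only intermediate object is the list of row tuples, so no wide NaN-padded frame is ever built.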

`pd.melt` + `dropna`

If you don't mind working from df1, you can unpivot the DataFrame via pd.melt and then drop the rows with null values:

df1 = df1.reset_index()

# The outer parentheses let the method chain continue on the next line
res = (pd.melt(df1, id_vars='index', value_vars=[4, 5])
         .dropna(subset=['value']).astype(int))
print(res)

    index  variable  value
0       3         4     42
1       4         4     25
2       5         4     39
10     24         5     94
11     25         5      4
12     55         5    923
13    252         5     49
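
The melted result uses the default column names index, variable and value. As a rough sketch, assuming the column names requested in the question are wanted, `rename_axis` together with the `var_name` and `value_name` arguments of `pd.melt` can produce them directly. The variable names `wide` and `res_named` are only illustrative:

```python
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}
df1 = pd.DataFrame.from_dict(dict1, orient='columns')

# Name the row index so that reset_index() yields an 'inner_key' column
wide = df1.rename_axis('inner_key').reset_index()

res_named = (
    pd.melt(wide, id_vars='inner_key', var_name='key', value_name='values')
    .dropna(subset=['values'])          # drop the NaN padding cells
    .astype(int)                        # NaN forced float columns; restore ints
    [['key', 'inner_key', 'values']]    # column order from the question
    .sort_values(['key', 'inner_key'])
    .reset_index(drop=True)
)
print(res_named)
```

Note that the wide intermediate still holds one cell for every combination of inner and outer key, so this route does not address the memory concern raised in the question.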

edited Nov 22 '18 at 2:15

answered Nov 22 '18 at 2:08

jpp


Thanks for the explanation! Much appreciated

– ShanZhengYang
Dec 4 '18 at 15:35


pd.DataFrame([[i, j, dict1[i][j]] for i in dict1.keys() for j in dict1[i].keys()], columns=['key', 'inner_key', 'values'])

Output:

   key  inner_key  values
0    4          4      25
1    4          5      39
2    4          3      42
3    5         24      94
4    5        252      49
5    5         25       4
6    5         55     923
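
For much larger dictionaries, one possible variation is to fill three preallocated NumPy arrays with `np.fromiter` instead of building a list of small row lists. This is only a sketch under the assumption that every key and value fits in a 64-bit integer; the names `outer`, `inner_keys`, `vals` and `df_flat` are illustrative, and whether it is actually lighter or faster than the comprehension should be measured on the real data:

```python
import numpy as np
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

# Total number of (outer key, inner key) pairs, used to preallocate each array
n = sum(len(inner) for inner in dict1.values())

# Assumption: all keys and values are integers that fit in int64.
# Each generator walks the dictionary in the same order, so rows stay aligned.
outer = np.fromiter((k1 for k1, inner in dict1.items() for _ in inner),
                    dtype=np.int64, count=n)
inner_keys = np.fromiter((k2 for inner in dict1.values() for k2 in inner),
                         dtype=np.int64, count=n)
vals = np.fromiter((v for inner in dict1.values() for v in inner.values()),
                   dtype=np.int64, count=n)

df_flat = pd.DataFrame({'key': outer, 'inner_key': inner_keys, 'values': vals})
print(df_flat)
```

The rows come out in dictionary insertion order; a `sort_values(['key', 'inner_key'])` call can be appended if the sorted layout from the question is needed.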

answered Nov 22 '18 at 2:54

min2bro
