How to unfold a dictionary of dictionaries into a pandas DataFrame for larger dictionaries?












Consider the following dictionary of dictionaries in Python 3.x:

import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}


I would like to unfold this into a pandas DataFrame. There appear to be two options:



df1 = pd.DataFrame.from_dict(dict1, orient='columns')
print(df1)

        4      5
3    42.0    NaN
4    25.0    NaN
5    39.0    NaN
24    NaN   94.0
25    NaN    4.0
55    NaN  923.0
252   NaN   49.0


Here the columns are the outer dictionary keys 4 and 5, the row indices are the sub-dictionary keys, and the values are the sub-dictionary values.



The other option is



df2 = pd.DataFrame.from_dict(dict1, orient='index')
print(df2)

      4     5     3    24   252   25     55
4  25.0  39.0  42.0   NaN   NaN  NaN    NaN
5   NaN   NaN   NaN  94.0  49.0  4.0  923.0


Here the columns are the keys of the inner sub-dictionaries, the row indices are the keys of the outer dictionary, and the values are the sub-dictionary values.



Is there a standard approach that unfolds the Python dictionary into the following long format?

key  inner_key  values
4    3          42
4    4          25
4    5          39
5    24         94
5    25         4
5    55         923
5    252        49


Ideally the DataFrame would not need to be reshaped after calling from_dict(), since for much larger Python dictionaries that could become quite memory-intensive.










Tags: python, python-3.x, pandas, dictionary, dataframe






asked Nov 22 '18 at 1:58 by ShanZhengYang
























          2 Answers
List comprehension

A list comprehension should be fairly efficient:

import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

cols = ['key', 'inner_key', 'values']

# One [outer key, inner key, value] row per entry, then sort by all three columns
df = pd.DataFrame([[k1, k2, v2] for k1, v1 in dict1.items() for k2, v2 in v1.items()],
                  columns=cols).sort_values(cols)

print(df)

   key  inner_key  values
2    4          3      42
0    4          4      25
1    4          5      39
3    5         24      94
5    5         25       4
6    5         55     923
4    5        252      49
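If the intermediate list of rows is itself a concern for very large dictionaries, one possible variation is to fill each column directly from a generator. The sketch below assumes that all keys and values are integers (as in the example) and uses a hypothetical helper name, unfold_with_fromiter; it is not taken from the answer above, and whether it actually saves memory in a given case would need to be measured.

```python
import numpy as np
import pandas as pd


def unfold_with_fromiter(nested):
    """Flatten {key: {inner_key: value}} into a long DataFrame, column by column.

    Assumption: every key and value fits in int64.
    """
    # Count the (key, inner_key, value) triples so each array is allocated once.
    n = sum(len(inner) for inner in nested.values())
    keys = np.fromiter(
        (k for k, inner in nested.items() for _ in inner), dtype=np.int64, count=n
    )
    inner_keys = np.fromiter(
        (ik for inner in nested.values() for ik in inner), dtype=np.int64, count=n
    )
    values = np.fromiter(
        (v for inner in nested.values() for v in inner.values()), dtype=np.int64, count=n
    )
    return pd.DataFrame({"key": keys, "inner_key": inner_keys, "values": values})


dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}
df = unfold_with_fromiter(dict1)
print(df)
```

The rows come out in dictionary insertion order; call sort_values on the result if a sorted frame is needed.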



pd.melt + dropna

If you don't mind working from df1 (the orient='columns' frame from the question), you can unpivot it with pd.melt and then drop the rows with null values:

df1 = df1.reset_index()

# Parentheses are required so that the method chain can span several lines
res = (pd.melt(df1, id_vars='index', value_vars=[4, 5])
         .dropna(subset=['value'])
         .astype(int))

print(res)

    index  variable  value
0       3         4     42
1       4         4     25
2       5         4     39
10     24         5     94
11     25         5      4
12     55         5    923
13    252         5     49
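The melt result above uses the default column names index, variable and value. A self-contained sketch of how the same steps could be arranged to produce the key / inner_key / values layout requested in the question is shown below; the names wide and long_df are illustrative only.

```python
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

# Wide frame: outer keys as columns, inner keys as the index (NaN where absent).
wide = pd.DataFrame.from_dict(dict1, orient="columns")

long_df = (
    wide.rename_axis("inner_key")   # name the index so reset_index yields 'inner_key'
    .reset_index()
    .melt(id_vars="inner_key", var_name="key", value_name="values")
    .dropna(subset=["values"])      # drop (key, inner_key) pairs that never existed
    .astype(int)                    # values became float because of the NaN padding
    .sort_values(["key", "inner_key"])
    .reset_index(drop=True)
    .loc[:, ["key", "inner_key", "values"]]
)
print(long_df)
```

This still materialises the wide frame with its NaN padding first, so it does not address the memory concern raised in the question.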














answered Nov 22 '18 at 2:08 by jpp (edited Nov 22 '18 at 2:15)













• Thanks for the explanation! Much appreciated. – ShanZhengYang, Dec 4 '18 at 15:35



















# dict1 is the dictionary from the question
rows = [[i, j, dict1[i][j]] for i in dict1 for j in dict1[i]]
df = pd.DataFrame(rows, columns=['key', 'inner_key', 'values'])

Output:

   key  inner_key  values
0    4          4      25
1    4          5      39
2    4          3      42
3    5         24      94
4    5        252      49
5    5         25       4
6    5         55     923
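Both answers build a list of rows. Another route that may be worth trying, not mentioned in either answer, is to key a flat dictionary by (key, inner_key) tuples: pandas turns tuple keys into a MultiIndex, which can then be reset into ordinary columns. A minimal sketch, with flat and df as illustrative names:

```python
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

# Flatten to {(key, inner_key): value}; tuple keys become a two-level MultiIndex.
flat = {(k1, k2): v for k1, inner in dict1.items() for k2, v in inner.items()}

df = (
    pd.Series(flat)
    .rename_axis(["key", "inner_key"])  # name the two index levels
    .sort_index()                       # sort by key, then inner_key
    .reset_index(name="values")         # turn the index levels into columns
)
print(df)
```

The flat dictionary is still an intermediate copy of the data, so this is mainly a readability alternative and not a guaranteed memory saving.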
















answered Nov 22 '18 at 2:54 by min2bro





























