How to efficiently unpivot MULTIPLE columns in Hive?












1














My data is structured like in the below table:



| Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
--------------------------------------------------------
| abcd | 16 | 32 | 14 | 52 | 41 | 17 |
| ... | ... | ... | ... | ... | ... | ... |


I am looking to query the data in Hive in a way such that it looks like this:



| Name | Class | FooVal | BarVal |
----------------------------------
| abcd | A | 16 | 52 |
| abcd | B | 32 | 41 |
| abcd | C | 14 | 17 |
| ... | ... | ... | ... |


I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?










share|improve this question





























    1














    My data is structured like in the below table:



    | Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
    --------------------------------------------------------
    | abcd | 16 | 32 | 14 | 52 | 41 | 17 |
    | ... | ... | ... | ... | ... | ... | ... |


    I am looking to query the data in Hive in a way such that it looks like this:



    | Name | Class | FooVal | BarVal |
    ----------------------------------
    | abcd | A | 16 | 52 |
    | abcd | B | 32 | 41 |
    | abcd | C | 14 | 17 |
    | ... | ... | ... | ... |


    I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?










    share|improve this question



























      1












      1








      1


      1





      My data is structured like in the below table:



      | Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
      --------------------------------------------------------
      | abcd | 16 | 32 | 14 | 52 | 41 | 17 |
      | ... | ... | ... | ... | ... | ... | ... |


      I am looking to query the data in Hive in a way such that it looks like this:



      | Name | Class | FooVal | BarVal |
      ----------------------------------
      | abcd | A | 16 | 52 |
      | abcd | B | 32 | 41 |
      | abcd | C | 14 | 17 |
      | ... | ... | ... | ... |


      I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?










      share|improve this question















      My data is structured like in the below table:



      | Name | Foo_A | Foo_B | Foo_C | Bar_A | Bar_B | Bar_C |
      --------------------------------------------------------
      | abcd | 16 | 32 | 14 | 52 | 41 | 17 |
      | ... | ... | ... | ... | ... | ... | ... |


      I am looking to query the data in Hive in a way such that it looks like this:



      | Name | Class | FooVal | BarVal |
      ----------------------------------
      | abcd | A | 16 | 52 |
      | abcd | B | 32 | 41 |
      | abcd | C | 14 | 17 |
      | ... | ... | ... | ... |


      I am already aware of and am using a UNION ALL, but what would be a more efficient way of doing this using "LATERAL VIEW explode" a map data type?







      hive query-optimization hiveql unpivot






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Nov 20 at 17:50









      leftjoin

      8,17422050




      8,17422050










      asked Nov 20 at 16:27









      lisa

      112




      112
























          2 Answers
          2






          active

          oldest

          votes


















          0














          CROSS JOIN with class stack (see code example) will multiply main table rows x3, one row per class, then use case statements to derive your columns depending on class value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.



          set hive.auto.convert.join=true; --this enables map-join

          select t.Name,
          s.class,
          case s.class when 'A' then t.Foo_A
          when 'B' then t.foo_B
          when 'C' then t.foo_C
          end as FooVal,
          case s.class when 'A' then t.Bar_A
          when 'B' then t.Bar_B
          when 'C' then t.Bar_C
          end as BarVal
          from table t
          cross join (select stack(3,'A','B','C') as class) s
          ;


          It will scan the table only once and perform much better than UNION ALL approach.






          share|improve this answer































            0














            Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.



                select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
            from table t1
            LATERAL VIEW explode (map(
            'A', Foo_A,
            'B', Foo_B,
            'C', Foo_C
            )) t2 as key_1, FooVal
            LATERAL VIEW explode (map(
            'A', Bar_A,
            'B', Bar_B,
            'C', Bar_C
            )) t3 as key_2, BarVal
            where t2.key_1 = t3.key_2;





            share|improve this answer





















              Your Answer






              StackExchange.ifUsing("editor", function () {
              StackExchange.using("externalEditor", function () {
              StackExchange.using("snippets", function () {
              StackExchange.snippets.init();
              });
              });
              }, "code-snippets");

              StackExchange.ready(function() {
              var channelOptions = {
              tags: "".split(" "),
              id: "1"
              };
              initTagRenderer("".split(" "), "".split(" "), channelOptions);

              StackExchange.using("externalEditor", function() {
              // Have to fire editor after snippets, if snippets enabled
              if (StackExchange.settings.snippets.snippetsEnabled) {
              StackExchange.using("snippets", function() {
              createEditor();
              });
              }
              else {
              createEditor();
              }
              });

              function createEditor() {
              StackExchange.prepareEditor({
              heartbeatType: 'answer',
              autoActivateHeartbeat: false,
              convertImagesToLinks: true,
              noModals: true,
              showLowRepImageUploadWarning: true,
              reputationToPostImages: 10,
              bindNavPrevention: true,
              postfix: "",
              imageUploader: {
              brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
              contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
              allowUrls: true
              },
              onDemand: true,
              discardSelector: ".discard-answer"
              ,immediatelyShowMarkdownHelp:true
              });


              }
              });














              draft saved

              draft discarded


















              StackExchange.ready(
              function () {
              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53397354%2fhow-to-efficiently-unpivot-multiple-columns-in-hive%23new-answer', 'question_page');
              }
              );

              Post as a guest















              Required, but never shown

























              2 Answers
              2






              active

              oldest

              votes








              2 Answers
              2






              active

              oldest

              votes









              active

              oldest

              votes






              active

              oldest

              votes









              0














              CROSS JOIN with class stack (see code example) will multiply main table rows x3, one row per class, then use case statements to derive your columns depending on class value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.



              set hive.auto.convert.join=true; --this enables map-join

              select t.Name,
              s.class,
              case s.class when 'A' then t.Foo_A
              when 'B' then t.foo_B
              when 'C' then t.foo_C
              end as FooVal,
              case s.class when 'A' then t.Bar_A
              when 'B' then t.Bar_B
              when 'C' then t.Bar_C
              end as BarVal
              from table t
              cross join (select stack(3,'A','B','C') as class) s
              ;


              It will scan the table only once and perform much better than UNION ALL approach.






              share|improve this answer




























                0














                CROSS JOIN with class stack (see code example) will multiply main table rows x3, one row per class, then use case statements to derive your columns depending on class value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.



                set hive.auto.convert.join=true; --this enables map-join

                select t.Name,
                s.class,
                case s.class when 'A' then t.Foo_A
                when 'B' then t.foo_B
                when 'C' then t.foo_C
                end as FooVal,
                case s.class when 'A' then t.Bar_A
                when 'B' then t.Bar_B
                when 'C' then t.Bar_C
                end as BarVal
                from table t
                cross join (select stack(3,'A','B','C') as class) s
                ;


                It will scan the table only once and perform much better than UNION ALL approach.






                share|improve this answer


























                  0












                  0








                  0






                  CROSS JOIN with class stack (see code example) will multiply main table rows x3, one row per class, then use case statements to derive your columns depending on class value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.



                  set hive.auto.convert.join=true; --this enables map-join

                  select t.Name,
                  s.class,
                  case s.class when 'A' then t.Foo_A
                  when 'B' then t.foo_B
                  when 'C' then t.foo_C
                  end as FooVal,
                  case s.class when 'A' then t.Bar_A
                  when 'B' then t.Bar_B
                  when 'C' then t.Bar_C
                  end as BarVal
                  from table t
                  cross join (select stack(3,'A','B','C') as class) s
                  ;


                  It will scan the table only once and perform much better than UNION ALL approach.






                  share|improve this answer














                  CROSS JOIN with class stack (see code example) will multiply main table rows x3, one row per class, then use case statements to derive your columns depending on class value. CROSS JOIN with small dataset (3 rows) should be transformed to map join and will execute very fast on mappers.



                  set hive.auto.convert.join=true; --this enables map-join

                  select t.Name,
                  s.class,
                  case s.class when 'A' then t.Foo_A
                  when 'B' then t.foo_B
                  when 'C' then t.foo_C
                  end as FooVal,
                  case s.class when 'A' then t.Bar_A
                  when 'B' then t.Bar_B
                  when 'C' then t.Bar_C
                  end as BarVal
                  from table t
                  cross join (select stack(3,'A','B','C') as class) s
                  ;


                  It will scan the table only once and perform much better than UNION ALL approach.







                  share|improve this answer














                  share|improve this answer



                  share|improve this answer








                  edited Nov 20 at 18:14

























                  answered Nov 20 at 17:36









                  leftjoin

                  8,17422050




                  8,17422050

























                      0














                      Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.



                          select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
                      from table t1
                      LATERAL VIEW explode (map(
                      'A', Foo_A,
                      'B', Foo_B,
                      'C', Foo_C
                      )) t2 as key_1, FooVal
                      LATERAL VIEW explode (map(
                      'A', Bar_A,
                      'B', Bar_B,
                      'C', Bar_C
                      )) t3 as key_2, BarVal
                      where t2.key_1 = t3.key_2;





                      share|improve this answer


























                        0














                        Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.



                            select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
                        from table t1
                        LATERAL VIEW explode (map(
                        'A', Foo_A,
                        'B', Foo_B,
                        'C', Foo_C
                        )) t2 as key_1, FooVal
                        LATERAL VIEW explode (map(
                        'A', Bar_A,
                        'B', Bar_B,
                        'C', Bar_C
                        )) t3 as key_2, BarVal
                        where t2.key_1 = t3.key_2;





                        share|improve this answer
























                          0












                          0








                          0






                          Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.



                              select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
                          from table t1
                          LATERAL VIEW explode (map(
                          'A', Foo_A,
                          'B', Foo_B,
                          'C', Foo_C
                          )) t2 as key_1, FooVal
                          LATERAL VIEW explode (map(
                          'A', Bar_A,
                          'B', Bar_B,
                          'C', Bar_C
                          )) t3 as key_2, BarVal
                          where t2.key_1 = t3.key_2;





                          share|improve this answer












                          Thanks for reply! Please find below another way of doing it which is faster than CROSS JOIN.



                              select t1.ID, t2.key_1 as class, t2.FooVal, t3.BarVal
                          from table t1
                          LATERAL VIEW explode (map(
                          'A', Foo_A,
                          'B', Foo_B,
                          'C', Foo_C
                          )) t2 as key_1, FooVal
                          LATERAL VIEW explode (map(
                          'A', Bar_A,
                          'B', Bar_B,
                          'C', Bar_C
                          )) t3 as key_2, BarVal
                          where t2.key_1 = t3.key_2;






                          share|improve this answer












                          share|improve this answer



                          share|improve this answer










                          answered Dec 20 at 16:15









                          lisa

                          112




                          112






























                              draft saved

                              draft discarded




















































                              Thanks for contributing an answer to Stack Overflow!


                              • Please be sure to answer the question. Provide details and share your research!

                              But avoid



                              • Asking for help, clarification, or responding to other answers.

                              • Making statements based on opinion; back them up with references or personal experience.


                              To learn more, see our tips on writing great answers.





                              Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


                              Please pay close attention to the following guidance:


                              • Please be sure to answer the question. Provide details and share your research!

                              But avoid



                              • Asking for help, clarification, or responding to other answers.

                              • Making statements based on opinion; back them up with references or personal experience.


                              To learn more, see our tips on writing great answers.




                              draft saved


                              draft discarded














                              StackExchange.ready(
                              function () {
                              StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53397354%2fhow-to-efficiently-unpivot-multiple-columns-in-hive%23new-answer', 'question_page');
                              }
                              );

                              Post as a guest















                              Required, but never shown





















































                              Required, but never shown














                              Required, but never shown












                              Required, but never shown







                              Required, but never shown

































                              Required, but never shown














                              Required, but never shown












                              Required, but never shown







                              Required, but never shown







                              Popular posts from this blog

                              Create new schema in PostgreSQL using DBeaver

                              Deepest pit of an array with Javascript: test on Codility

                              Costa Masnaga