BigQuery - Group by on arrays











up vote
0
down vote

favorite












I want to group by on an array.



sample query:



#standardSQL
WITH `project.dataset.table` AS (
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
)
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
) labels,
cost
FROM `project.dataset.table`


Result of the above query:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.323316
department hrd
2 compute application scaled-server 0.342825
department hrd


I want result like below:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.666141
department hrd









share|improve this question
























  • what is the logic of grouping? is it - all keys and values should be the same in order to be grouped?
    – Mikhail Berlyant
    Nov 19 at 20:20










  • I just updated the question, the intention is, the description column has what service is this like VM, cloudSQL, BQ and the label column has tell us the information about the labels like vm labels (env=prod) SO, end of the day if I want to get cost for compute or vm which has the env=staging . Thats why I need to group by on DESCRIPTION first then key values
    – SQLadmin
    Nov 19 at 20:23

















up vote
0
down vote

favorite












I want to group by on an array.



sample query:



#standardSQL
WITH `project.dataset.table` AS (
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
)
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
) labels,
cost
FROM `project.dataset.table`


Result of the above query:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.323316
department hrd
2 compute application scaled-server 0.342825
department hrd


I want result like below:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.666141
department hrd









share|improve this question
























  • what is the logic of grouping? is it - all keys and values should be the same in order to be grouped?
    – Mikhail Berlyant
    Nov 19 at 20:20










  • I just updated the question, the intention is, the description column has what service is this like VM, cloudSQL, BQ and the label column has tell us the information about the labels like vm labels (env=prod) SO, end of the day if I want to get cost for compute or vm which has the env=staging . Thats why I need to group by on DESCRIPTION first then key values
    – SQLadmin
    Nov 19 at 20:23















up vote
0
down vote

favorite









up vote
0
down vote

favorite











I want to group by on an array.



sample query:



#standardSQL
WITH `project.dataset.table` AS (
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
)
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
) labels,
cost
FROM `project.dataset.table`


Result of the above query:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.323316
department hrd
2 compute application scaled-server 0.342825
department hrd


I want result like below:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.666141
department hrd









share|improve this question















I want to group by on an array.



sample query:



#standardSQL
WITH `project.dataset.table` AS (
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
)
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
) labels,
cost
FROM `project.dataset.table`


Result of the above query:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.323316
department hrd
2 compute application scaled-server 0.342825
department hrd


I want result like below:



Row description labels.key      labels.value    cost     
1 compute application scaled-server 0.666141
department hrd






arrays group-by google-cloud-platform google-bigquery






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 19 at 20:28









Mikhail Berlyant

54.7k43368




54.7k43368










asked Nov 19 at 20:16









SQLadmin

323215




323215












  • what is the logic of grouping? is it - all keys and values should be the same in order to be grouped?
    – Mikhail Berlyant
    Nov 19 at 20:20










  • I just updated the question, the intention is, the description column has what service is this like VM, cloudSQL, BQ and the label column has tell us the information about the labels like vm labels (env=prod) SO, end of the day if I want to get cost for compute or vm which has the env=staging . Thats why I need to group by on DESCRIPTION first then key values
    – SQLadmin
    Nov 19 at 20:23




















  • what is the logic of grouping? is it - all keys and values should be the same in order to be grouped?
    – Mikhail Berlyant
    Nov 19 at 20:20










  • I just updated the question, the intention is, the description column has what service is this like VM, cloudSQL, BQ and the label column has tell us the information about the labels like vm labels (env=prod) SO, end of the day if I want to get cost for compute or vm which has the env=staging . Thats why I need to group by on DESCRIPTION first then key values
    – SQLadmin
    Nov 19 at 20:23


















what is the logic of grouping? is it - all keys and values should be the same in order to be grouped?
– Mikhail Berlyant
Nov 19 at 20:20




what is the logic of grouping? is it - all keys and values should be the same in order to be grouped?
– Mikhail Berlyant
Nov 19 at 20:20












I just updated the question, the intention is, the description column has what service is this like VM, cloudSQL, BQ and the label column has tell us the information about the labels like vm labels (env=prod) SO, end of the day if I want to get cost for compute or vm which has the env=staging . Thats why I need to group by on DESCRIPTION first then key values
– SQLadmin
Nov 19 at 20:23






I just updated the question, the intention is, the description column has what service is this like VM, cloudSQL, BQ and the label column has tell us the information about the labels like vm labels (env=prod) SO, end of the day if I want to get cost for compute or vm which has the env=staging . Thats why I need to group by on DESCRIPTION first then key values
– SQLadmin
Nov 19 at 20:23














1 Answer
1






active

oldest

votes

















up vote
1
down vote



accepted










#standardSQL
WITH `project.dataset.table` AS (
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
), temp AS (
SELECT description, labels, SUM(cost) AS cost
FROM `project.dataset.table`
GROUP BY description, labels
)
SELECT
description,
ARRAY(
SELECT AS STRUCT
JSON_EXTRACT_SCALAR(kv, '$.key') key,
JSON_EXTRACT_SCALAR(kv, '$.value') value
FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
) labels,
cost
FROM temp





share|improve this answer





















    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53382005%2fbigquery-group-by-on-arrays%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote



    accepted










    #standardSQL
    WITH `project.dataset.table` AS (
    SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
    SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
    ), temp AS (
    SELECT description, labels, SUM(cost) AS cost
    FROM `project.dataset.table`
    GROUP BY description, labels
    )
    SELECT
    description,
    ARRAY(
    SELECT AS STRUCT
    JSON_EXTRACT_SCALAR(kv, '$.key') key,
    JSON_EXTRACT_SCALAR(kv, '$.value') value
    FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
    UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
    ) labels,
    cost
    FROM temp





    share|improve this answer

























      up vote
      1
      down vote



      accepted










      #standardSQL
      WITH `project.dataset.table` AS (
      SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
      SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
      ), temp AS (
      SELECT description, labels, SUM(cost) AS cost
      FROM `project.dataset.table`
      GROUP BY description, labels
      )
      SELECT
      description,
      ARRAY(
      SELECT AS STRUCT
      JSON_EXTRACT_SCALAR(kv, '$.key') key,
      JSON_EXTRACT_SCALAR(kv, '$.value') value
      FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
      UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
      ) labels,
      cost
      FROM temp





      share|improve this answer























        up vote
        1
        down vote



        accepted







        up vote
        1
        down vote



        accepted






        #standardSQL
        WITH `project.dataset.table` AS (
        SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
        SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
        ), temp AS (
        SELECT description, labels, SUM(cost) AS cost
        FROM `project.dataset.table`
        GROUP BY description, labels
        )
        SELECT
        description,
        ARRAY(
        SELECT AS STRUCT
        JSON_EXTRACT_SCALAR(kv, '$.key') key,
        JSON_EXTRACT_SCALAR(kv, '$.value') value
        FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
        UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
        ) labels,
        cost
        FROM temp





        share|improve this answer












        #standardSQL
        WITH `project.dataset.table` AS (
        SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
        SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
        ), temp AS (
        SELECT description, labels, SUM(cost) AS cost
        FROM `project.dataset.table`
        GROUP BY description, labels
        )
        SELECT
        description,
        ARRAY(
        SELECT AS STRUCT
        JSON_EXTRACT_SCALAR(kv, '$.key') key,
        JSON_EXTRACT_SCALAR(kv, '$.value') value
        FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
        UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^[{|}]$', ''), '}')]) kv
        ) labels,
        cost
        FROM temp






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 19 at 20:33









        Mikhail Berlyant

        54.7k43368




        54.7k43368






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53382005%2fbigquery-group-by-on-arrays%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Create new schema in PostgreSQL using DBeaver

            Deepest pit of an array with Javascript: test on Codility

            Costa Masnaga