Counting frequency of values by date using pandas - Part II











I have a dataset (dataset1) that looks as follows:



Date        Company     Weekday
2015-01-01  Company1    Monday
2015-01-02  Company1    Tuesday
2015-01-03  Company1    Wednesday
2015-01-04  Company1    Thursday
2015-12-09  Company2    Monday
2015-12-10  Company2    Tuesday
...
2016-01-08  Company3    Wednesday
2016-01-09  Company3    Thursday


I then apply the following code:



dataset2 = dataset1.groupby(['Company','Weekday']).size().sort_values(ascending=False)


Once the above code has been applied, I get the following results:



Index                        0
('Company1', Monday)         80
('Company1', Tuesday)        80
('Company1', Wednesday)      79
...
('Company3', Tuesday)        34


I am trying to isolate all dataset2 entries with a count above 50, but I get errors when I try the following:



dataset2=dataset2.loc[dataset2[0]>50]


Can anyone offer an opinion?










  • Post error messages and try dataset2=dataset2[dataset2[0]>50]
    – Sociopath
    Nov 20 at 6:06

  • Maybe the column 0 should be used as a string? dataset2=dataset2[dataset2['0']>50]
    – Mohit Motwani
    Nov 20 at 6:09

















python pandas






edited Nov 20 at 6:05
Sociopath

asked Nov 20 at 6:04
maroulator
1 Answer
accepted










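groupby(...).size() returns a Series, not a DataFrame, so there is no column 0 to select. Filter the Series directly with a boolean mask: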



dataset2 = dataset1.groupby(['Company','Weekday']).size().sort_values(ascending=False)
dataset2 = dataset2[dataset2 > 50]
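
As a minimal sketch of that Series filter, here is a version on made-up data. The rows, the variable names counts and filtered, and the threshold of 1 (standing in for 50) are assumptions for illustration, not the asker's data:

```python
import pandas as pd

# Hypothetical sample data; the real dataset1 is much larger.
dataset1 = pd.DataFrame({
    "Date": pd.to_datetime(["2015-01-01", "2015-01-08", "2015-01-15",
                            "2015-01-02", "2015-12-09"]),
    "Company": ["Company1", "Company1", "Company1", "Company1", "Company2"],
    "Weekday": ["Monday", "Monday", "Monday", "Tuesday", "Monday"],
})

counts = (dataset1.groupby(["Company", "Weekday"])
                  .size()
                  .sort_values(ascending=False))

# size() yields a Series indexed by (Company, Weekday),
# so the comparison is applied to the Series itself.
threshold = 1  # stands in for the 50 used in the question
filtered = counts[counts > threshold]

print(type(counts).__name__)
print(filtered)
```

The mask counts > threshold is a boolean Series with the same index, so indexing with it keeps only the (Company, Weekday) pairs whose count exceeds the threshold.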


Another solution is to call Series.reset_index with the name parameter to convert the result to a DataFrame, and then filter by the count column:



dataset2 = (dataset1.groupby(['Company','Weekday'])
                    .size()
                    .sort_values(ascending=False)
                    .reset_index(name='count'))

dataset2 = dataset2[dataset2['count'] > 50]
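
A rough, self-contained sketch of the reset_index route follows. The sample rows, the names table and result, and the threshold of 1 (standing in for 50) are hypothetical and only serve to make the example runnable:

```python
import pandas as pd

# Hypothetical sample data standing in for dataset1.
dataset1 = pd.DataFrame({
    "Company": ["Company1"] * 4 + ["Company2"] * 2,
    "Weekday": ["Monday", "Monday", "Monday", "Tuesday", "Monday", "Monday"],
})

table = (dataset1.groupby(["Company", "Weekday"])
                 .size()
                 .sort_values(ascending=False)
                 .reset_index(name="count"))

# The counts now live in a named column, so column-based selection works.
threshold = 1  # stands in for the 50 used in the question
result = table.loc[table["count"] > threshold]

print(list(table.columns))
print(result)
```

This form may be more convenient when the filtered result is to be merged or exported, since Company and Weekday become ordinary columns instead of index levels.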





edited Nov 20 at 6:20

answered Nov 20 at 6:10
jezrael
