Find the length and value of the longest repeated run in each row of a data.table












library(data.table)

# Example data: 10 rows, 10 integer columns (V1..V10)
dt <- fread("V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
116 116 116 102 96 96 106 116 116 144
114 114 114 114 114 114 121 111 98 108
88 78 78 77 72 96 96 95 95 95
118 77 77 86 139 127 127 103 93 84
154 154 154 121 121 114 111 111 111 111
175 175 125 125 125 125 164 125 125 141
174 174 125 118 117 116 139 116 102 104
95 95 175 175 176 176 139 123 140 141
140 106 174 162 162 169 140 112 112 112
178 178 178 178 116 95 178 178 178 178")


What I'm trying to do is find, for each row, the longest run of consecutive identical values, and report both its value and its length, like this:



 V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 value length
116 116 116 102  96  96 106 116 116 144   116      3
114 114 114 114 114 114 121 111  98 108   114      6
 88  78  78  77  72  96  96  95  95  95    95      3
118  77  77  86 139 127 127 103  93  84    77      2
154 154 154 121 121 114 111 111 111 111   111      4
175 175 125 125 125 125 164 125 125 141   125      4
174 174 125 118 117 116 139 116 102 104   174      2
 95  95 175 175 176 176 139 123 140 141    95      2
140 106 174 162 162 169 140 112 112 112   112      3
178 178 178 178 116  95 178 178 178 178   178      4

If several runs tie for the longest length (as in row 8, where 95, 175 and 176 each repeat twice), choose the lowest value.



I think rle() is one way to do this, but I don't understand how to use it.
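To make rle() less opaque, here is a minimal base-R sketch on row 8 of the data above. rle() returns the run lengths and the run values as two parallel vectors. The helper name longest_run is made up for illustration and does not come from any package.

```r
# Row 8 of the example data, as a plain numeric vector
x <- c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141)

r <- rle(x)
r$lengths  # how long each run of identical values is
r$values   # the value repeated in each run

# Hypothetical helper: longest run in a vector, lowest value on ties
longest_run <- function(x) {
  r <- rle(x)
  m <- max(r$lengths)                                   # longest run length
  c(value = min(r$values[r$lengths == m]), length = m)  # lowest value among ties
}

res <- longest_run(x)
res
```

Here three runs (95, 175 and 176) share the maximal length of 2, so the min() over the tied values picks 95.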










Tags: r, data.table






edited Nov 22 '18 at 2:28 by Ronak Shah

asked Nov 22 '18 at 2:10 by zell kim

2 Answers

You can melt the data into long format, run rle() on each row's values, and then take the smallest value among the runs of maximal length:



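library(data.table)

# Add a row id, reshape to long format, then run-length encode each row
rmax <- melt(dt[, rn := .I], id.vars = "rn")[,
  {
    r <- rle(value)      # runs of consecutive equal values in this row
    m <- max(r$lengths)  # length of the longest run
    .(val = min(r$values[r$lengths == m]), len = m)  # lowest value among ties
  },
  by = .(rn)]

# Join the per-row result back onto the original columns
dt[rmax, on = .(rn)]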


Output:

     V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 rn val len
 1: 116 116 116 102  96  96 106 116 116 144  1 116   3
 2: 114 114 114 114 114 114 121 111  98 108  2 114   6
 3:  88  78  78  77  72  96  96  95  95  95  3  95   3
 4: 118  77  77  86 139 127 127 103  93  84  4  77   2
 5: 154 154 154 121 121 114 111 111 111 111  5 111   4
 6: 175 175 125 125 125 125 164 125 125 141  6 125   4
 7: 174 174 125 118 117 116 139 116 102 104  7 174   2
 8:  95  95 175 175 176 176 139 123 140 141  8  95   2
 9: 140 106 174 162 162 169 140 112 112 112  9 112   3
10: 178 178 178 178 116  95 178 178 178 178 10 178   4






answered Nov 22 '18 at 2:21 by chinsoon12

This might not be the most efficient solution, since it doesn't take advantage of data.table syntax, but here is one method using apply():



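library(data.table)
# For each row, label its runs with rleid(), count the members of each run,
# and keep the size of the largest one
dt$length <- apply(dt, 1, function(x) max(table(rleid(x))))
dt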

#      V1  V2  V3  V4  V5  V6  V7  V8  V9 V10 length
#  1: 116 116 116 102  96  96 106 116 116 144      3
#  2: 114 114 114 114 114 114 121 111  98 108      6
#  3:  88  78  78  77  72  96  96  95  95  95      3
#  4: 118  77  77  86 139 127 127 103  93  84      2
#  5: 154 154 154 121 121 114 111 111 111 111      4
#  6: 175 175 125 125 125 125 164 125 125 141      4
#  7: 174 174 125 118 117 116 139 116 102 104      2
#  8:  95  95 175 175 176 176 139 123 140 141      2
#  9: 140 106 174 162 162 169 140 112 112 112      3
# 10: 178 178 178 178 116  95 178 178 178 178      4


For every row, we calculate the length of the longest run of consecutive identical values. Note that this gives only the length, not the value.
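The same row-wise idea could also return the value. Below is a minimal sketch using base R's rle() on a plain matrix. The small matrix m stands in for as.matrix(dt) and holds rows 4, 8 and 10 of the question's data; the column names value and length are chosen here to mirror the question.

```r
# Rows 4, 8 and 10 of the question's data, as a plain matrix
m <- rbind(
  c(118,  77,  77,  86, 139, 127, 127, 103,  93,  84),
  c( 95,  95, 175, 175, 176, 176, 139, 123, 140, 141),
  c(178, 178, 178, 178, 116,  95, 178, 178, 178, 178)
)

# For each row: run-length encode, find the longest run,
# and keep the lowest value among the runs of that length
res <- t(apply(m, 1, function(x) {
  r <- rle(x)
  len <- max(r$lengths)
  c(value = min(r$values[r$lengths == len]), length = len)
}))

res
```

With the question's data, m would be as.matrix(dt), taken before any helper columns such as rn or length are added, so that those columns do not take part in the run detection.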







answered Nov 22 '18 at 2:25 by Ronak Shah (edited Nov 22 '18 at 2:27)

I can't see how you can really avoid doing nrow * rle calls without getting substantially less clean.
– thelatemail, Nov 22 '18 at 2:26
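For illustration only, here is one way the per-row rle() calls might be sidestepped in base R. This sketch is not from the thread; it assumes all columns are numeric with no NA values. It flattens the matrix row by row and marks run starts by hand. It makes no claim about being faster, and it arguably illustrates the commenter's point about such code being less clean.

```r
# Rows 4, 8 and 10 of the question's data, as a plain matrix
m <- rbind(
  c(118,  77,  77,  86, 139, 127, 127, 103,  93,  84),
  c( 95,  95, 175, 175, 176, 176, 139, 123, 140, 141),
  c(178, 178, 178, 178, 116,  95, 178, 178, 178, 178)
)

v   <- as.vector(t(m))                        # all values, row by row
row <- rep(seq_len(nrow(m)), each = ncol(m))  # row index of each value
n   <- length(v)

# A new run starts at the first element, when the value changes,
# or when we cross into the next row
start <- c(TRUE, v[-1] != v[-n] | row[-1] != row[-n])

run_len <- tabulate(cumsum(start))  # length of every run
run_val <- v[start]                 # value of every run
run_row <- row[start]               # row each run belongs to

# Sort runs by row, longest first, lowest value first; keep the first per row
o    <- order(run_row, -run_len, run_val)
keep <- o[!duplicated(run_row[o])]

res <- cbind(value = run_val[keep], length = run_len[keep])
res
```

The row-change condition in start matters: without it, a run at the end of one row could merge with an equal value at the start of the next row.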













