Find the value and length of the longest run of repeated values in each row of a data.table
library(data.table)
dt <- fread("V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
116 116 116 102 96 96 106 116 116 144
114 114 114 114 114 114 121 111 98 108
88 78 78 77 72 96 96 95 95 95
118 77 77 86 139 127 127 103 93 84
154 154 154 121 121 114 111 111 111 111
175 175 125 125 125 125 164 125 125 141
174 174 125 118 117 116 139 116 102 104
95 95 175 175 176 176 139 123 140 141
140 106 174 162 162 169 140 112 112 112
178 178 178 178 116 95 178 178 178 178")
What I'm trying to do is find, for each row, the longest run of identical consecutive values, and report its value and length, like this:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 value length
116 116 116 102 96 96 106 116 116 144 116 3
114 114 114 114 114 114 121 111 98 108 114 6
88 78 78 77 72 96 96 95 95 95 95 3
118 77 77 86 139 127 127 103 93 84 77 2
154 154 154 121 121 114 111 111 111 111 111 4
175 175 125 125 125 125 164 125 125 141 125 4
174 174 125 118 117 116 139 116 102 104 174 2
95 95 175 175 176 176 139 123 140 141 95 2 *
140 106 174 162 162 169 140 112 112 112 112 3
178 178 178 178 116 95 178 178 178 178 178 4
* If several runs tie for the longest length (here 95, 175 and 176), choose the lowest value.
I think rle is one way to do this, but I don't understand how to apply it.
r data.table
edited Nov 22 '18 at 2:28 by Ronak Shah
asked Nov 22 '18 at 2:10 by zell kim
2 Answers
You can convert the data into long format before running rle on each row. Then pick the smallest value among the runs that share the longest length:
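library(data.table)

# Add a row id, reshape to long format, then run rle() once per original row
rmax <- melt(dt[, rn := .I], id.vars = "rn")[, {
    r <- rle(value)                                   # runs of consecutive equal values
    m <- max(r$lengths)                               # length of the longest run
    .(val = min(r$values[r$lengths == m]), len = m)   # lowest value among tied runs
}, by = .(rn)]

# Join the per-row result onto dt so the original columns come first, as in the output below
dt[rmax, on = .(rn)]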
output:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 rn val len
1: 116 116 116 102 96 96 106 116 116 144 1 116 3
2: 114 114 114 114 114 114 121 111 98 108 2 114 6
3: 88 78 78 77 72 96 96 95 95 95 3 95 3
4: 118 77 77 86 139 127 127 103 93 84 4 77 2
5: 154 154 154 121 121 114 111 111 111 111 5 111 4
6: 175 175 125 125 125 125 164 125 125 141 6 125 4
7: 174 174 125 118 117 116 139 116 102 104 7 174 2
8: 95 95 175 175 176 176 139 123 140 141 8 95 2
9: 140 106 174 162 162 169 140 112 112 112 9 112 3
10: 178 178 178 178 116 95 178 178 178 178 10 178 4
answered Nov 22 '18 at 2:21 by chinsoon12
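To see what the grouped expression does for one row, here is a minimal sketch that runs rle on row 8 of the example data, where three runs tie at length 2. The variable names `x`, `r`, `m` and `val` are only illustrative.

```r
# Row 8 of the example data as a plain vector (the name `x` is illustrative)
x <- c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141)

r <- rle(x)   # run-length encoding of consecutive equal values
r$values      # one entry per run: 95 175 176 139 123 140 141
r$lengths     # matching run lengths: 2 2 2 1 1 1 1

m   <- max(r$lengths)                  # longest run length
val <- min(r$values[r$lengths == m])   # lowest value among the runs tied at that length

c(value = val, length = m)
```

For this row the longest run length is 2, shared by 95, 175 and 176, so the tie-break returns 95.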
This might not be the most efficient solution, since it doesn't take advantage of data.table syntax, but here is one method using apply:
library(data.table)
dt$length <- apply(dt, 1, function(x) max(table(rleid(x))))
dt
# V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 length
# 1: 116 116 116 102 96 96 106 116 116 144 3
# 2: 114 114 114 114 114 114 121 111 98 108 6
# 3: 88 78 78 77 72 96 96 95 95 95 3
# 4: 118 77 77 86 139 127 127 103 93 84 2
# 5: 154 154 154 121 121 114 111 111 111 111 4
# 6: 175 175 125 125 125 125 164 125 125 141 4
# 7: 174 174 125 118 117 116 139 116 102 104 2
# 8: 95 95 175 175 176 176 139 123 140 141 2
# 9: 140 106 174 162 162 169 140 112 112 112 3
#10: 178 178 178 178 116 95 178 178 178 178 4
For every row, we calculate the length of the longest run of consecutive identical values. Note that this returns only the length, not the value.
edited Nov 22 '18 at 2:27
answered Nov 22 '18 at 2:25 by Ronak Shah
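The apply approach above could also be extended to return the value as well as the length by swapping in rle. This is only a sketch under a few assumptions: the helper name `longest_run` is hypothetical, and just three rows of the example data (rows 1, 8 and 10) are used to keep it short.

```r
library(data.table)

# Rows 1, 8 and 10 of the example data; columns are named V1..V10 automatically
dt <- as.data.table(rbind(
  c(116, 116, 116, 102, 96, 96, 106, 116, 116, 144),
  c(95, 95, 175, 175, 176, 176, 139, 123, 140, 141),
  c(178, 178, 178, 178, 116, 95, 178, 178, 178, 178)
))

# Hypothetical helper: longest run in a vector, ties broken by the lowest value
longest_run <- function(x) {
  r <- rle(x)
  m <- max(r$lengths)
  c(value = min(r$values[r$lengths == m]), length = m)
}

# apply() returns one column per input row, so transpose to get one row per input row
res <- t(apply(as.matrix(dt), 1, longest_run))

# Add both results as new columns by reference
dt[, c("value", "length") := .(res[, "value"], res[, "length"])]
dt
```

This still calls rle once per row, so it is unlikely to be faster than the other approaches; it only adds the value column and the lowest-value tie-break.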
I can't see how you can really avoid doing nrow * rle calls without getting substantially less clean. – thelatemail, Nov 22 '18 at 2:26