How do you delete a column by name in data.table?
To get rid of a column named "foo" in a data.frame, I can do:
df <- df[-grep('foo', colnames(df))]
However, once df is converted to a data.table object, there is no way to just remove a column.
Example:
df <- data.frame(id = 1:100, foo = rnorm(100))
df2 <- df[-grep('foo', colnames(df))] # works
df3 <- data.table(df)
df3[-grep('foo', colnames(df3))]
But once it is converted to a data.table object, this no longer works.
r data.table
asked Feb 8 '12 at 22:20 by Maiasaura; edited Feb 26 '16 at 21:22 by Henrik
It would have been clearer to name the data.table dt instead of df3... – PatrickT, Dec 19 '15 at 8:38
8 Answers
Any of the following will remove column foo from the data.table df3:
# Method 1 (and preferred as it takes 0.00s even on a 20GB data.table)
df3[,foo:=NULL]
df3[, c("foo","bar"):=NULL] # remove two columns
myVar = "foo"
df3[, (myVar):=NULL] # lookup myVar contents
# Method 2a -- A safe idiom for excluding (possibly multiple)
# columns matching a regex
df3[, grep("^foo$", colnames(df3)):=NULL]
# Method 2b -- An alternative to 2a, also "safe" in the sense described below
df3[, which(grepl("^foo$", colnames(df3))):=NULL]
data.table also supports the following syntax:
## Method 3 (you could then assign the result back to df3)
df3[, !"foo", with=FALSE]
However, if you actually want to remove column "foo" from df3 (as opposed to just printing a view of df3 minus column "foo"), you'd really want to use Method 1 instead.
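The difference between Method 3 and Method 1 can be sketched as follows. This is a minimal illustration that assumes the data.table package is installed; the toy table and the name view_only are made up for the example.

```r
library(data.table)

# Hypothetical toy table mirroring the question's df3
df3 <- data.table(id = 1:5, foo = rnorm(5))

# Method 3 builds a new table without "foo"; df3 itself is left alone
view_only <- df3[, !"foo", with = FALSE]
print(names(view_only))
print(names(df3))

# Method 1 drops the column from df3 itself, by reference
df3[, foo := NULL]
print(names(df3))
```

After the last call, df3 and view_only should both have only the id column, but only Method 1 got there without building a second table.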
(Do note that if you use a method relying on grep() or grepl(), you need to set pattern="^foo$" rather than "foo" if you don't want columns with names like "fool" and "buffoon" (i.e. those containing foo as a substring) to also be matched and removed.)
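To see why the anchors matter, here is a small base R sketch on a plain character vector. The column names are hypothetical and taken from the example names mentioned above.

```r
# Hypothetical column names, including the look-alikes mentioned above
cols <- c("id", "foo", "fool", "buffoon")

# Unanchored pattern: matches every name that contains "foo"
loose <- grep("foo", cols, value = TRUE)
print(loose)

# Anchored pattern: matches only the name that is exactly "foo"
exact <- grep("^foo$", cols, value = TRUE)
print(exact)
```

The loose pattern picks up three names, the anchored one only "foo", so only the anchored form is safe to feed into := NULL.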
Less safe options, fine for interactive use:
The next two idioms will also work -- if df3 contains a column matching "foo" -- but will fail in a probably-unexpected way if it does not. If, for instance, you use any of them to search for the non-existent column "bar", you'll end up with a zero-row data.table.
As a consequence, they are really best suited for interactive use where one might, e.g., want to display a data.table minus any columns with names containing the substring "foo". For programming purposes (or if you want to actually remove the column(s) from df3 rather than from a copy of it), Methods 1, 2a, and 2b are really the best options.
# Method 4a:
df3[, -grep("^foo$", colnames(df3)), with=FALSE]
# Method 4b:
df3[, !grepl("^foo$", colnames(df3)), with=FALSE]
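The negative-index pitfall behind the -grep() idiom can be reproduced in base R. The sketch below uses a plain data.frame with made-up contents rather than a data.table, so it only illustrates the index arithmetic; the setdiff() alternative is an addition for comparison, not one of the methods above.

```r
df <- data.frame(id = 1:3, foo = c(0.1, 0.2, 0.3))

# No column matches "^bar$", so grep() returns integer(0) ...
idx <- grep("^bar$", colnames(df))

# ... and -integer(0) is still integer(0): "keep nothing", not "drop nothing"
dropped <- df[-idx]
print(ncol(dropped))

# A name-based alternative keeps everything when "bar" is absent
kept <- df[setdiff(colnames(df), "bar")]
print(colnames(kept))
```

Selecting by the names to keep avoids the empty-index case entirely, at the cost of building a copy.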
See my comment to the OP regarding -grep versus !grepl. – Joshua Ulrich, Feb 8 '12 at 22:36
@JoshuaUlrich -- Good point. I tried grepl() initially and it didn't work, as data.table columns can't be indexed by a logical vector. But I now realize that grepl() can be made to work by wrapping it with which(), so that it returns an integer vector. – Josh O'Brien, Feb 8 '12 at 23:38
I didn't know that about indexing with data.table, but wrapping it in which is clever! – Joshua Ulrich, Feb 8 '12 at 23:59
I didn't know that about data.table either; added FR#1797. But, method 1 is (almost) infinitely faster than the others. Method 1 removes the column by reference with no copy at all. I doubt you get it above 0.005 seconds for any size data.table. In contrast, the others might not work at all if the table is near 50% of RAM because they copy all but the one to delete. – Matt Dowle, Feb 9 '12 at 9:27
@user3969377 if you want to remove a column based on the contents of a character variable you'd simply wrap it in parentheses, i.e. df[,(afoo):=NULL] – Dean MacGregor, Jul 9 '15 at 19:26
You can also use set for this, which avoids the overhead of [.data.table in loops:
dt <- data.table( a=letters, b=LETTERS, c=seq(26), d=letters, e=letters )
set( dt, j=c(1L,3L,5L), value=NULL )
> dt[1:5]
b d
1: A a
2: B b
3: C c
4: D d
5: E e
If you want to do it by column name, which(colnames(dt) %in% c("a","c","e")) should work for j.
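The name-to-position lookup can be sketched like this, assuming the data.table package is installed. The vector wanted and the deliberately missing name "zzz" are hypothetical, added to show that absent names are simply skipped.

```r
library(data.table)

dt <- data.table(a = letters, b = LETTERS, c = seq(26), d = letters, e = letters)

# Names we would like to drop; "zzz" is deliberately not a column
wanted <- c("a", "c", "e", "zzz")

# Translate names to positions, silently skipping names that are absent
j <- which(colnames(dt) %in% wanted)

# Only call set() when there is something to delete
if (length(j) > 0L) set(dt, j = j, value = NULL)
print(names(dt))
```

Because which() only returns positions that exist, a misspelled or already-deleted name does not stop the run.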
In data.table 1.11.8, if you want to do it by column name, you can do directly rm.col = c("a","b") and dt[, (rm.col):=NULL] – Duccio A, Dec 10 '18 at 11:08
I simply do it in the data frame kind of way:
DT$col = NULL
Works fast and as far as I could see doesn't cause any problems.
UPDATE: not the best method if your DT is very large, as using the $<- operator will lead to object copying. So better use:
DT[, col:=NULL]
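What "by reference" means for := can be illustrated with a second name bound to the same table. This is a sketch that assumes the data.table package is installed; alias and snapshot are made-up names.

```r
library(data.table)

DT <- data.table(id = 1:3, col = c("x", "y", "z"))
alias <- DT           # a second name for the same table, not a copy
snapshot <- copy(DT)  # an independent copy made with data.table's copy()

DT[, col := NULL]     # removes the column in place

print(names(alias))     # the alias sees the change
print(names(snapshot))  # the explicit copy does not
```

Nothing is duplicated by the := call, which is why it stays cheap on large tables. The flip side is that any other name bound to the same table loses the column too, so take a copy() first if the original is still needed.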
A very simple option in case you have many individual columns to delete in a data.table and you want to avoid typing in all the column names (care advised):
dt <- dt[, -c(1,4,6,17,83,104), with=FALSE]
This removes columns based on column number instead.
It's obviously not as efficient, because it bypasses data.table's advantages, but if you're working with fewer than, say, 500,000 rows it works fine.
Plus 1. And in v1.9.8 soon on CRAN, you no longer need the with=F part. – Matt Dowle, Nov 15 '16 at 2:43
Suppose your DT has columns col1, col2, col3, col4, col5, coln.
To delete a subset of them:
vx <- as.character(bquote(c(col1, col2, col3, coln)))[-1]
DT[, paste0(vx):=NULL]
This should be a comment. – Sachila Ranawaka, Feb 24 '17 at 3:30
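What the bquote() line produces can be checked in base R without any table at all. The intermediate names cl and parts are introduced here only for illustration.

```r
# bquote() returns the unevaluated call c(col1, col2, col3, coln)
cl <- bquote(c(col1, col2, col3, coln))

# as.character() on a call gives one string per element, function name first
parts <- as.character(cl)
print(parts)

# Dropping the first element ("c") leaves just the column names
vx <- parts[-1]
print(vx)
```

The result is the same character vector as typing c("col1", "col2", "col3", "coln") by hand, so the trick mainly saves the quotation marks.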
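Here is a way to set a number of columns to NULL given their column names, wrapped in a function for your use :)
deleteColsFromDataTable <- function (train, toDeleteColNames) {
  # with=F dropped: it is deprecated together with :=, and the
  # parentheses around myNm already make data.table look up its value
  for (myNm in toDeleteColNames)
    train[, (myNm) := NULL]  # deletes by reference, no reassignment needed
  return (train)
}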
DT[,c:=NULL] # remove column c
For a data.table, assigning the column to NULL removes it:
DT[, c("col1", "col2", "col3", "col4")] <- NULL  # note the comma after [ when DT is a data.table
... which is the equivalent of:
DT$col1 <- NULL
DT$col2 <- NULL
DT$col3 <- NULL
DT$col4 <- NULL
The equivalent for a data.frame is:
DF[c("col1", "col2", "col3", "col4")] <- NULL  # note there is no comma when DF is a data.frame
Q. Why is there a comma in the version for data.table, and no comma in the version for data.frame?
A. As data.frames are stored as a list of columns, you can skip the comma. You could also add it in, however then you will need to assign them to a list of NULLs: DF[, c("col1", "col2", "col3")] <- list(NULL).
As to your question -- please read the data table FAQ – mnel, Mar 31 '14 at 22:05
@Arun I can't think of any situation with data.frames where the row and columns would be switched. That would be illogical. – duHaas, Mar 31 '14 at 22:42
@Arun I tagged you because your first comment made it seem like there were times at which you might call DF[column,row] so I just wanted to see if there actually were any instances where this happened. – duHaas, Mar 31 '14 at 22:57
Updated the answer to remove a typo. – Contango, Apr 2 '14 at 7:30
answered Feb 8 '12 at 22:27 by Josh O'Brien; edited Nov 15 '16 at 2:39 by Matt Dowle
answered Oct 21 '13 at 20:42 by Ari B. Friedman; edited Feb 1 at 9:37 by SeGa
answered May 19 '13 at 20:39 by msp; edited May 21 '15 at 9:34
answered Jul 3 '15 at 2:02 by SJDS
answered Feb 24 '17 at 2:30 by Ricardo Paixao; edited Feb 24 '17 at 7:32 by iled
this should be comment
– Sachila Ranawaka
Feb 24 '17 at 3:30
Here is a function for setting a number of columns to NULL, given their column names:
deleteColsFromDataTable <- function(train, toDeleteColNames) {
  # := deletes by reference; with=F is not needed (and is deprecated) alongside :=
  for (myNm in toDeleteColNames)
    train <- train[, (myNm) := NULL]
  return(train)
}
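A loop is not strictly needed, since `:=` takes a whole character vector of names. Below is a minimal sketch of a vectorised variant; the name `drop_cols` and the choice to skip names that are not present are my own assumptions, not part of the answer above:

```r
library(data.table)

# Hypothetical helper: delete the named columns by reference,
# ignoring any names that are not columns of DT
drop_cols <- function(DT, cols) {
  present <- intersect(cols, names(DT))
  if (length(present) > 0L) {
    DT[, (present) := NULL]
  }
  invisible(DT)
}

DT <- data.table(a = 1:3, b = 4:6, c = 7:9)
drop_cols(DT, c("b", "not_a_column"))
names(DT)  # "a" "c"
```

Because the deletion happens by reference, the caller's table is changed even though the return value is not assigned.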
answered Apr 14 '14 at 9:22
user3531326
DT[,c:=NULL] # remove column c
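Applied to the table from the question, this could look like the sketch below. The variable names `dt`, `dt2` and `col` are illustrative. The second form covers the case where the column name is held in a variable, which needs parentheses around the left-hand side:

```r
library(data.table)

# Same shape as the example in the question
dt <- data.table(id = 1:100, foo = rnorm(100))

# Literal column name: delete foo by reference
dt[, foo := NULL]
names(dt)  # "id"

# Column name stored in a variable: wrap it in parentheses so that
# := uses the value of `col` rather than a column literally named "col"
dt2 <- data.table(id = 1:100, foo = rnorm(100))
col <- "foo"
dt2[, (col) := NULL]
names(dt2)  # "id"
```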
edited Nov 15 '16 at 4:50
Serjik
answered Nov 15 '16 at 2:25
Durga Gaddam
For a data.table, assigning the column to NULL removes it:
DT[, c("col1", "col2", "col3", "col4")] <- NULL
Note the comma before c(...) when DT is a data.table.
... which is the equivalent of:
DT$col1 <- NULL
DT$col2 <- NULL
DT$col3 <- NULL
DT$col4 <- NULL
The equivalent for a data.frame is:
DF[c("col1", "col2", "col3", "col4")] <- NULL
Note the missing comma when DF is a data.frame.
Q. Why is there a comma in the version for data.table, and no comma in the version for data.frame?
A. As data.frames are stored as a list of columns, you can skip the comma. You could also add it in, but then you will need to assign a list of NULLs: DF[, c("col1", "col2", "col3")] <- list(NULL).
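The two data.frame forms can be checked side by side with a short base R sketch (the frames `DF1` and `DF2` and their contents are made up for illustration):

```r
# Two identical hypothetical data.frames
DF1 <- data.frame(id = 1:3, col1 = 4:6, col2 = 7:9)
DF2 <- DF1

# List-style indexing (no comma): assigning NULL drops the columns
DF1[c("col1", "col2")] <- NULL

# Matrix-style indexing (with comma): assign a list of NULLs instead
DF2[, c("col1", "col2")] <- list(NULL)

names(DF1)  # "id"
names(DF2)  # "id"
```

Both forms go through ordinary `<-` assignment rather than data.table's `:=`, so neither deletes by reference.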
edited May 4 '14 at 13:39
community wiki
4 revs, 2 users 98%
Contango
As to your question -- please read the data table FAQ
– mnel
Mar 31 '14 at 22:05
@Arun I can't think of any situation with data.frames where the row and columns would be switched. That would be illogical.
– duHaas
Mar 31 '14 at 22:42
@Arun I tagged you because your first comment made it seem like there were times at which you might call DF[column,row] so I just wanted to see if there actually were any instances where this happened.
– duHaas
Mar 31 '14 at 22:57
Updated the answer to remove a typo.
– Contango
Apr 2 '14 at 7:30