How do you delete a column by name in data.table?












167















To get rid of a column named "foo" in a data.frame, I can do:

df <- df[-grep('foo', colnames(df))]

However, once df is converted to a data.table object, I can't find a way to remove a column in the same manner.

Example:

library(data.table)
df <- data.frame(id = 1:100, foo = rnorm(100))
df2 <- df[-grep('foo', colnames(df))] # works: column foo is dropped
df3 <- data.table(df)
df3[-grep('foo', colnames(df3))] # does not remove column foo

But once it is converted to a data.table object, this no longer works.
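A minimal reproduction of the behaviour described above, assuming the data.table package is installed. The names pos and res are illustrative only. With a single index inside the brackets, a data.frame selects columns while a data.table selects rows, which is why the idiom appears to stop working.

```r
library(data.table)

df <- data.frame(id = 1:100, foo = rnorm(100))
df3 <- data.table(df)

# grep() returns the position of "foo" among the column names (2 here)
pos <- grep("foo", colnames(df3))

# On a data.frame, a single index selects columns, so column 2 is dropped
df2 <- df[-pos]
colnames(df2)

# On a data.table, a single index selects rows, so row 2 is dropped instead
res <- df3[-pos]
dim(res)
colnames(res)
```

The column count of res is unchanged; only the row count differs from df3.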






























  • 1





    It would have been clearer to name the data.table dt instead of df3 ...

    – PatrickT
    Dec 19 '15 at 8:38

























r data.table














asked Feb 8 '12 at 22:20 by Maiasaura, edited Feb 26 '16 at 21:22 by Henrik






















8 Answers


















242














Any of the following will remove column foo from the data.table df3:



# Method 1 (and preferred as it takes 0.00s even on a 20GB data.table)
df3[,foo:=NULL]

df3[, c("foo","bar"):=NULL] # remove two columns

myVar = "foo"
df3[, (myVar):=NULL] # lookup myVar contents

# Method 2a -- A safe idiom for excluding (possibly multiple)
# columns matching a regex
df3[, grep("^foo$", colnames(df3)):=NULL]

# Method 2b -- An alternative to 2a, also "safe" in the sense described below
df3[, which(grepl("^foo$", colnames(df3))):=NULL]


data.table also supports the following syntax:

## Method 3 (returns a copy without "foo"; the result could then be assigned back to df3)
df3[, !"foo", with=FALSE]

though if you actually want to remove column "foo" from df3 (as opposed to just printing a view of df3 minus column "foo"), you'd really want to use Method 1 instead.
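To make the difference between the copying form and the by-reference form concrete, here is a small sketch, assuming the data.table package is installed. The table dt and the name dt_without_foo are made up for illustration.

```r
library(data.table)

dt <- data.table(id = 1:5, foo = rnorm(5), bar = letters[1:5])

# Method 3 style: returns a new data.table; dt itself is unchanged
dt_without_foo <- dt[, !"foo", with = FALSE]
names(dt_without_foo)
names(dt)

# Method 1 style: removes the column from dt in place, no assignment needed
dt[, foo := NULL]
names(dt)
```

After the last call, dt has lost foo even though nothing was assigned back to it.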



(Do note that if you use a method relying on grep() or grepl(), you need to set pattern="^foo$" rather than "foo", if you don't want columns with names like "fool" and "buffoon" (i.e. those containing foo as a substring) to also be matched and removed.)
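The effect of anchoring the pattern can be checked on a plain character vector in base R. The vector cols below is a made-up set of column names, not taken from the question.

```r
cols <- c("id", "foo", "fool", "buffoon")

# Unanchored pattern: matches every name containing "foo"
grep("foo", cols, value = TRUE)

# Anchored pattern: matches only the exact name "foo"
grep("^foo$", cols, value = TRUE)

# For an exact name, a plain comparison avoids regular expressions altogether
cols[cols == "foo"]
```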



Less safe options, fine for interactive use:



The next two idioms will also work -- if df3 contains a column matching "foo" -- but will fail in a probably-unexpected way if it does not. If, for instance, you use any of them to search for the non-existent column "bar", you'll end up with a zero-row data.table.



As a consequence, they are really best suited for interactive use where one might, e.g., want to display a data.table minus any columns with names containing the substring "foo". For programming purposes (or if you are wanting to actually remove the column(s) from df3 rather than from a copy of it), Methods 1, 2a, and 2b are really the best options.



# Method 4a:
df3[, -grep("^foo$", colnames(df3)), with=FALSE]

# Method 4b:
df3[, !grepl("^foo$", colnames(df3)), with=FALSE]
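The empty-match pitfall with -grep() is easiest to see in base R, where negating an empty index selects nothing. This sketch uses a plain data.frame with made-up data; how a data.table responds to the same expression may depend on the installed version, so it is worth checking on your own setup.

```r
df <- data.frame(id = 1:3, foo = c(0.1, 0.2, 0.3))

# No column is named "bar", so grep() returns integer(0)
idx <- grep("^bar$", colnames(df))
length(idx)

# -integer(0) is still integer(0): every column is dropped, not none
ncol(df[-idx])

# The logical form keeps all columns when nothing matches
ncol(df[!grepl("^bar$", colnames(df))])

# setdiff() on names is another form that is safe when nothing matches
ncol(df[setdiff(colnames(df), "bar")])
```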


























  • 2





    See my comment to the OP regarding -grep versus !grepl.

    – Joshua Ulrich
    Feb 8 '12 at 22:36






  • 1





    @JoshuaUlrich -- Good point. I tried grepl() initially and it didn't work, as data.table columns can't be indexed by a logical vector. But I now realize that grepl() can be made to work by wrapping it with which(), so that it returns an integer vector.

    – Josh O'Brien
    Feb 8 '12 at 23:38






  • 1





    I didn't know that about indexing with data.table, but wrapping it in which is clever!

    – Joshua Ulrich
    Feb 8 '12 at 23:59






  • 6





    I didn't know that about data.table either; added FR#1797. But, method 1 is (almost) infinitely faster than the others. Method 1 removes the column by reference with no copy at all. I doubt you get it above 0.005 seconds for any size data.table. In contrast, the others might not work at all if the table is near 50% of RAM because they copy all but the one to delete.

    – Matt Dowle
    Feb 9 '12 at 9:27








  • 1





    @user3969377 if you want to remove a column based on the contents of a character variable, you'd simply wrap it in parentheses, i.e. df[,(afoo):=NULL]

    – Dean MacGregor
    Jul 9 '15 at 19:26



















29














You can also use set for this, which avoids the overhead of [.data.table in loops:



library(data.table)
dt <- data.table(a = letters, b = LETTERS, c = seq(26), d = letters, e = letters)
set(dt, j = c(1L, 3L, 5L), value = NULL)
dt[1:5]
#    b d
# 1: A a
# 2: B b
# 3: C c
# 4: D d
# 5: E e


If you want to do it by column name, which(colnames(dt) %in% c("a","c","e")) should work for j.
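A sketch of the by-name variant, assuming the data.table package is installed. The help page for set() describes j as accepting column names as well as numbers, so the second call below passes a name directly; treat that as something to confirm against your installed version. The name drop_pos is illustrative.

```r
library(data.table)

dt <- data.table(a = letters, b = LETTERS, c = seq(26), d = letters, e = letters)

# Positions computed from names, as suggested above
drop_pos <- which(colnames(dt) %in% c("a", "c"))
set(dt, j = drop_pos, value = NULL)
names(dt)

# A column name passed directly as j
set(dt, j = "e", value = NULL)
names(dt)
```

Both calls modify dt in place, so no assignment is needed.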
































  • In data.table 1.11.8, if you want to do it by column name, you can do directly rm.col = c("a","b") and dt[, (rm.col):=NULL]

    – Duccio A
    Dec 10 '18 at 11:08



















16














I simply do it in the data frame kind of way:



DT$col = NULL


This is fast and, as far as I can see, doesn't cause any problems.



UPDATE: this is not the best method if your DT is very large, because using the $<- operator leads to the object being copied. It is better to use:



DT[, col:=NULL]






































    4














    A very simple option if you have many individual columns to delete from a data.table and want to avoid typing all the column names (care advised: check the positions first):



    dt <- dt[, -c(1,4,6,17,83,104), with = FALSE]


    This removes columns by column number instead of by name.



    It's obviously not as efficient, because it bypasses data.table's by-reference advantages, but if you're working with fewer than, say, 500,000 rows it works fine.
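    If the positions are known but the copy is unwanted, one possible alternative is to translate the positions into names and then delete by reference. This is only a sketch on a made-up four-column table; dt_small and drop_names are illustrative names.

```r
library(data.table)

dt <- data.table(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# Copying form from above: builds a new table without columns 1 and 3
dt_small <- dt[, -c(1, 3), with = FALSE]
names(dt_small)

# By-reference alternative: translate positions to names, then use := NULL
drop_names <- names(dt)[c(1, 3)]
dt[, (drop_names) := NULL]
names(dt)
```

    Looking up the names first also makes it easy to print drop_names and confirm the right columns are about to be removed.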






























    • Plus 1. And in v1.9.8 soon on CRAN, you no longer need the with=F part.

      – Matt Dowle
      Nov 15 '16 at 2:43



















    0














    Suppose your DT has columns col1, col2, col3, col4, col5, coln.



    To delete a subset of them:



    vx <- as.character(bquote(c(col1, col2, col3, coln)))[-1]  # c("col1", "col2", "col3", "coln")
    DT[, (vx) := NULL]
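    The bquote() expression is only a way of building the vector of names without typing quotes. A short sketch, assuming the data.table package is installed and using a made-up six-column table, shows what it evaluates to and that a parenthesised character vector is enough for the deletion.

```r
library(data.table)

DT <- data.table(col1 = 1, col2 = 2, col3 = 3, col4 = 4, col5 = 5, coln = 6)

# as.character() on the unevaluated call gives "c" followed by the symbols;
# [-1] drops the leading "c"
vx <- as.character(bquote(c(col1, col2, col3, coln)))[-1]
vx

# Wrapping the variable in parentheses tells := to use its contents as names
DT[, (vx) := NULL]
names(DT)
```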































    • this should be comment

      – Sachila Ranawaka
      Feb 24 '17 at 3:30



















    -2














    Here is a function that sets a number of columns to NULL, given their column names:



    deleteColsFromDataTable <- function(train, toDeleteColNames) {
      # := removes each column by reference, so the caller's table is modified too;
      # the with= argument is not needed alongside :=
      for (myNm in toDeleteColNames) {
        train[, (myNm) := NULL]
      }
      return(train)
    }
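    A variant of the same idea, offered as a sketch rather than a fixed recipe: asking := to delete a name that is not in the table can produce a warning or an error depending on the data.table version, so the hypothetical helper drop_cols below first narrows the request to names that are present.

```r
library(data.table)

# Hypothetical helper: drops only the columns that are actually present
drop_cols <- function(dt, cols) {
  present <- intersect(cols, names(dt))
  if (length(present) > 0L) {
    dt[, (present) := NULL]  # by reference: the caller's table changes
  }
  invisible(dt)
}

dt <- data.table(id = 1:3, foo = 4:6, bar = 7:9)
drop_cols(dt, c("foo", "missing_column"))
names(dt)
```

    Because the deletion happens by reference, the return value can be ignored.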





































      -2














      DT[,c:=NULL] # remove column c






































        -7














        For a data.table, assigning the columns to NULL removes them:



        DT[, c("col1", "col2", "col3", "col4")] <- NULL
        # Note the comma after the opening bracket when DT is a data.table


        ... which is the equivalent of:



        DT$col1 <- NULL
        DT$col2 <- NULL
        DT$col3 <- NULL
        DT$col4 <- NULL


        The equivalent for a data.frame is:



        DF[c("col1", "col2", "col3", "col4")] <- NULL
        # Note that there is no comma when DF is a data.frame


        Q. Why is there a comma in the version for data.table, and no comma in the version for data.frame?



        A. Because data.frames are stored as a list of columns, you can skip the comma. You could also add it in, but then you need to assign a list of NULLs: DF[, c("col1", "col2", "col3")] <- list(NULL).
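        The two data.frame forms mentioned in the answer can be tried in base R on a made-up four-column frame:

```r
DF <- data.frame(col1 = 1:3, col2 = 4:6, col3 = 7:9, col4 = 10:12)

# List-style indexing (no comma): assigning NULL drops the named column
DF["col1"] <- NULL
names(DF)

# Matrix-style indexing (with comma): assign a list of NULLs instead
DF[, c("col2", "col3")] <- list(NULL)
names(DF)
```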
































        • As to your question -- please read the data table FAQ

          – mnel
          Mar 31 '14 at 22:05











        • @Arun I can't think of any situation with data.frames where the row and columns would be switched. That would be illogical.

          – duHaas
          Mar 31 '14 at 22:42











        • @Arun I tagged you because your first comment made it seem like there were times at which you might call DF[column,row] so I just wanted to see if there actually were any instances where this happened.

          – duHaas
          Mar 31 '14 at 22:57











        • Updated the answer to remove a typo.

          – Contango
          Apr 2 '14 at 7:30



















The first answer above (Methods 1 to 4b) was answered Feb 8 '12 at 22:27 by Josh O'Brien and edited Nov 15 '16 at 2:39 by Matt Dowle.
















        edited Feb 1 at 9:37 by SeGa
        answered Oct 21 '13 at 20:42 by Ari B. Friedman













        • In data.table 1.11.8, if you want to do it by column name, you can do directly rm.col = c("a","b") and dt[, (rm.col):=NULL]

          – Duccio A
          Dec 10 '18 at 11:08






























        16














        I simply do it the data frame way:



        DT$col = NULL


        It is fast and, as far as I could see, doesn't cause any problems.



        UPDATE: this is not the best method if your DT is very large, because the $<- operator copies the object. It is better to use:



        DT[, col := NULL]
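To illustrate what "by reference" means here, a minimal sketch (the names alias and snapshot are made up for the example): := changes the table in place, so every variable bound to the same table sees the column disappear, while an explicit copy() is unaffected.

```r
library(data.table)

DT <- data.table(id = 1:3, col = c("x", "y", "z"))
alias <- DT            # plain assignment: both names refer to the same table
snapshot <- copy(DT)   # explicit deep copy, independent of later := calls

DT[, col := NULL]      # removes the column in place

names(DT)        # "id"
names(alias)     # "id"        (alias sees the change)
names(snapshot)  # "id" "col"  (the copy keeps the column)
```

If the original table has to stay intact, take the copy() first and delete from the copy.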













            edited May 21 '15 at 9:34
            answered May 19 '13 at 20:39 by msp























                4














                A very simple option if you have many individual columns to delete from a data.table and want to avoid typing all the column names (care advised):



                dt <- dt[, -c(1, 4, 6, 17, 83, 104), with = FALSE]


                This removes columns by column number instead.



                It is not as efficient, because it bypasses data.table's by-reference advantages, but if you are working with fewer than, say, 500,000 rows it works fine.
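A short sketch of this approach on a toy table (the table and the names by_position and by_name are invented for illustration). It also shows the same negative selection written with column names, which keeps working if the column order changes.

```r
library(data.table)

dt <- data.table(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# Negative positions drop columns. The result is a new data.table,
# so it has to be assigned to something.
by_position <- dt[, -c(1L, 3L), with = FALSE]

# The same selection by name, using ! to negate a character vector.
by_name <- dt[, !c("a", "c"), with = FALSE]

names(by_position)
names(by_name)
names(dt)  # the original table still has all four columns
```

Unlike := or set(), neither call modifies dt itself.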
















                answered Jul 3 '15 at 2:02 by SJDS













                • Plus 1. And in v1.9.8 soon on CRAN, you no longer need the with=F part.

                  – Matt Dowle
                  Nov 15 '16 at 2:43






























                0














                Suppose your DT has columns col1, col2, col3, col4, col5, coln.



                To delete a subset of them, build a character vector of the names and pass it as the left-hand side of := :



                vx <- as.character(bquote(c(col1, col2, col3, coln)))[-1]
                DT[, paste0(vx):=NULL]













                edited Feb 24 '17 at 7:32 by iled
                answered Feb 24 '17 at 2:30 by Ricardo Paixao













                • this should be comment

                  – Sachila Ranawaka
                  Feb 24 '17 at 3:30






























                -2














                Here is a way to remove a number of columns given their names, wrapped in a function for your use :)



                deleteColsFromDataTable <- function(train, toDeleteColNames) {
                  for (myNm in toDeleteColNames) {
                    # Parentheses make := treat myNm as a variable holding the column name;
                    # with = FALSE is no longer used together with :=.
                    train[, (myNm) := NULL]
                  }
                  return(train)
                }
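A hedged variant of the same idea: drop_if_present is a hypothetical helper (not part of data.table) that skips names that are not in the table and removes the rest in a single := call instead of a loop. Like the function above, it modifies the table passed in, by reference.

```r
library(data.table)

# Hypothetical helper: remove the columns in `cols` that actually exist in DT.
drop_if_present <- function(DT, cols) {
  present <- intersect(cols, names(DT))
  if (length(present) > 0L) {
    DT[, (present) := NULL]   # one in-place deletion for all matching columns
  }
  invisible(DT)
}

DT <- data.table(a = 1:2, b = 3:4, c = 5:6)
drop_if_present(DT, c("b", "not_a_column"))
names(DT)
```

Filtering with intersect() first means a misspelled or already-removed name is silently ignored, which may or may not be what you want.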
















                    answered Apr 14 '14 at 9:22 by user3531326























                        -2














                        DT[,c:=NULL] # remove column c













                            edited Nov 15 '16 at 4:50 by Serjik
                            answered Nov 15 '16 at 2:25 by Durga Gaddam























                                -7














                                For a data.table, assigning the column to NULL removes it:



                                DT[, c("col1", "col2", "col3", "col4")] <- NULL
                                   ^
                                   |---- Notice the extra comma if DT is a data.table


                                ... which is the equivalent of:



                                DT$col1 <- NULL
                                DT$col2 <- NULL
                                DT$col3 <- NULL
                                DT$col4 <- NULL


                                The equivalent for a data.frame is:



                                DF[c("col1", "col2", "col3", "col4")] <- NULL
                                   ^
                                   |---- Notice the missing comma if DF is a data.frame


                                Q. Why is there a comma in the version for data.table, and no comma in the version for data.frame?



                                A. Because a data.frame is stored as a list of columns, you can skip the comma. You can also include it, but then you need to assign a list of NULLs: DF[, c("col1", "col2", "col3")] <- list(NULL).














                                edited May 4 '14 at 13:39
                                community wiki (4 revs, 2 users 98%), Contango














                                • As to your question -- please read the data table FAQ

                                  – mnel
                                  Mar 31 '14 at 22:05











                                • @Arun I can't think of any situation with data.frames where the row and columns would be switched. That would be illogical.

                                  – duHaas
                                  Mar 31 '14 at 22:42











                                • @Arun I tagged you because your first comment made it seem like there were times at which you might call DF[column,row] so I just wanted to see if there actually were any instances where this happened.

                                  – duHaas
                                  Mar 31 '14 at 22:57











                                • Updated the answer to remove a typo.

                                  – Contango
                                  Apr 2 '14 at 7:30




































