Concatenating select columns of a pandas DataFrame while ignoring blanks in those columns












I have a data frame which looks like this.



key A1 A2 A3 BX CX DX
1 X1 Y1 B1 C1 D1
2 X2 Z2 B2 C2 D2
3 X3 B3 C3 D3
4 X4 B4 C4 D4
5 B5 C5 D5


I am trying to create a new column 'NC' by concatenating columns A1, A2 and A3. If a column has no entry, it should be skipped and the next column's value should move up in its place. The separator can be a "," or a "_".



The final df should look like this:



key A1 A2 A3 BX CX DX NC
1 X1 Y1 B1 C1 D1 X1_Y1
2 X2 Z2 B2 C2 D2 X2_Z2
3 X3 B3 C3 D3 X3
4 X4 B4 C4 D4 X4
5 B5 C5 D5


If there are no entries in A1-A3, the entry in NC should remain blank. I have looked at other posts on SO and tried several approaches, but I can't seem to get it right. The entries in the A1-A3 columns are floats, which sometimes have a trailing .0 after the number (X2.0). I also want to drop the decimal point and the 0. Hoping someone more knowledgeable can show me the way.



Edit: I have updated the question to change the data type in the data frame:



key A1  A2  A3  BX CX DX 
1 1.0 2.0 B1 C1 D1
2 3 4 B2 C2 D2
3 7.0 B3 C3 D3
4 5 6.0 7.0 B4 C4 D4
5 B5 C5 D5


The new df should look like this:



key A1  A2  A3  BX CX DX NC
1 1.0 2.0 B1 C1 D1 1_2
2 3 4 B2 C2 D2 3_4
3 7.0 B3 C3 D3 7
4 5 6.0 7.0 B4 C4 D4 5_6_7
5 B5 C5 D5









Tags: python, pandas

asked Nov 25 '18 at 1:01 by Acinonyx (edited Nov 25 '18 at 1:24)
























          1 Answer














          You can use filter to select the A columns, and agg with axis=1 to join the non-blank values in each row:



          # Select the columns whose names contain 'A' (A1, A2, A3)
          v = df.filter(like='A')
          # Empty strings are falsy, so this mask turns blanks into NaN,
          # which Series.dropna can then remove from each row.
          df['NC'] = v[v.astype(bool)].agg(lambda x: '_'.join(x.dropna()), axis=1)
          # Or,
          # df['NC'] = v[v.astype(bool)].agg(
          #     lambda x: x.dropna().str.cat(sep='_'), axis=1)
          print(df)
          key A1 A2 A3 BX CX DX NC
          0 1 X1 Y1 B1 C1 D1 X1_Y1
          1 2 X2 Z2 B2 C2 D2 X2_Z2
          2 3 X3 B3 C3 D3 X3
          3 4 X4 B4 C4 D4 X4
          4 5 B5 C5 D5
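
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is hypothetical: the table in the question is flattened, so which of A1-A3 holds each value is an assumption, and blanks are assumed to be empty strings. It also names the columns explicitly and uses mask plus apply in place of the astype(bool) mask and agg, which is a substitution on my part rather than the code from the answer.

```python
import pandas as pd

# Hypothetical sample data; the position of each value within A1..A3
# is an assumption, and blanks are assumed to be empty strings.
df = pd.DataFrame({
    "key": [1, 2, 3, 4, 5],
    "A1": ["X1", "X2", "X3", "X4", ""],
    "A2": ["Y1", "", "", "", ""],
    "A3": ["", "Z2", "", "", ""],
    "BX": ["B1", "B2", "B3", "B4", "B5"],
    "CX": ["C1", "C2", "C3", "C4", "C5"],
    "DX": ["D1", "D2", "D3", "D4", "D5"],
})

# Name the columns explicitly so unrelated columns containing 'A'
# are never picked up by accident.
cols = ["A1", "A2", "A3"]
v = df[cols]

# Replace empty strings with NaN so dropna can discard them.
masked = v.mask(v.eq(""))

# Join the remaining values of each row; an all-blank row gives ''.
df["NC"] = masked.apply(lambda row: "_".join(row.dropna()), axis=1)

print(df["NC"].tolist())
# ['X1_Y1', 'X2_Z2', 'X3', 'X4', '']
```

Swapping "_" for "," in the join gives the comma-separated variant mentioned in the question.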





          answered Nov 25 '18 at 1:06 by coldspeed













          • Can you take a look at the question again? I changed the data type. Thanks.

            – Acinonyx
            Nov 25 '18 at 1:45











          • @Acinonyx Try '_'.join(x.dropna().astype(int).astype(str)) inside the lambda function (keeping everything else the same). If that doesn't work, try something simpler: '_'.join(x.dropna().astype(str))

            – coldspeed
            Nov 25 '18 at 2:11
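
To make the suggestion in the comment above concrete, here is a hedged sketch for the numeric version of the question. The frame is hypothetical (the placement of values within A1-A3 is assumed, and blanks are assumed to be NaN), and join_row is an illustrative helper name, not something from the original answer. Note that astype(str) on its own would keep the trailing ".0", which is why the cast to int comes first.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data; which column holds each value is an
# assumption, and missing entries are assumed to be NaN.
df = pd.DataFrame({
    "key": [1, 2, 3, 4, 5],
    "A1": [1.0, 3.0, 7.0, 5.0, np.nan],
    "A2": [2.0, np.nan, np.nan, 6.0, np.nan],
    "A3": [np.nan, 4.0, np.nan, 7.0, np.nan],
})

cols = ["A1", "A2", "A3"]


def join_row(row, sep="_"):
    """Join the non-missing values of one row as integers (illustrative helper)."""
    # Drop NaN, cast 1.0 -> 1 to lose the ".0", then convert to text.
    return sep.join(row.dropna().astype(int).astype(str))


df["NC"] = df[cols].apply(join_row, axis=1)

print(df["NC"].tolist())
# ['1_2', '3_4', '7', '5_6_7', '']
```

One caveat: astype(int) truncates, so a value such as 2.5 would become 2. That is only safe if the floats are known to be whole numbers, as they appear to be in the question.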




















