How to extract substrings between brackets while ignoring those between nested brackets in Python?












I have a string:



phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'


How can I extract only the substrings that are enclosed between brackets and that do not contain any brackets within each substring? So, from my example I require two outputs: "s2:0.4186036213,s3:0.4186036213" and "s4:0.1429514535,s5:0.1429514535".










asked Nov 23 '18 at 4:55 by user3329732
























          3 Answers
You can use regular expressions:

import re

phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'
re.findall(r'\(([^()]*)\)', phy)
# ['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']

This captures every run of non-bracket characters enclosed directly between an opening and a closing bracket. It does not, however, validate that the brackets are correctly nested.
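Because the pattern alone does not check nesting, one option is to count bracket depth before extracting. The following is a minimal sketch and not part of the original answer: the function name `innermost_groups` and the shortened sample string are assumptions made for illustration.

```python
import re


def innermost_groups(text):
    """Return the contents of innermost (...) groups.

    Raises ValueError if the round brackets in `text` are unbalanced.
    """
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                raise ValueError("closing bracket without a matching opening bracket")
    if depth != 0:
        raise ValueError("opening bracket without a matching closing bracket")
    # Input is balanced, so the simple pattern is safe to apply.
    return re.findall(r'\(([^()]*)\)', text)


# Shortened version of the string from the question (values truncated for readability).
phy = '(s1:0.65,((s2:0.41,s3:0.41):0.14,((s4:0.14,s5:0.14):0.16,s6:0.31):0.24):0.08);'
print(innermost_groups(phy))
# ['s2:0.41,s3:0.41', 's4:0.14,s5:0.14']
```

The depth counter only tracks round brackets; input that mixes bracket types would need a stack of expected closing characters instead.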







answered Nov 23 '18 at 5:52 by schwobaseggl

























Try this single pass over the string, which remembers the most recent opening bracket:

# Map each opening bracket to its matching closing bracket.
bracket_dict = {
    '(': ')',
    '{': '}',
    '[': ']',
}

phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654);'
inner_items = []
brackets = []
start_index = None

for i in range(len(phy)):
    if phy[i] in bracket_dict:
        # Remember the position of the most recent opening bracket.
        start_index = i
        brackets.append(phy[i])
    # .get() returns None when the last recorded bracket is a closing one,
    # so only a close that directly follows an open yields an item.
    elif brackets and phy[i] == bracket_dict.get(brackets[-1]):
        inner_items.append(phy[start_index + 1:i])
        brackets.append(phy[i])
print(inner_items)
# ['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']






answered Nov 23 '18 at 5:59 by Biswadip Mandal























Use regex:

import re

reg = re.compile(r'[(]([^()]+)[)]')

phy = '(s1:0.6507212936,((s2:0.4186036213,s3:0.4186036213):0.1428084058,((s4:0.1429514535,s5:0.1429514535):0.1695879844,s6:0.3125394379):0.2488725892):0.08930926654)'

print(reg.findall(phy))

Output:

C:\Users\Desktop>py x.py
['s2:0.4186036213,s3:0.4186036213', 's4:0.1429514535,s5:0.1429514535']
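The `+` quantifier here requires at least one character between the brackets, so an empty pair `()` is skipped, whereas a pattern using `[^()]*` would return an empty string for it. A small sketch of the difference, using a hypothetical sample string that is not from the question:

```python
import re

# Hypothetical input containing an empty bracket pair.
sample = '(a,(),(b,c))'

# '+' needs at least one non-bracket character, so '()' is ignored.
plus_matches = re.findall(r'\(([^()]+)\)', sample)
# '*' also accepts zero characters, so '()' yields an empty string.
star_matches = re.findall(r'\(([^()]*)\)', sample)

print(plus_matches)  # ['b,c']
print(star_matches)  # ['', 'b,c']
```

For the string in the question the two patterns give the same result, since it contains no empty pairs.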






answered Nov 23 '18 at 6:07 by Rarblack





























