Regex to match words in a sentence by its prefix












I have this regex in a MongoDB query to match words by prefix:



{sentence: new RegExp('^'+key,'gi')}


What would be the right regex pattern if I want it to match a sentence that has at least one word starting with the key prefix? For example, given the sentence

"This is a dog"

when key is 'do', it should match that sentence, since 'do' is a prefix of 'dog'.

My current solution only works for the first word of the sentence: it matches only if I type 't', 'th', or 'this'. It doesn't match when I type 'i' (a prefix of 'is') or 'do' (a prefix of 'dog').










Tags: javascript, regex, mongodb














edited Jan 29 '12 at 9:22 by gdoron

asked Jan 29 '12 at 9:20 by Benny Tjia
























4 Answers


















Use the \b anchor to match word boundaries:

\bdo

finds 'do' in 'nice dog', but doesn't match 'much ado about nothing'.
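
As a quick check of the claim above, here is a minimal sketch using a regex literal in plain JavaScript, outside of MongoDB. The variable name startsWithDo is made up for illustration, and the i flag is only there to mirror the case-insensitive matching in the question.

```javascript
// Regex literal: \b is written with a single backslash here.
const startsWithDo = /\bdo/i;

console.log(startsWithDo.test('nice dog'));               // true: "dog" starts with "do"
console.log(startsWithDo.test('This is a dog'));          // true
console.log(startsWithDo.test('much ado about nothing')); // false: "do" only occurs inside "ado"
```

The g flag is left out on purpose: with g, test() keeps state in lastIndex between calls, which can give surprising results when the same regex object is reused.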

















answered Jan 29 '12 at 9:23 by wutz








• If you're actually going to have it in a string (as opposed to a regex literal) you should escape the backslash: '\\bdo'.
  – Tikhon Jelvis, Jan 29 '12 at 9:25

• Good point, thanks
  – wutz, Jan 29 '12 at 9:25

• Thanks for the help. I upvoted all the answers since they are all correct. I don't know which one to accept, so I'll go with the earliest :)
  – Benny Tjia, Jan 29 '12 at 9:42














You can use the expression /\bprefix\w+/. This should match any word starting with "prefix" (followed by at least one more word character). Here \b represents a word boundary and \w is any word character.

If you don't want to get the whole word, you can just use /\bprefix/. If you want to put this in a string, you also have to escape the backslash: '\\bprefix'.
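
To illustrate the escaping point, here is a small sketch comparing a single and a doubled backslash when the pattern is built from a string. The names prefix, wrong and right are placeholders chosen for this example only.

```javascript
const prefix = 'do'; // hypothetical search key

// In a string literal, '\b' is the backspace character (U+0008), not a word boundary.
const wrong = new RegExp('\bdo', 'i');
// Doubling the backslash passes the two characters \b (and \w) through to the regex engine.
const right = new RegExp('\\b' + prefix + '\\w+', 'i');

console.log(wrong.test('This is a dog')); // false
console.log(right.test('This is a dog')); // true
console.log(right.source);                // \bdo\w+
```

Note that because of the + quantifier, this pattern will not match a word that is exactly equal to the prefix, for example 'do' in 'do it'.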















edited Jan 29 '12 at 9:33

answered Jan 29 '12 at 9:23 by Tikhon Jelvis








• Do you need the \b after \w+? Surely the word boundary isn't a word character...
  – nnnnnn, Jan 29 '12 at 9:32

• @nnnnnn Yeah, you're right.
  – Tikhon Jelvis, Jan 29 '12 at 9:33














The other answers suggesting word-boundary matching are neat, but they mean that an index can't be used efficiently. If you need fast lookups, you might want to consider adding a field "words" with each of your words broken out, i.e.

{sentence: "This is a dog",
words: ["This", "is", "a", "dog"]}

After putting an index on the words field, you can go back to using:

{words: new RegExp('^'+key,'gi')}

and a key of "do" will now match this document and can use the index.
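
A rough sketch of how the "words" field and the prefix query could be built is shown below. It only constructs plain objects and makes no driver calls. The helpers toWords, escapeRegExp, buildDocument and buildPrefixQuery are hypothetical names, not part of any library. One deliberate variation from the query above, which is my assumption rather than the answer's: the words are stored in lowercase and the i flag is dropped, because MongoDB's documentation notes that case-insensitive regex queries generally cannot use indexes effectively.

```javascript
// Hypothetical helper: split a sentence into lowercase words for a "words" field.
function toWords(sentence) {
  return sentence.toLowerCase().match(/\w+/g) || [];
}

// Hypothetical helper: escape regex metacharacters in user input.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build the document to store (a plain object; nothing is written to a database here).
function buildDocument(sentence) {
  return { sentence: sentence, words: toWords(sentence) };
}

// Build an anchored, case-sensitive prefix query against the lowercased words.
function buildPrefixQuery(key) {
  return { words: new RegExp('^' + escapeRegExp(key.toLowerCase())) };
}

const doc = buildDocument('This is a dog');
const query = buildPrefixQuery('Do');

console.log(doc.words);                                  // [ 'this', 'is', 'a', 'dog' ]
// Local stand-in for how a regex query on an array field matches any element.
console.log(doc.words.some((w) => query.words.test(w))); // true
```

The trade-off is extra storage and having to keep the words field in sync whenever the sentence changes.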

















answered Jan 30 '12 at 7:15 by Nic Cottrell























^ matches the beginning of the string (or the beginning of a line if the multiline flag is set).

\b matches a word boundary.

\bdo matches words beginning with "do".

So for your example:

{sentence: new RegExp('\\b'+key,'gi')}

(Note that in a JavaScript string you have to escape backslashes.)

If you need to capture the match(es) to find out which word(s) matched the pattern, wrap the expression in parentheses and add a bit to match the rest of the word:

new RegExp('(\\b' + key + '\\w*)','gi')

where \w is any word character and * means zero or more. If you want words that have at least one character more than the key, use + instead of *.

See the many regex guides on the web for more details, e.g., https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
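
For the capturing case, a minimal client-side sketch might look like the following. The functions wordsWithPrefix and escapeRegExp are hypothetical names introduced here. Escaping the key is an extra precaution I have added, on the assumption that the key comes from user input and may contain regex metacharacters.

```javascript
// Hypothetical helper: escape regex metacharacters so the key is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return every word in the sentence that starts with the given key.
function wordsWithPrefix(sentence, key) {
  const pattern = new RegExp('\\b' + escapeRegExp(key) + '\\w*', 'gi');
  // With the g flag, String.prototype.match returns all matches (or null if none).
  return sentence.match(pattern) || [];
}

console.log(wordsWithPrefix('This is a dog', 'do')); // [ 'dog' ]
console.log(wordsWithPrefix('This is a dog', 'i'));  // [ 'is' ]
console.log(wordsWithPrefix('This is a dog', 'x'));  // []
```

The parentheses from the pattern above are omitted because match() with the g flag already returns the full matched words.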















edited Jan 29 '12 at 9:31

answered Jan 29 '12 at 9:25 by nnnnnn





























