Split or partition string after certain words
Let me start by saying I've googled extensively for quite a few hours before asking this here, and I'm quite desperate if I've chosen to post here.
I have a few strings with the following format (approximated):
"firstword text ONE lastword"
"firstword text TWO lastword"
I need to extract the text after the 'firstword' and before 'ONE' or 'TWO'.
So my output for the aforementioned strings would have to be:
"text"
How do I split or partition the string so I can:
- remove the first word (I already know how to do this with str.split(' '))
- retain the text which comes before either 'ONE' or 'TWO'. (I thought it would look something like str.split('ONE' | 'TWO'), but that obviously doesn't work, and I haven't managed to find a solution so far.)
If possible, I would like to solve it with split() or partition(), but regex would be fine as well.
Thank you for your help and sorry if this is a dumb question.
python regex string split
asked Nov 19 at 12:28
remus2232
Possible duplicate of find-string-between-two-substrings
– Mayank Porwal
Nov 19 at 12:36
5 Answers
accepted
You can use this regex, which uses a positive lookbehind and a positive lookahead:
(?<=firstword)\s*(.*?)\s*(?=ONE|TWO)
Explanation:
(?<=firstword) --> Positive lookbehind to ensure the matched text is preceded by firstword
\s* --> Consumes any whitespace
(.*?) --> Captures your intended data
\s* --> Consumes any whitespace
(?=ONE|TWO) --> Positive lookahead to ensure the matched text is followed by ONE or TWO
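A minimal sketch of how a pattern like this might be applied from Python, assuming the standard `re` module. The helper name `extract_between` is hypothetical and only there for illustration. Note that `ONE` and `TWO` are matched as plain substrings here, so a word such as `STONE` would also end the match.

```python
import re

# Lookbehind for the start word, lazy capture, lookahead for either end word.
PATTERN = re.compile(r"(?<=firstword)\s*(.*?)\s*(?=ONE|TWO)")

def extract_between(s):
    """Return the text between 'firstword' and the first ONE or TWO, or None."""
    match = PATTERN.search(s)
    return match.group(1) if match else None

print(extract_between("firstword text ONE lastword"))
print(extract_between("firstword text TWO extra text which needs to be deleted"))
print(extract_between("no markers here"))
```

The capture group holds the text without the surrounding whitespace, because both `\s*` parts sit outside the parentheses.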
answered Nov 19 at 12:32
Pushpesh Kumar Rajwanshi
This is indeed a good solution. I will accept it as the answer as it solved my specific query. It does leave me wondering how I would solve this with split() or partition(), though. Is it possible?
– remus2232
Nov 19 at 13:09
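On the split()/partition() question, one possible approach is sketched below. It assumes plain substring matching (no word boundaries), and the function name `between_with_partition` and its parameters are made up for this example. `str.partition` returns a 3-tuple of (before, separator, after), and the separator is an empty string when it was not found.

```python
def between_with_partition(s, start="firstword", end_words=("ONE", "TWO")):
    # Keep only what follows the start word; found is "" if it is missing.
    _, found, rest = s.partition(start)
    if not found:
        return ""
    # Text before each end word that actually occurs in the remainder.
    cuts = [rest.partition(w)[0] for w in end_words if w in rest]
    if not cuts:
        return ""
    # The shortest prefix belongs to the end word that occurs earliest.
    return min(cuts, key=len).strip()

print(between_with_partition("firstword text ONE extra text which needs to be deleted"))
```

Taking the shortest prefix means the result stops at whichever end word comes first, regardless of the order in `end_words`.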
When you split the string on spaces, you get a list of all the words, and you can then pick the word you want:
s = "firstword text TWO lastword"
l = s.split(" ")  # l = ["firstword", "text", "TWO", "lastword"]
print(l[1])  # prints: text
or
s = "firstword text TWO lastword"
print(s.split(" ")[1])
answered Nov 19 at 12:37
Ali Kargar
The problem with this is that my string can have any length after the ONE or TWO. I'm looking to remove everything that comes after the ONE or TWO; it might be 1 word or 10 words. Sorry for not being more specific. A more realistic example of the string I'm working with is: firstword text ONE extra text which needs to be deleted
– remus2232
Nov 19 at 12:49
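For strings with an arbitrary tail, the idea behind str.split('ONE' | 'TWO') can be expressed with `re.split`, which accepts an alternation. The following is only a sketch: the function name `text_before_end_word` is hypothetical, and it assumes the first word is always the one to drop. If neither end word is present, it returns everything after the first word.

```python
import re

def text_before_end_word(s):
    # Split once at the first whole-word ONE or TWO and keep the left-hand side.
    head = re.split(r"\s*\b(?:ONE|TWO)\b", s, maxsplit=1)[0]
    # Drop the first word; split(None, 1) splits at the first run of whitespace.
    parts = head.split(None, 1)
    return parts[1] if len(parts) > 1 else ""

print(text_before_end_word("firstword text ONE extra text which needs to be deleted"))
```

The `\b` word boundaries keep the pattern from cutting inside longer words that merely contain ONE or TWO.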
Try this:
str_list = ["firstword text ONE lastword", "firstword text TWO lastword", "any text u entered before firstword text ONE", "firstword text TWO any text After"]
end_key_lst = ['ONE', 'TWO']
# For each string that contains an end key as a whole word, keep the text before the key,
# then keep what follows 'firstword' and strip the surrounding whitespace.
print(list(map(lambda x: x.split('firstword')[-1].strip(), [''.join(val.split(end_key)[:-1]) for val in str_list for end_key in end_key_lst if end_key in val.split()])))
Result: ['text', 'text', 'text', 'text']
How I do this:
You may have a number of strings like those, so I kept them in a list and put the end keys, such as ONE and TWO, in another list.
I use a list comprehension and the map function to build the desired target list.
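The same idea can be unrolled into an explicit loop, which may be easier to follow. This is a sketch rather than an exact equivalent: the name `extract_all` is hypothetical, it cuts at the first occurrence of the end key instead of the last, and it stops after the first end key found so each string yields at most one result.

```python
def extract_all(strings, start="firstword", end_words=("ONE", "TWO")):
    results = []
    for s in strings:
        words = s.split()
        for end in end_words:
            if end in words:  # whole-word check on the split string
                before_end = s.split(end)[0]               # text before the end key
                after_start = before_end.split(start)[-1]  # text after the start word
                results.append(after_start.strip())
                break  # at most one result per string
    return results

str_list = [
    "firstword text ONE lastword",
    "firstword text TWO lastword",
    "any text u entered before firstword text ONE",
    "firstword text TWO any text After",
]
print(extract_all(str_list))
```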
answered Nov 19 at 12:54
Narendra Lucky
You can use a regex like:
import re
string = "firstword text TWO lastword"
re.search(r'firstword\s+(\w+)\s+(?:ONE|TWO)', string).group(1)
'text'
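A detail worth checking in patterns of this kind: square brackets form a character class, so `[ONE|TWO]` matches a single character out of O, N, E, |, T, W, whereas the group `(?:ONE|TWO)` matches one of the two whole words. The short comparison below illustrates the difference; the variable names are made up for the example.

```python
import re

char_class = re.compile(r"firstword\s+(\w+)\s+[ONE|TWO]")
group = re.compile(r"firstword\s+(\w+)\s+(?:ONE|TWO)")

s = "firstword text Other lastword"  # neither ONE nor TWO follows "text"

print(bool(char_class.search(s)))  # True: the single letter "O" satisfies the class
print(bool(group.search(s)))       # False: the full word ONE or TWO is required
```

Note also that `(\w+)` captures a single word only, so text spanning several words between the markers needs a different capture such as `(.*?)`.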
answered Nov 19 at 12:57
Franco Piccolo
Actually, there's no need to use regex. You can store the required separators in a list and then check whether they exist.
orig_text = "firstword text ONE lastword"
first_separator = "firstword"
# Place all "end words" here
last_separators = ["ONE", "TWO"]
output = []
# Split the original text into a list of words
orig_text = orig_text.split(" ")
# Check that "firstword" is present, just in case
if first_separator in orig_text:
    # Check whether "ONE" or "TWO" is in the text
    for i in last_separators:
        if i in orig_text:
            # Take everything between "firstword" and "ONE"/"TWO"
            output = orig_text[orig_text.index(first_separator)+1 : orig_text.index(i)]
            break
# Convert back to a string
output = " ".join(output)
print(output)
Here's an example of outputs:
"firstword text TWO lastword" -> "text"
"firstword hello world ONE" -> "hello world"
"first text ONE" -> ""
"firstword text" -> ""
answered Nov 19 at 13:14
OSA413
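A variation on the word-list approach, wrapped in a function so it can be reused. This is a sketch under a few assumptions: the name `between_words` and its parameters are hypothetical, words are separated by whitespace, and when both end words occur the earliest one after the start word wins (rather than the first one listed).

```python
def between_words(text, first="firstword", last_words=("ONE", "TWO")):
    words = text.split()
    if first not in words:
        return ""
    start = words.index(first) + 1
    # Positions of every end word that appears after the start word.
    ends = [i for i, w in enumerate(words) if i >= start and w in last_words]
    if not ends:
        return ""
    return " ".join(words[start:min(ends)])

for sample in ("firstword text TWO lastword", "firstword hello world ONE",
               "first text ONE", "firstword text"):
    print(repr(sample), "->", repr(between_words(sample)))
```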