Python - how to add a new line every time a pattern is found in a string?
How can I add a new line every time one of the patterns from a list of regexes is found in a string?
I am using Python 3.6.
I have the following input:
12.13.14 Here is supposed to start a new line.
12.13.15 Here is supposed to start a new line.
Here is some text. It is written in one lines. 12.13. Here is some more text. 2.12.14. Here is even more text.
I wish to have the following output:
12.13.14
Here is supposed to start a new line.
12.13.15
Here is supposed to start a new line.
Here is some text. It is written in one lines.
12.13.
Here is some more text.
2.12.14.
Here is even more text.
My first attempt returns output that is identical to the input:
in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'
start_rx = re.compile('|'.join(
    ['\d\d\.\d\d\.', '\d\.\d\d\.\d\d', '\d\d\.\d\d\.\d\d']))
with open(in_file2, 'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text_list = fin2.read().split()
    fin2.seek(0)
    for string in fin2:
        if re.match(start_rx, string):
            string = str.replace(start_rx, '\n\n' + start_rx + '\n')
        fout2.write(string)
My second attempt raises an error: TypeError: unsupported operand type(s) for +: '_sre.SRE_Pattern' and 'str'
in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'
start_rx = re.compile('|'.join(
    ['\d\d\.\d\d\.', '\d\.\d\d\.\d\d', '\d\d\.\d\d\.\d\d']))
with open(in_file2, "r") as fin2, open(out_file2, 'w') as fout3:
    for line in fin2:
        start = False
        if re.match(start_rx, line):
            start = True
        if start == False:
            print('do something')
        if start == True:
            line = '\n' + line  # space before the position number
            line = line.replace(start_rx, start_rx + '\n')
        fout3.write(line)
regex python-3.x replace
asked Nov 23 '18 at 10:33
Mady
Note that you are trying to use the str.replace method with a regex, but it does not accept a regex. You need re.sub. Try text = fin2.read() and then fout2.write(re.sub(r'\s*(\d+(?:\.\d+)+\.?)\s*', r'\n\n\1\n', text)), too. See this demo.
– Wiktor Stribiżew
Nov 23 '18 at 10:54
This fixed the problem. Thank you.
– Mady
Nov 23 '18 at 11:05
2 Answers
First of all, to search and replace with a regex, you need to use re.sub, not str.replace.
Second, if you use re.sub, you can't use the regex pattern inside the replacement pattern; you need to group the parts of the regex you want to keep and use backreferences in the replacement (or, if you just want to refer to the whole match, use the \g<0> backreference, in which case no capturing groups are required).
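To make the first two points concrete, here is a minimal sketch. The sample string and variable names are made up for illustration; only str.replace and re.sub come from the discussion above.

```python
import re

text = "Item 12.13.14 follows"

# str.replace treats its first argument as literal text, so a regex-looking
# string is not found in the input and the text comes back unchanged.
literal = text.replace(r"\d+\.\d+\.\d+", "X")

# re.sub with a capturing group: \1 in the replacement refers to group 1.
grouped = re.sub(r"(\d+\.\d+\.\d+)", r"\n\1\n", text)

# re.sub without any group: \g<0> refers to the whole match.
whole = re.sub(r"\d+\.\d+\.\d+", r"\n\g<0>\n", text)

print(repr(literal))
print(repr(grouped))
print(repr(whole))
```

Both re.sub calls should produce the same string here; which form to prefer is mostly a question of whether you need to keep only part of the match.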
Third, when you build an unanchored alternation pattern, make sure longer alternatives come first, i.e. start_rx = re.compile('|'.join([r'\d\d\.\d\d\.\d\d', r'\d\.\d\d\.\d\d', r'\d\d\.\d\d\.'])). However, you may write a more precise pattern by hand here.
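The effect of the ordering can be seen in a small sketch like the one below. The sample text and the names short_first and long_first are assumptions for illustration; the alternatives are the ones from the question.

```python
import re

text = "see 12.13.14 here"

# Shorter alternative listed first: the regex engine tries alternatives left
# to right at each position and stops at the first one that matches.
short_first = re.compile("|".join(
    [r"\d\d\.\d\d\.", r"\d\.\d\d\.\d\d", r"\d\d\.\d\d\.\d\d"]))

# Longer alternatives listed first, as suggested above.
long_first = re.compile("|".join(
    [r"\d\d\.\d\d\.\d\d", r"\d\.\d\d\.\d\d", r"\d\d\.\d\d\."]))

short_match = short_first.search(text).group()
long_match = long_first.search(text).group()

print(short_match)  # only the first two number parts are matched
print(long_match)   # the full three-part number is matched
```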
Here is how your code can be fixed:
with open(in_file2, 'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text = fin2.read()
    fout2.write(re.sub(r'\s*(\d+(?:\.\d+)+\.?)\s*', r'\n\n\1\n', text))
See the Python demo.
The pattern is \s*(\d+(?:\.\d+)+\.?)\s*
See the regex demo.
Details
\s* - 0+ whitespaces
(\d+(?:\.\d+)+\.?) - Group 1 (\1 in the replacement pattern):
    \d+ - 1+ digits
    (?:\.\d+)+ - 1 or more repetitions of . and 1+ digits
    \.? - an optional .
\s* - 0+ whitespaces
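For a quick check without touching any files, the same substitution can be run on the sample input from the question as an in-memory string. This is only a sketch: the helper name put_numbers_on_own_lines and the constant NUMBER_RX are hypothetical, while the pattern and replacement are the ones given above.

```python
import re

# Pattern from the answer: a dotted number with optional surrounding whitespace.
NUMBER_RX = re.compile(r"\s*(\d+(?:\.\d+)+\.?)\s*")


def put_numbers_on_own_lines(text: str) -> str:
    """Surround every dotted number with newlines (hypothetical helper name)."""
    return NUMBER_RX.sub(r"\n\n\1\n", text)


# Sample input copied from the question.
sample = (
    "12.13.14 Here is supposed to start a new line.\n"
    "12.13.15 Here is supposed to start a new line.\n"
    "Here is some text. It is written in one lines. 12.13. "
    "Here is some more text. 2.12.14. Here is even more text."
)

result = put_numbers_on_own_lines(sample)
print(result)
```

Note that this replacement leaves a blank line before every number, including at the very start of the text. If that is not wanted, using r"\n\1\n" as the replacement and calling .lstrip() on the result should give the layout shown in the question.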
edited Nov 23 '18 at 11:12
answered Nov 23 '18 at 11:06
Wiktor Stribiżew
Try this
out_file2 = re.sub(r'(\d+) ', r'\1\n', in_file2)
out_file2 = re.sub(r'(\w+).', r'\1.\n', in_file2)
answered Nov 23 '18 at 10:46
gocen