Python - how to add a new line every time a pattern is found in a string?

How can I add a new line every time a pattern from a list of regexes is found in a string?

I am using Python 3.6.

I have the following input:

12.13.14 Here is supposed to start a new line.

12.13.15 Here is supposed to start a new line.

Here is some text. It is written in one lines. 12.13. Here is some more text. 2.12.14. Here is even more text.

I wish to have the following output:

12.13.14

Here is supposed to start a new line.

12.13.15

Here is supposed to start a new line.

Here is some text. It is written in one lines.

12.13.

Here is some more text.

2.12.14.

Here is even more text.

My first try writes output that is identical to the input:

import re

in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'

start_rx = re.compile('|'.join(
    [r'\d\d\.\d\d\.', r'\d\.\d\d\.\d\d', r'\d\d\.\d\d\.\d\d']))

with open(in_file2, 'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text_list = fin2.read().split()
    fin2.seek(0)

    for string in fin2:
        if re.match(start_rx, string):
            string = str.replace(start_rx, '\n\n' + start_rx + '\n')

        fout2.write(string)

My second try raises an error: TypeError: unsupported operand type(s) for +: '_sre.SRE_Pattern' and 'str'

import re

in_file2 = 'work1-T1.txt'
out_file2 = 'work2-T1.txt'

start_rx = re.compile('|'.join(
    [r'\d\d\.\d\d\.', r'\d\.\d\d\.\d\d', r'\d\d\.\d\d\.\d\d']))

with open(in_file2, "r") as fin2, open(out_file2, 'w') as fout3:
    for line in fin2:
        start = False
        if re.match(start_rx, line):
            start = True
        if start == False:
            print('do something')
        if start == True:
            line = '\n' + line  # whitespace before the position number
            line = line.replace(start_rx, start_rx + '\n')
        fout3.write(line)

asked Nov 23 '18 at 10:33

Mady

Note that you are trying to use the str.replace method with a regex, but it does not accept regexes. You need re.sub. Try text = fin2.read() and then fout2.write(re.sub(r'\s*(\d+(?:\.\d+)+\.?)\s*', r'\n\n\1\n', text)).

– Wiktor Stribiżew
Nov 23 '18 at 10:54

This fixed the problem. Thank you.

– Mady
Nov 23 '18 at 11:05

regex python-3.x replace


2 Answers

First of all, to search and replace with a regex, you need to use re.sub, not str.replace.
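To illustrate the first point, here is a minimal sketch of the difference between the two calls. The sample string is taken from the question; the variable names are only illustrative.

```python
import re

text = "12.13.14 Here is supposed to start a new line."

# str.replace searches for the literal characters it is given, so a regex
# written as a string is never found and the text comes back unchanged.
literal_result = text.replace(r"\d\d\.\d\d\.\d\d", "X")

# re.sub interprets its first argument as a regular expression.
regex_result = re.sub(r"\d\d\.\d\d\.\d\d", "X", text)

print(literal_result == text)  # True
print(regex_result)            # X Here is supposed to start a new line.
```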

Second, when you use re.sub, you can't put the compiled regex object inside the replacement pattern. Instead, group the parts of the match you want to keep and refer to them with backreferences in the replacement (or, if you just want to refer to the whole match, use the \g<0> backreference, in which case no capturing groups are required).
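As a small sketch of both backreference styles, assuming a simplified two-part number pattern and a shortened sample string (the names number_rx, whole_match and grouped are hypothetical):

```python
import re

sample = "Here is some text. 12.13. Here is some more text."
number_rx = re.compile(r"\d\d\.\d\d\.")

# \g<0> stands for the whole match, so no capturing group is needed.
whole_match = number_rx.sub(r"\n\g<0>\n", sample)

# With a capturing group, \1 refers to the text captured by group 1.
grouped = re.sub(r"(\d\d\.\d\d\.)", r"\n\1\n", sample)

print(whole_match == grouped)  # True
```

Both calls put the matched number on its own line; the surrounding spaces are left in place because this simplified pattern does not consume them.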

Third, when you build an unanchored alternation pattern, make sure longer alternatives come first, i.e. start_rx = re.compile('|'.join([r'\d\d\.\d\d\.\d\d', r'\d\.\d\d\.\d\d', r'\d\d\.\d\d\.'])). However, a more precise hand-written pattern may serve better here.
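The effect of the ordering can be seen in a short sketch. It uses only two of the alternatives to keep the comparison small; short_first and long_first are illustrative names.

```python
import re

text = "12.13.14 Here is supposed to start a new line."

# Shorter alternative first: the engine takes the first alternative that
# matches at a position, so only "12.13." is consumed.
short_first = re.compile(r"\d\d\.\d\d\.|\d\d\.\d\d\.\d\d")

# Longer alternative first: the full "12.13.14" is matched.
long_first = re.compile(r"\d\d\.\d\d\.\d\d|\d\d\.\d\d\.")

print(short_first.search(text).group())  # 12.13.
print(long_first.search(text).group())   # 12.13.14
```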

Here is how your code can be fixed:

with open(in_file2, 'r', encoding='utf-8') as fin2, open(out_file2, 'w', encoding='utf-8') as fout2:
    text = fin2.read()
    fout2.write(re.sub(r'\s*(\d+(?:\.\d+)+\.?)\s*', r'\n\n\1\n', text))

The pattern is

\s*(\d+(?:\.\d+)+\.?)\s*

Details

\s* - 0+ whitespaces

(\d+(?:\.\d+)+\.?) - Group 1 (\1 in the replacement pattern):
- \d+ - 1+ digits
- (?:\.\d+)+ - 1 or more repetitions of . and 1+ digits
- \.? - an optional .

\s* - 0+ whitespaces
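To see the pattern at work without touching any files, the following sketch applies it to the sample input from the question, held in an in-memory string (the names sample and result are illustrative).

```python
import re

sample = (
    "12.13.14 Here is supposed to start a new line.\n"
    "12.13.15 Here is supposed to start a new line.\n"
    "Here is some text. It is written in one lines. 12.13. "
    "Here is some more text. 2.12.14. Here is even more text."
)

# Whitespace around each number is consumed and replaced by newlines,
# and the number itself is re-inserted through the \1 backreference.
result = re.sub(r"\s*(\d+(?:\.\d+)+\.?)\s*", r"\n\n\1\n", sample)
print(result)
```

Because the very first number also receives the two leading newlines, the result starts with blank lines; calling result.lstrip() before writing would remove them if that matters.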

edited Nov 23 '18 at 11:12

answered Nov 23 '18 at 11:06

Wiktor Stribiżew


Try this

out_file2 = re.sub(r'(\d+) ', r'\1\n', in_file2)

out_file2 = re.sub(r'(\w+)\.', r'\1.\n', in_file2)

answered Nov 23 '18 at 10:46

gocen
