Easy way to add a combination of tags, text and links in BeautifulSoup?

I am scraping references from a website and appending them to my existing list of references. The scraping itself works, but I can't get the final step to work, which is appending the scraped reference to the existing ones. Let me illustrate:

The reference I managed to scrape:

scraped_ref = 'Case courtesy of Dr Sachintha Hapugoda, <a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case <a href="https://radiopaedia.org/cases/52525">rID: 52525</a> [Accessed 15 Nov. 2018].'

I need to add the following markup before that reference:

<br>3. <b>Image: </b>

Which will then make it:

formatted_ref = '<br>3. <b>Image: </b>Case courtesy of Dr Sachintha Hapugoda, <a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case <a href="https://radiopaedia.org/cases/52525">rID: 52525</a> [Accessed 15 Nov. 2018].'

Then finally add the formatted reference to my pre-existing list of references:

existing_ref = '''<p class="references" style="font-size:15px">1. Mcminn. (2003). Last's Anatomy. Elsevier Australia. ISBN:0729537528. <a href="http://books.google.com/books?vid=ISBN0729537528">Read it at Google Books</a> - <a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a><br>2. Netter, F. H. (2019). Atlas of human anatomy. Philadelphia, PA: Elsevier.</p>'''

I tried this:

for p in soup.find_all("p", {"class": "references"}):
    print(p.append('<br>3. <b>Image: </b>' + scraped_ref))

But in the result I got, all the tags in the new reference were escaped:

<p class="references" style="font-size:15px">1. Mcminn. (2003). Last's Anatomy. Elsevier Australia. ISBN:0729537528. <a href="http://books.google.com/books?vid=ISBN0729537528">Read it at Google Books</a> - <a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a><br/>2. Netter, F. H. (2019). Atlas of human anatomy. Philadelphia, PA: Elsevier.&lt;br&gt;3. &lt;b&gt;Image: &lt;/b&gt;Case courtesy of Dr Sachintha Hapugoda, &lt;a href="https://radiopaedia.org/"&gt;Radiopaedia.org&lt;/a&gt;. From the case &lt;a href="https://radiopaedia.org/cases/52525"&gt;rID: 52525&lt;/a&gt; [Accessed 15 Nov. 2018].</p>

What do I do?

asked Nov 20 '18 at 19:19

Code Monkey


So you are trying to insert some new tags into the HTML and, I assume, then write the resulting HTML out? If so, what is the URL you are scraping?
– Martin Evans
Nov 20 '18 at 20:38

The URL I scraped is not relevant to the question, as I have already scraped it: I wrote the code for scraping the URL and it works. What doesn't work is the code that adds the scraped information to the existing list, hence the question.
– Code Monkey
Nov 20 '18 at 22:11

Having the URL would allow soup to be obtained and your code to be tested.
– Martin Evans
Nov 20 '18 at 22:21

Sure, the URL is specified within the scraped_ref variable above. But I use many other references, too many to list here. My scraping code is complicated and is designed to accommodate various sites. It would only muddle the question.
– Code Monkey
Nov 20 '18 at 23:00
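
One possible way to keep the markup, as a minimal sketch rather than a definitive answer: a plain string passed to append() is inserted as text, so its angle brackets get escaped on output. Parsing the new markup into its own BeautifulSoup object first and then appending its child nodes one by one should preserve the tags. The shortened strings below and the choice of "html.parser" are assumptions made for illustration; the real soup would come from the existing scraping code.

```python
from bs4 import BeautifulSoup

# Shortened stand-ins for the strings in the question (assumed for illustration).
existing_ref = (
    '<p class="references" style="font-size:15px">1. Mcminn. (2003). '
    "Last's Anatomy.<br>2. Netter, F. H. (2019). Atlas of human anatomy.</p>"
)
scraped_ref = (
    'Case courtesy of Dr Sachintha Hapugoda, '
    '<a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case '
    '<a href="https://radiopaedia.org/cases/52525">rID: 52525</a> '
    '[Accessed 15 Nov. 2018].'
)

soup = BeautifulSoup(existing_ref, "html.parser")

for p in soup.find_all("p", {"class": "references"}):
    # Parse the new markup into real nodes; a plain str given to append()
    # is treated as text, which is why the tags came out escaped.
    fragment = BeautifulSoup("<br>3. <b>Image: </b>" + scraped_ref, "html.parser")
    # Copy the child list first: append() moves each node out of the fragment.
    for node in list(fragment.contents):
        p.append(node)

# Print the modified tree itself rather than the return value of append().
print(soup)
```

The fragment is re-parsed inside the loop because appending moves nodes out of it; reusing a single fragment would leave it empty after the first matching paragraph.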

Tags: python, beautifulsoup
