Easy way to add a combination of tags, text and links in BeautifulSoup?












I am scraping references from a website and want to append them to my existing list of references. The scraping works, but I can't get the final step, appending to the existing references, to work. Let me illustrate:



The reference I managed to scrape:



scraped_ref = 'Case courtesy of Dr Sachintha Hapugoda, <a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case <a href="https://radiopaedia.org/cases/52525">rID: 52525</a> [Accessed 15 Nov. 2018].'


I need to add the following markup before that reference:



<br>3. <b>Image: </b>


That would give:



formatted_ref = '<br>3. <b>Image: </b>Case courtesy of Dr Sachintha Hapugoda, <a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case <a href="https://radiopaedia.org/cases/52525">rID: 52525</a> [Accessed 15 Nov. 2018].'


Finally, I want to add the formatted reference to my existing list of references:



existing_ref = """<p class="references" style="font-size:15px">1. Mcminn. (2003). Last's Anatomy. Elsevier Australia. ISBN:0729537528. <a href="http://books.google.com/books?vid=ISBN0729537528">Read it at Google Books</a> - <a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a><br>2. Netter, F. H. (2019). Atlas of human anatomy. Philadelphia, PA: Elsevier.</p>"""


I tried this:



for p in soup.find_all("p", {"class": "references"}):
    p.append('<br>3. <b>Image: </b>' + scraped_ref)
    print(p)


But in the result, the appended tags were escaped as plain text instead of being inserted as markup:



<p class="references" style="font-size:15px">1. Mcminn. (2003). Last's Anatomy. Elsevier Australia. ISBN:0729537528. <a href="http://books.google.com/books?vid=ISBN0729537528">Read it at Google Books</a> - <a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a><br/>
2. Netter, F. H. (2019). Atlas of human anatomy. Philadelphia, PA: Elsevier.&lt;br&gt;3. &lt;b&gt;Image: &lt;/b&gt;Case courtesy of Dr Sachintha Hapugoda, &lt;a href="https://radiopaedia.org/"&gt;Radiopaedia.org&lt;/a&gt;. From the case &lt;a href="https://radiopaedia.org/cases/52525"&gt;rID: 52525&lt;/a&gt; [Accessed 15 Nov.2018].</p>


What do I do?
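
For context, the escaping shown above is consistent with how BeautifulSoup treats a plain `str` passed to `append`: the string becomes a single text node (a `NavigableString`), so any `<` and `>` in it are entity-escaped on output rather than parsed as tags. A minimal reproduction, using a shortened, hypothetical paragraph in place of the full reference list and assuming the built-in `html.parser`:

```python
from bs4 import BeautifulSoup

# Shortened stand-in for the existing references paragraph (hypothetical sample data).
soup = BeautifulSoup('<p class="references">1. First reference.</p>', "html.parser")
p = soup.find("p", class_="references")

# Appending a plain str adds one text node; the markup inside it is never parsed.
p.append('<br>2. <b>Image: </b>Second reference.')

last_node = p.contents[-1]
print(type(last_node).__name__)  # node type of what was appended
print(p.find("b"))               # no <b> tag exists in the tree
print(p)                         # angle brackets are written as &lt; and &gt;
```

The tree never contains a `<b>` or `<br>` element for the new reference, which is why the tags appear to be lost.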










  • So you are trying to insert some new tags into the HTML and I assume then write the resulting HTML out? If so what is the URL you are scraping?
    – Martin Evans
    Nov 20 '18 at 20:38












  • The url I scraped is not relevant to the question as I have already scraped it - I wrote the code for scraping the url and it works. What doesn’t work is the code to include the scraped information into the existing list, hence the question.
    – Code Monkey
    Nov 20 '18 at 22:11










  • Having the URL would allow soup to be obtained and your code to be tested.
    – Martin Evans
    Nov 20 '18 at 22:21










  • Sure, the url is specified within the scraped_ref variable above. But I use many other references, too many to specify here. My scraping code is complicated and is designed to accommodate various sites. It would only muddle the question.
    – Code Monkey
    Nov 20 '18 at 23:00
















python beautifulsoup






asked Nov 20 '18 at 19:19









Code Monkey
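
One way to approach this, offered as a sketch rather than a definitive answer: parse the HTML string into its own small tree first, then move the resulting nodes into the `<p>`. The helper name `append_html` is hypothetical (it is not part of BeautifulSoup), `html.parser` is an assumed parser choice, and `existing_html` is an abbreviated copy of the references paragraph from the question.

```python
from bs4 import BeautifulSoup


def append_html(tag, html, parser="html.parser"):
    """Parse an HTML fragment and append its nodes to `tag` (hypothetical helper)."""
    fragment = BeautifulSoup(html, parser)
    # Iterate over a copy: append() detaches each node from `fragment` as it moves it.
    for node in list(fragment.contents):
        tag.append(node)


# Abbreviated copy of the existing references paragraph from the question.
existing_html = (
    '<p class="references" style="font-size:15px">1. Mcminn. (2003). '
    '<a href="http://www.amazon.com/gp/product/0729537528">Find it at Amazon</a>'
    '<br>2. Netter, F. H. (2019). Atlas of human anatomy.</p>'
)
scraped_ref = (
    'Case courtesy of Dr Sachintha Hapugoda, '
    '<a href="https://radiopaedia.org/">Radiopaedia.org</a>. From the case '
    '<a href="https://radiopaedia.org/cases/52525">rID: 52525</a> '
    '[Accessed 15 Nov. 2018].'
)

soup = BeautifulSoup(existing_html, "html.parser")
for p in soup.find_all("p", class_="references"):
    # Parse per paragraph: nodes are moved, not copied, so a fragment is single-use.
    append_html(p, '<br>3. <b>Image: </b>' + scraped_ref)

print(soup)
```

The fragment is parsed with `html.parser` here because parsers such as `lxml` wrap fragments in `<html>` and `<body>` elements, which would then need to be unwrapped before appending.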
