This function fetches the page at a specified URL and returns the text of every <h1> element, that is, the page's top-level headings (not the contents of the <title> element).
Technology Stack : BeautifulSoup, urllib.request
Code Type : Function
Code Difficulty : Intermediate
from urllib.request import urlopen

from bs4 import BeautifulSoup


def extract_titles(url, parser='html.parser'):
    """Return the text of every <h1> element on the page at `url`."""
    # Fetch the web page content; the with-block closes the connection
    with urlopen(url) as response:
        html_content = response.read()

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(html_content, parser)

    # Extract all <h1> headings from the page
    titles = soup.find_all('h1')

    # Return the text of each heading as a list of strings
    return [title.get_text() for title in titles]
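
The function above needs network access and the third-party bs4 package, which makes the extraction step awkward to check in isolation. The following is a minimal offline sketch of the same idea using only the standard library's html.parser.HTMLParser in place of BeautifulSoup. The names H1Collector and extract_h1_text and the sample markup are hypothetical and not part of the original code, and the sketch assumes <h1> elements are not nested.

```python
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Hypothetical helper: collects the text inside each <h1> element."""

    def __init__(self):
        super().__init__()
        self.headings = []    # finished heading strings
        self._in_h1 = False   # True while the parser is inside an <h1>
        self._parts = []      # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case
        if tag == 'h1':
            self._in_h1 = True
            self._parts = []

    def handle_endtag(self, tag):
        if tag == 'h1' and self._in_h1:
            self.headings.append(''.join(self._parts))
            self._in_h1 = False

    def handle_data(self, data):
        # Text inside child tags such as <em> is also reported here
        if self._in_h1:
            self._parts.append(data)


def extract_h1_text(html):
    """Return the text of every <h1> element in an HTML string."""
    collector = H1Collector()
    collector.feed(html)
    collector.close()
    return collector.headings


# Assumed sample markup, used only for illustration
sample = (
    "<html><head><title>Page title</title></head><body>"
    "<h1>First <em>heading</em></h1><p>Body text</p>"
    "<h1>Second heading</h1></body></html>"
)
print(extract_h1_text(sample))  # ['First heading', 'Second heading']
```

The <title> text does not appear in the result, because only <h1> elements are collected.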