How to Get a Web Page in Python
- Use the urllib Package to Get a Web Page in Python
- Use the requests Package to Get a Webpage in Python
In Python, we can create connections and read data from the web. We can download files over the web and read whole web pages.
This tutorial shows how to get a webpage in Python.
Use the urllib Package to Get a Web Page in Python
The urllib package, part of Python's standard library, is used to fetch web pages and handle URL-related operations. We can use the urllib.request.urlopen() function to retrieve a webpage using its URL.
The urlopen() function opens the given URL and returns a response object. This object has attributes such as headers and status, among others. We can read the webpage by calling the read() method on this object. It returns the full content of the web page as bytes.
See the following example.
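import urllib.request

# The with block closes the connection automatically when it exits
with urllib.request.urlopen("http://www.python.org") as page:
    print(page.read())  # raw bytes of the page body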
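The read() call above returns raw bytes rather than text. The following is a minimal sketch of decoding them, assuming the server declares its charset in the Content-Type header and falling back to UTF-8 when it does not. The fetch_text name and its timeout parameter are chosen for this illustration and are not part of urllib.

```python
import urllib.request


def fetch_text(url, timeout=10):
    """Fetch url and return its body decoded to a str (illustrative helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as page:
        # Use the charset from the Content-Type header if one is declared;
        # falling back to UTF-8 is an assumption made for this sketch.
        charset = page.headers.get_content_charset() or "utf-8"
        # errors="replace" avoids an exception on bytes that do not decode.
        return page.read().decode(charset, errors="replace")
```

Calling fetch_text("http://www.python.org") would then return the page's HTML as a str instead of bytes.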
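Older code may refer to the urllib2 package. It was a standard library module in Python 2 that extended urllib with additional features. In Python 3, its functionality was merged into the urllib.request and urllib.error modules. Its urlopen() function could also accept a Request object (urllib2.Request, now urllib.request.Request), which is unrelated to the requests package. The urlencode() function was not part of urllib2; in Python 3, it is available as urllib.parse.urlencode().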
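In Python 3, the Request class and the urlencode() function are found in urllib.request and urllib.parse. The sketch below shows how they might be combined to prepare a request with a query string and a custom header. The build_request helper, the /search path, and the User-Agent value are hypothetical and only serve the illustration.

```python
import urllib.parse
import urllib.request


def build_request(base_url, params, user_agent="example-client/0.1"):
    """Return a Request for base_url with params encoded as a query string."""
    # urlencode() turns a dict into "key=value&key=value" with escaping applied.
    query = urllib.parse.urlencode(params)
    url = f"{base_url}?{query}"
    # Nothing is sent here; the Request only describes what to send.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical endpoint, used only to show the resulting URL.
request = build_request("http://www.python.org/search", {"q": "urllib", "page": 2})
print(request.full_url)
```

The resulting object can be passed to urllib.request.urlopen() in place of a plain URL string.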
The urllib3 package is a separate third-party package, unlike the standard library modules above. The requests package discussed below uses functionality from this package internally.
Use the requests Package to Get a Webpage in Python
The requests library is a third-party package that is simple to use and provides a lot of HTTP-related functionality. We can use the requests.get() function to retrieve a webpage; it returns a Response object.
This object also has several attributes, such as status_code and content, among others. We can use the content attribute to get the given web page’s content as bytes.
For example,
import requests
response = requests.get("http://www.python.org")
print(response.status_code)
print(response.content)
The requests library aims to provide a simple-to-use API and a more convenient way to handle errors. Also, it automatically decodes the response into Unicode, which is available through the text attribute.
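As a rough sketch of the error handling and decoding mentioned above, the helper below uses raise_for_status() to turn 4xx and 5xx responses into exceptions and reads the decoded body from the text attribute. The get_page_text name, the 10-second timeout, and the choice to return None on failure are assumptions made for this example.

```python
import requests


def get_page_text(url, timeout=10):
    """Return the decoded body of url, or None if the request fails."""
    try:
        response = requests.get(url, timeout=timeout)
        # Raises requests.exceptions.HTTPError for 4xx and 5xx status codes.
        response.raise_for_status()
    except requests.exceptions.RequestException as exc:
        # RequestException is the base class for connection errors,
        # timeouts, and HTTP errors raised by requests.
        print(f"Request failed: {exc}")
        return None
    # text is the body decoded to str; content holds the raw bytes.
    return response.text
```

Calling get_page_text("http://www.python.org") would return the page as a str, or None after printing the reason if the request failed.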
Manav is an IT professional who has a lot of experience as a core developer in many live projects. He is an avid learner who enjoys learning new things and sharing his findings whenever possible.