How to Download a File in Python
- Use the requests Module to Download Files in Python
- Use the urllib Module to Download Files in Python
- Use the pycurl Module to Download Files in Python
Python is frequently used to access resources on the internet. We can create requests and connections using different libraries, and these libraries can also help us download or read HTTP files from the web.
In this tutorial, we will download files from the internet in Python.
Use the requests Module to Download Files in Python
We can use the requests module to retrieve information and read web pages from the internet.
The get() method sends an HTTP GET request to the given URL and returns a response object containing the file to be downloaded. The open() function creates a file object at the path where we wish to save the file, and the write() method writes the downloaded contents to it.
We use these functions to download a file, as shown below.
import requests as req

URL = "https://www.facebook.com/favicon.ico"
file = req.get(URL, allow_redirects=True)
open("facebook.ico", "wb").write(file.content)
Output:
1150
The above code downloads Facebook's logo file from its URL and stores it in the working directory. We can specify any path in the open() function, but we have to open the file in wb mode, which indicates that we intend to write to it in binary mode.
The above example is suitable for smaller files but does not work efficiently for large ones. The file.content attribute holds the entire file content as a single bytes object, loaded into memory at once. Since we used a small file in the above example, it worked properly.
If we have to download a big file, we should instead use the file.iter_content() method, specifying a chunk size so that the data is downloaded in chunks.
We use this function in the following example.
import requests

URL = "http://codex.cs.yale.edu/avi/db-book/db4/slide-dir/ch1-2.pdf"
file = requests.get(URL, stream=True)

with open("Python.pdf", "wb") as pdf:
    for chunk in file.iter_content(chunk_size=1024):
        if chunk:
            pdf.write(chunk)
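When downloading in chunks, it also helps to check the response status before writing anything to disk, so that an error page is not saved as the file. The following sketch wraps the same chunked pattern with raise_for_status() and a with block that closes the connection when done; the URL and file name here are placeholder values, not part of the original example.

```python
import requests

def download_file(url, path, chunk_size=1024):
    # stream=True defers downloading the body until we iterate over it.
    with requests.get(url, stream=True) as response:
        # Raise an HTTPError for 4xx/5xx responses instead of saving an error page.
        response.raise_for_status()
        with open(path, "wb") as out:
            for chunk in response.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty keep-alive chunks
                    out.write(chunk)
    return path

# Hypothetical usage with a placeholder URL:
# download_file("https://www.example.com/large-file.pdf", "large-file.pdf")
```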
Use the urllib Module to Download Files in Python
We can also use the urllib library in Python to download and read files from the web. It is a URL handling module with different functions to perform this task.
Here too, we have to specify the URL of the file to be downloaded. The urllib.request.urlopen() method takes the URL of the file and sends a request to the server hosting it.
To download files, we can use the urllib.request.urlretrieve() function. It downloads the resource from the given address and stores it at the provided path.
We download the icon of Facebook using this method in the following example.
import urllib.request

urllib.request.urlretrieve("https://www.facebook.com/favicon.ico", "fb.ico")
Output:
('fb.ico', <http.client.HTTPMessage at 0x2d2d317a088>)
The above output indicates that the file was downloaded successfully.
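The urlopen() method mentioned above can also stream a download to disk without loading it all into memory, by copying the file-like response object with shutil.copyfileobj(). A minimal sketch, with the URL and file name as placeholder values:

```python
import shutil
import urllib.request

def download_with_urlopen(url, path):
    # urlopen() returns a file-like response object that we can stream from.
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        # copyfileobj() copies the response body to disk in buffered chunks.
        shutil.copyfileobj(response, out)
    return path

# Hypothetical usage with a placeholder URL:
# download_with_urlopen("https://www.facebook.com/favicon.ico", "fb.ico")
```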
Use the pycurl Module to Download Files in Python
We can use file handling with this module to download files from the internet. First, we create a file object for the file we wish to download. Then, we use the pycurl.Curl() function to create an object and initiate the curl session.
The setopt() method sets the URL of the file and the file object that the data should be written to. The perform() method then carries out the transfer by sending the HTTP request and writing the retrieved data to the file object. Finally, the close() method closes the session, and the file is downloaded to the working directory.
See the code below.
import pycurl

file_name = "fb.ico"
file_src = "https://www.facebook.com/favicon.ico"

with open(file_name, "wb") as f:
    cl = pycurl.Curl()
    cl.setopt(cl.URL, file_src)
    cl.setopt(cl.WRITEDATA, f)
    cl.perform()
    cl.close()