How to Save HTML as PDF in Python

  1. Method 1: Using WeasyPrint
  2. Method 2: Using pdfkit
  3. Method 3: Using ReportLab
  4. Conclusion
  5. FAQ

Saving HTML content as a PDF can be incredibly useful for a variety of applications, from generating reports to archiving web pages.

This tutorial shows how to do that in Python with three libraries: WeasyPrint and pdfkit, which convert existing HTML, and ReportLab, which builds PDFs programmatically. Each section covers installation and includes a short code example you can adapt to your own projects.

Method 1: Using WeasyPrint

WeasyPrint is a library that converts HTML and CSS documents into PDF files. It is particularly favored for rendering complex layouts accurately. Install it with pip; on some platforms it also depends on system libraries such as Pango, so check the WeasyPrint installation guide if the import fails:

Bash
pip install WeasyPrint

Once you have WeasyPrint installed, you can use the following code snippet to convert an HTML file to PDF:

Python
from weasyprint import HTML

# Read the HTML file and write the rendered result to output.pdf
HTML(filename='yourfile.html').write_pdf('output.pdf')

In this example, replace ‘yourfile.html’ with the path to your HTML file, and ‘output.pdf’ will be the name of the generated PDF file.

The script prints nothing to the console. If it runs without raising an exception, output.pdf is written to the current working directory.

WeasyPrint parses the HTML, applies its CSS, and lays the result out as a paginated PDF. Its CSS support, including print-oriented features such as @page rules, lets the PDF keep the styling of the source document, which makes it a good fit for reports, invoices, and other documents where layout matters. It does not execute JavaScript, so content generated by scripts will not appear in the PDF.
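WeasyPrint can also render HTML held in a string, which is handy when the document is generated from data instead of read from disk. The following is a minimal sketch under stated assumptions: the helper names build_report_html and html_string_to_pdf, the sample rows, and the file name report.pdf are hypothetical and chosen only for illustration. The only WeasyPrint calls used are HTML(string=...) and write_pdf().

```python
from html import escape


def build_report_html(title, rows):
    """Return a small HTML document with an inline stylesheet and a table.

    `rows` is an iterable of (name, value) pairs. This helper is hypothetical
    and uses only the standard library.
    """
    body_rows = "\n".join(
        f"<tr><td>{escape(str(name))}</td><td>{escape(str(value))}</td></tr>"
        for name, value in rows
    )
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>{escape(title)}</title>
<style>
  @page {{ size: A4; margin: 2cm; }}
  body {{ font-family: sans-serif; }}
  td {{ border: 1px solid #999; padding: 4px 8px; }}
</style>
</head>
<body>
<h1>{escape(title)}</h1>
<table>
{body_rows}
</table>
</body>
</html>"""


def html_string_to_pdf(html_text, output_path):
    """Render an HTML string to a PDF file with WeasyPrint."""
    # Imported inside the function so the HTML-building helper above
    # can be used and tested even where WeasyPrint is not installed.
    from weasyprint import HTML

    HTML(string=html_text).write_pdf(output_path)


# Example usage (requires WeasyPrint; writes report.pdf):
# html_text = build_report_html("Sales & Returns", [("Widgets", 120), ("Gadgets", 45)])
# html_string_to_pdf(html_text, "report.pdf")
```

Escaping the title and cell values with html.escape keeps characters such as & and < from breaking the markup before WeasyPrint sees it.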

Method 2: Using pdfkit

Another popular choice for converting HTML to PDF in Python is pdfkit, which is a wrapper for the wkhtmltopdf command line tool. This method is particularly useful if you are looking for a solution that handles web pages as well as local HTML files. First, you need to install pdfkit and wkhtmltopdf. You can install pdfkit via pip:

Bash
pip install pdfkit

Next, make sure you have wkhtmltopdf installed on your system. You can download it from the official website and follow the installation instructions for your operating system. Once you have everything set up, you can use the following code:

Python
import pdfkit

# Convert a local HTML file; wkhtmltopdf must be installed and on the PATH
pdfkit.from_file('yourfile.html', 'output.pdf')

Just like before, replace ‘yourfile.html’ with the path to your HTML file. The output will be saved as ‘output.pdf’.

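As with WeasyPrint, the script does not print a confirmation message. pdfkit.from_file returns True on success and raises an exception if wkhtmltopdf cannot be found or the conversion fails.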

Because wkhtmltopdf is built on a WebKit rendering engine, pdfkit can handle complex HTML and CSS. It also accepts an options dictionary whose keys are wkhtmltopdf command-line flags, such as page-size, orientation, and margin-top, which makes it flexible for different PDF generation needs. To convert a web page directly, call pdfkit.from_url('http://example.com', 'output.pdf').
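The page size, orientation, and margin settings mentioned above can be sketched as follows. This is an illustrative example, not a definitive recipe: the helper names build_pdf_options and convert are hypothetical, and the default values are assumptions. The dictionary keys are standard wkhtmltopdf flags written without their leading dashes, which is the form pdfkit expects.

```python
def build_pdf_options(page_size="A4", orientation="Portrait", margin="20mm"):
    """Return a dict of wkhtmltopdf options in the form pdfkit accepts.

    Hypothetical helper: the same margin is applied to all four sides.
    """
    return {
        "page-size": page_size,
        "orientation": orientation,  # "Portrait" or "Landscape"
        "margin-top": margin,
        "margin-right": margin,
        "margin-bottom": margin,
        "margin-left": margin,
        "encoding": "UTF-8",
    }


def convert(source, output_path, options=None):
    """Convert a URL or a local HTML file to PDF with pdfkit."""
    # Imported inside the function so the options helper can be used
    # and tested even where pdfkit and wkhtmltopdf are not installed.
    import pdfkit

    if source.startswith(("http://", "https://")):
        pdfkit.from_url(source, output_path, options=options)
    else:
        pdfkit.from_file(source, output_path, options=options)


# Example usage (requires pdfkit and wkhtmltopdf):
# convert("yourfile.html", "output.pdf", build_pdf_options(orientation="Landscape"))
```

Keeping the options in a separate function makes it easy to reuse one page setup for both local files and URLs.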

Method 3: Using ReportLab

If you need more control over PDF generation, consider ReportLab. Unlike WeasyPrint and pdfkit, it does not convert HTML; instead, you build the PDF from scratch through its Python API, so you can control every element of the document. To get started, install ReportLab:

Bash
pip install reportlab

Here’s a simple example of how to create a PDF file using ReportLab:

Python
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

# Create a letter-sized page, draw one line of text, and write the file
c = canvas.Canvas("output.pdf", pagesize=letter)
c.drawString(100, 750, "Hello World")
c.save()

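In this example, we create a simple PDF that contains the text “Hello World”. The Canvas class allows for precise placement of text and graphics on the page, giving you full control over the document layout. Coordinates are measured in points (1/72 inch) from the bottom-left corner of the page, so (100, 750) places the text near the top of a letter-sized page, which is 792 points tall.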

This script prints nothing either. The file output.pdf is written when c.save() is called.

ReportLab does not convert HTML to PDF, but it is a robust way to generate PDFs programmatically. This is particularly useful when you need to create dynamic content or reports based on data. Its Paragraph class understands a small set of inline markup tags, such as <b> and <i>, but not general HTML or CSS. To incorporate HTML content, you need to parse the HTML yourself and add the elements through ReportLab’s API.
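One way to approach the parse-and-add step is sketched below. It rests on several assumptions: only h1, h2, and p elements are kept, inline tags are flattened to plain text, and the names BlockExtractor, extract_blocks, and blocks_to_pdf are hypothetical. Parsing uses the standard library's html.parser, and the PDF is assembled with ReportLab's platypus layer (SimpleDocTemplate, Paragraph, Spacer) instead of the low-level canvas.

```python
from html.parser import HTMLParser


class BlockExtractor(HTMLParser):
    """Collect (tag, text) pairs for h1, h2, and p elements; ignore the rest."""

    BLOCK_TAGS = {"h1", "h2", "p"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._tag = None    # block element currently being read, if any
        self._parts = []    # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def extract_blocks(html_text):
    """Return a list of (tag, text) pairs found in the HTML string."""
    parser = BlockExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.blocks


def blocks_to_pdf(blocks, output_path):
    """Write the extracted blocks to a PDF using ReportLab flowables."""
    # Imported inside the function so the parser can be used and tested
    # even where ReportLab is not installed.
    from xml.sax.saxutils import escape

    from reportlab.lib.pagesizes import letter
    from reportlab.lib.styles import getSampleStyleSheet
    from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer

    styles = getSampleStyleSheet()
    style_for = {
        "h1": styles["Heading1"],
        "h2": styles["Heading2"],
        "p": styles["BodyText"],
    }
    story = []
    for tag, text in blocks:
        # Paragraph treats its text as markup, so escape &, <, and > first.
        story.append(Paragraph(escape(text), style_for[tag]))
        story.append(Spacer(1, 6))
    SimpleDocTemplate(output_path, pagesize=letter).build(story)


# Example usage (requires ReportLab; writes output.pdf):
# blocks_to_pdf(extract_blocks("<h1>Report</h1><p>Total: 42 items</p>"), "output.pdf")
```

This keeps only the document's text structure. Tables, images, and CSS styling would each need their own handling, which is why WeasyPrint or pdfkit is usually the simpler choice when the HTML must be reproduced faithfully.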

Conclusion

In this tutorial, we’ve explored three libraries for producing PDFs in Python. WeasyPrint and pdfkit convert existing HTML, with WeasyPrint running as a Python library and pdfkit delegating to the external wkhtmltopdf tool. ReportLab does not read HTML but gives you full programmatic control over the document. Whether you need a quick conversion or a highly customized PDF, the examples above give you a starting point for your own projects.

FAQ

  1. How do I install WeasyPrint?
    You can install WeasyPrint using pip with the command pip install WeasyPrint.

  2. Can pdfkit handle web pages?
    Yes, pdfkit can convert both local HTML files and web pages to PDF.

  3. What is ReportLab best used for?
    ReportLab is best used for creating PDFs from scratch, allowing for detailed customization.

  4. Are there any alternatives to these libraries?
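    Yes. xhtml2pdf converts HTML to PDF, and fpdf2 (the maintained successor to FPDF) generates PDFs programmatically. PyPDF2 is a different kind of tool: it reads, merges, and splits existing PDFs and does not convert HTML.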

  5. Do I need to know HTML to use these libraries?
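    Basic HTML knowledge is enough for WeasyPrint and pdfkit, since they convert whatever HTML you supply. ReportLab requires no HTML at all, because you build the document through its Python API.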

Author: Manav Narula

Manav is an IT professional with extensive experience as a core developer on many live projects. He is an avid learner who enjoys sharing his findings whenever possible.
