How to Make the Legend of the Scatter Plot in Matplotlib

  1. Creating a Basic Scatter Plot
  2. Adding a Legend to the Scatter Plot
  3. Customizing the Legend
  4. Using Multiple Data Series in a Scatter Plot
  5. Conclusion
  6. FAQ

Creating effective visualizations is crucial in data analysis, and scatter plots are one of the most powerful tools in your arsenal. When using Matplotlib, a popular Python library, you can easily generate scatter plots that convey complex data relationships. However, a scatter plot without a legend can leave your audience confused about what each color or marker represents.

In this article, we will explore how to create a legend for your scatter plot using the matplotlib.pyplot.legend function. Not only will we walk through the process step-by-step, but we will also provide clear examples that you can easily replicate. Let’s dive into the world of Matplotlib and enhance your data visualizations with informative legends.

Before we jump into creating legends, it’s essential to understand what a scatter plot is and why it’s useful. A scatter plot displays the values of two variables, one on each axis, for a set of observations. Each point on the plot represents one observation, allowing you to see patterns, trends, and correlations between the variables. For instance, you might plot the height and weight of individuals to see whether the two are correlated.

In Matplotlib, you can create scatter plots using the scatter() function. Adding a legend helps identify different categories or groups within your data, making your visualizations more informative.

Creating a Basic Scatter Plot

Let’s start by generating a simple scatter plot. Here’s how you can do that in Python using Matplotlib.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]

y1 = [i ** 2 for i in x]
y2 = [2 * i + 1 for i in x]

plt.scatter(x, y1, marker="x", color="r", label="x**2")
plt.scatter(x, y2, marker="o", color="b", label="2*x+1")
plt.show()

Output:

Matplotlib 2D scatter plot

The figure contains two sets of points: one drawn with x markers and the other with o markers. Each scatter() call also receives a label, which Matplotlib uses as that series’ entry when it builds the legend. Finally, we display the figure using the show() method.

While this plot is informative, it lacks a legend. Let’s add one to clarify what our data points represent.

Adding a Legend to the Scatter Plot

Now that we have our basic scatter plot, let’s add a legend to it. The legend will help viewers understand what the red crosses and the blue circles signify.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]

y1 = [i ** 2 for i in x]
y2 = [2 * i + 1 for i in x]

plt.scatter(x, y1, marker="x", color="r", label="x**2")
plt.scatter(x, y2, marker="o", color="b", label="2*x+1")
plt.legend()
plt.show()

Output:

Add legend to a 2D scatter plot

In this updated code, we call plt.legend() before displaying the plot. This function builds the legend automatically from the label values passed to each scatter() call. By default, the legend uses loc="best", which places it where it overlaps the plotted data the least; on a crowded plot, some overlap may still remain.
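To see which entries the automatic legend picks up, the sketch below inspects the handle and label pairs with get_legend_handles_labels(). It is an illustration rather than part of the original example: the third series y3 is hypothetical and is left without a label on purpose, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y1 = [i ** 2 for i in x]
y2 = [2 * i + 1 for i in x]
y3 = [i + 10 for i in x]  # hypothetical third series, added only for illustration

fig, ax = plt.subplots()
ax.scatter(x, y1, marker="x", color="r", label="x**2")
ax.scatter(x, y2, marker="o", color="b", label="2*x+1")
ax.scatter(x, y3, marker="^", color="g")  # no label, so the legend skips it

# The handle/label pairs that legend() uses when called without arguments.
handles, labels = ax.get_legend_handles_labels()
print(labels)

legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

Only the two labeled series appear in the legend. Leaving out the label is therefore a simple way to keep a helper series out of it.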

Adding a legend is a simple yet effective way to enhance the clarity of your scatter plot. It allows your audience to quickly grasp the meaning behind the different colors or markers used.

Customizing the Legend

While the default legend is useful, you may want to customize its appearance to better fit your visualization. Matplotlib provides numerous options for styling your legend, including changing its location, font size, and background color.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
plt.scatter(x, y, color='blue', label='Prime Numbers')
plt.title('Customized Legend Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend(loc='upper left', fontsize='large', frameon=True, facecolor='white')
plt.show()

Output:

Customizing the Legend of scatter plot in Matplotlib

In this code, we’ve added several parameters to the plt.legend() function. The loc parameter specifies where the legend should appear, and in this case, we’ve chosen the upper left corner. The fontsize parameter adjusts the text size, making it easier to read. Additionally, frameon controls whether a frame is drawn around the legend, and facecolor sets the legend’s background color.
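A few other legend() parameters are often useful alongside these. The following sketch, which is one possible arrangement and not the only one, adds a title and uses bbox_to_anchor to move the legend outside the plotting area. The title text "Series" and the anchor and margin values are assumptions chosen for illustration, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, ax = plt.subplots()
ax.scatter(x, y, color="blue", label="Prime Numbers")

legend = ax.legend(
    title="Series",              # heading shown above the entries (illustrative text)
    loc="upper left",            # corner of the legend box that gets anchored
    bbox_to_anchor=(1.02, 1.0),  # anchor point in axes coordinates, just right of the axes
    borderaxespad=0.0,           # no extra padding between the anchor and the legend
    frameon=True,
    facecolor="white",
    edgecolor="black",           # color of the frame line
)

# Leave room on the right of the figure so the legend is not cut off.
fig.subplots_adjust(right=0.72)
```

Here loc no longer names a corner of the axes. It names the corner of the legend box that is pinned to the bbox_to_anchor point, so the legend sits to the right of the plot.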

Customizing your legend not only enhances its visibility but also improves the overall aesthetic of your scatter plot.

Using Multiple Data Series in a Scatter Plot

In many cases, you may want to visualize multiple data series within the same scatter plot. This can be done by calling the scatter() function multiple times, each time with a different dataset and label.

import matplotlib.pyplot as plt

x1 = [1, 2, 3, 4, 5]
y1 = [2, 3, 5, 7, 11]
x2 = [1, 2, 3, 4, 5]
y2 = [2, 4, 6, 8, 10]  # first five even numbers; (1, 2) coincides with the first prime point

plt.scatter(x1, y1, color='blue', label='Prime Numbers')
plt.scatter(x2, y2, color='red', label='Even Numbers')
plt.title('Scatter Plot with Multiple Data Series')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.legend()
plt.show()

Output:

Using Multiple Data Series in a Scatter Plot in Matplotlib

In this example, we define two sets of data points: one for prime numbers and another for even numbers. Each call to plt.scatter() includes a different color and label. By adding a single plt.legend() call at the end, we generate a legend that clearly distinguishes between the two datasets. This approach is effective for comparing different categories within the same plot, making your analysis more comprehensive.
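When the categories are stored as a single column of group codes instead of separate lists, one scatter() call can draw everything, and the legend_elements() method of the returned collection can supply one legend handle per category. The sketch below is a possible alternative under assumptions: the data, the group codes 0 and 1, and the "coolwarm" colormap are chosen for illustration, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

# Illustrative data: both series in one pair of lists, with a group code per point.
x = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11, 2, 4, 6, 8, 10]
group = [0] * 5 + [1] * 5  # 0 = prime numbers, 1 = even numbers
names = ["Prime Numbers", "Even Numbers"]

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=group, cmap="coolwarm")

# legend_elements() returns one handle per distinct value in c, in sorted order.
handles, _ = points.legend_elements()
legend = ax.legend(handles, names, title="Series")
```

Calling scatter() once per series, as in the example above, is usually simpler for two or three groups. The single-call form tends to pay off when the number of categories grows or when the data already arrives with a category column.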

Conclusion

Creating a legend for your scatter plot in Matplotlib is a straightforward process that significantly enhances your data visualization. By following the steps outlined in this article, you can easily add, customize, and manage legends, making your scatter plots more informative and visually appealing. Whether you’re presenting data in a report or sharing insights with colleagues, a well-constructed legend will ensure that your audience understands the key points of your analysis. So, go ahead and start incorporating legends into your scatter plots to improve clarity and engagement.

FAQ

  1. How do I add a legend to my scatter plot in Matplotlib?
    You can add a legend using the plt.legend() function after plotting your data points.

  2. Can I customize the legend in Matplotlib?
    Yes, you can customize the legend’s location, font size, and background color using various parameters in the plt.legend() function.

  3. How do I create a scatter plot with multiple data series?
    You can call the plt.scatter() function multiple times for different datasets and provide a label for each to distinguish them in the legend.

  4. What is the purpose of a legend in a scatter plot?
    A legend helps viewers understand what different colors or markers represent in the scatter plot, making the visualization more informative.

  5. Is it necessary to include a legend in every scatter plot?
    While not always necessary, including a legend is recommended when your scatter plot contains multiple data series or categories to avoid confusion.

Author: Suraj Joshi

Suraj Joshi is a backend software engineer at Matrice.ai.
