Python Craft

Seaborn 0.12: An Insightful Guide to the Objects Interface and Declarative Graphics

Streamlining your data visualization journey with Python's popular library

Peng Qian

20 Aug 2023 — 14 min read

Photo Credit: Created by Author, Canva

This article introduces the objects interface in Seaborn 0.12, explains the idea of declarative graphic syntax behind it, and walks through a practical visualization project that shows the interface in use.

By the end of this article, you'll have a clear understanding of the advantages and limitations of Seaborn's objects interface API. And you will be able to use Seaborn for data analysis projects more easily.

Introduction

Imagine you're creating a data visualization chart using Python.

You have to instruct the computer every step of the way: select a dataset, create a figure, set the color, add labels, adjust the size, and so on.

Then you realize your code is getting longer and more complex, and all you wanted was to quickly visualize your data.

It's like going to the grocery store and having to specify every item's location, color, size, and shape, instead of just telling the shop assistant what you need.

Not only is this time-consuming, but it can also feel tiring.

However, the objects interface, a new feature in Seaborn 0.12 built on declarative graphic syntax, is like having a shop assistant who understands you. You just tell it what you need, and it finds everything for you.

You no longer need to instruct it every step of the way. You just need to tell it what kind of result you want.

In this article, I'll guide you through using the objects interface, this new feature that makes your data visualization process more effortless, flexible, and enjoyable. Let's get started!

Seaborn API: Then and Now

Before diving into the objects interface API, let's systematically look at the differences between the Seaborn API of earlier versions and the 0.12 version.

The original API

Many readers might have been intimidated by Matplotlib's complex API documentation when learning Python data visualization.

Seaborn simplifies this by wrapping and streamlining Matplotlib's API, making the learning curve gentler.

Seaborn doesn't just offer high-level encapsulation of Matplotlib; it also categorizes all charts into relational, distributional, and categorical scenarios.

Overview of Seaborn's original API design. Image by Author

This diagram should give you a complete picture of Seaborn's API and help you decide which chart to use in which situation.

For example, a histplot representing data distribution would fall under the distribution chart category.

In contrast, a violinplot representing data features by category would be classified as a categorical chart.

Aside from vertical categorization, Seaborn also performs horizontal categorization: Figure-level and axes-level.

According to the official website, axes-level functions draw onto a single matplotlib.pyplot.Axes object, so each call produces one chart.

In contrast, figure-level functions use Seaborn's FacetGrid to draw multiple charts in one figure, facilitating easy comparison of similar data dimensions.

However, even though Seaborn's API significantly simplifies chart drawing by encapsulating Matplotlib, creating a customized chart still requires complex configuration.

For example, if I use Seaborn's built-in penguins dataset to draw a histplot, the code is as follows:

sns.histplot(penguins, x="flipper_length_mm", hue="species");

The original way of drawing a histplot. Image by Author

And when I use the same dataset to draw a kdeplot, the code is as follows:

sns.kdeplot(penguins, x="flipper_length_mm", fill=True, hue="species");

The original way of drawing a kdeplot. Image by Author

Except for the chart API, the rest of the configurations are identical.

This is like telling the chef I want to use lamb chops and onions to make a lamb soup and specifying the cooking steps. When I want to use these ingredients to make a roasted lamb chop, I have to tell the chef about the ingredients and the cooking steps all over again.

Not only is it inefficient, but it also lacks flexibility.

That's why Seaborn introduced the objects interface API in its 0.12 version. This declarative graphic syntax dramatically improves the process of creating a chart.

The objects Interface API

Before we start with the objects interface API, let's take a high-level look at it to better understand the drawing process.

Unlike the original Seaborn API, which organizes its functions by chart category, the objects interface API organizes them around a drawing pipeline.

The objects interface API divides the drawing into multiple stages, such as data binding, layout, presentation, customization, etc.

Overview of Seaborn's objects interface API design. Image by Author

The data binding and presentation stages are necessary, while other stages are optional.

Also, since the stages are independent, each stage can be reused. Let's revisit the earlier histplot and kdeplot examples.

To use the objects interface to draw, we first need to bind the data:

p = so.Plot(penguins, x="flipper_length_mm", color="species")

From this line of code, we can see that the objects interface uses the so.Plot class for data binding.

Also, where the original API uses the less obvious hue parameter, the objects interface uses the color parameter to bind the species dimension directly to the chart color, making the configuration more intuitive.

Finally, this line of code returns a p instance that can be reused to draw a chart.

Next, let's draw a histplot:

p.add(so.Bars(), so.Hist())

Use objects interface API to draw a histplot. Image by Author

This line of code shows that the drawing stage does not need to rebind the data. We just need to tell the add method what to draw: so.Bars(), and how to calculate it: so.Hist().

The add method also returns a copy of the Plot instance, so any adjustments in the add method will not affect the original data binding. The p instance can still be reused.

Therefore, we continue to call the p.add() method to draw a kdeplot:

p.add(so.Area(), so.KDE())

Use objects interface API to draw a kdeplot. Image by Author

Since KDE is a statistical transform, so.KDE() is passed to add as the stat argument here. And since a kdeplot is essentially an area plot, so.Area() is used as the mark.

We reused the p instance that is already bound to the data, so there is no need to tell the chef about the ingredients again for each dish. We just say what we want. Isn't that much more concise and flexible?

Unpacking the Objects Interface with Examples

Next, let's see how some common charts are written using the original Seaborn API and the objects interface API.

Before we start, we need to import the necessary libraries:

%matplotlib inline

import matplotlib.pyplot as plt
import seaborn as sns
import seaborn.objects as so

import pandas as pd

sns.set()
penguins = sns.load_dataset('penguins')

Bar chart

In the original API, to draw a bar chart, the code is as follows:

sns.barplot(penguins, x="island", y="body_mass_g", hue="species");

The original way of drawing a bar chart. Image by Author

In the objects interface, to draw a bar chart, the code is as follows:

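(
    so.Plot(penguins, x="island", y="body_mass_g", color="species")
    # so.Agg() averages body_mass_g per group, matching sns.barplot's default mean;
    # without it, so.Bar() would draw one overlapping bar per row
    .add(so.Bar(), so.Agg(), so.Dodge())
)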

Use objects interface to draw a bar chart. Image by Author

Scatter plot

In the original API, to draw a scatter plot, the code is as follows:

sns.relplot(penguins, x="bill_length_mm", y="bill_depth_mm", hue="species");

In the original way, we use relplot to draw a scatter plot. Image by Author

In the objects interface, to draw a scatter plot, the code is as follows:

(
    so.Plot(penguins, x="bill_length_mm", y="bill_depth_mm", color="species")
    .add(so.Dots())
)

When using objects interface, we use so.Dots to draw a scatter plot. Image by Author

After comparing the two APIs, you may think the objects interface doesn't look all that special.

Don't worry. Let's take a look at the advanced usage of the objects interface.

Advanced usage

Suppose we use Seaborn's tips dataset.

tips = sns.load_dataset("tips")

I want to use a bar chart to see the average tip for each day of the week and mark the values on the chart.

The chart I want is shown below:

A bar chart with text to show the values. Image by Author

Before we start drawing, we need to process the tips dataset to calculate the average value for each day.

day_mean = tips[['day', 'tip']].groupby('day').mean().round(2).reset_index()

Then, we can use the objects interface to draw:

(
    day_mean
    .pipe(so.Plot, y="day", x="tip", text="tip")
    .add(so.Bar(width=.5))
    .add(so.Text(color='w', halign="right"))
)

We use two tricks here:

First, we call the pipe method on the dataframe to enable chain code calls.

Second, we can reuse the instance of so.Plot, binding the data only once and then adding multiple layers to the same chart.

Then, let's see how the code would be written using the original API:

ax = sns.barplot(day_mean, x="tip", y="day")

for p in ax.patches:
    width = p.get_width()
    ax.text(width,
            p.get_y() + p.get_height()/2,
            '{:1.2f}'.format(width),
            ha="right", va="center")
plt.show()

As you can see, the original code is much more complex:

First, draw a horizontal bar chart.

Then use iteration to draw the corresponding values on each bar.

In comparison, doesn't the objects interface seem simpler and more flexible?

Applying the Objects Interface to Real-World Data

Next, to reinforce what we've covered and practice the objects interface more systematically, let's work through an actual data visualization project.

In this project, I will visually explore data from New York City's bike-sharing system to understand how the city's shared bicycles are used and to help the operators run the service better.