Seaborn

Seaborn is a powerful and easy-to-use data visualization library in Python built on top of Matplotlib. It's great for creating attractive and informative statistical graphics.

Install it with pip:

pip install seaborn

Then import it alongside Matplotlib and pandas:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd

Loading a Dataset

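Seaborn provides several example datasets that are handy for practice and learning. load_dataset() downloads them from Seaborn’s online data repository and caches them locally, so the first call needs an internet connection. Let’s start with the well-known tips dataset: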

# Load the tips dataset
tips = sns.load_dataset("tips")
print(tips.head())

With the data loaded, a first scatter plot takes a single call:

sns.scatterplot(data=tips, x="total_bill", y="tip")
plt.title("Scatter plot of Total Bill vs Tip")
plt.show()

1. Seaborn Plotting Functions

Seaborn provides a variety of plotting functions, each with its own set of properties and parameters. Here are some of the most commonly used functions:

a) Relational Plots

  • scatterplot(): Creates a scatter plot.
    • Properties: x, y, hue, size, style, palette, markers, sizes, legend, alpha, etc.
  • lineplot(): Creates a line plot.
    • Properties: x, y, hue, size, style, dashes, markers, palette, legend, etc.
  • relplot(): A flexible function that can create both scatter and line plots.
    • Properties: x, y, hue, size, style, kind (scatter or line), col, row, col_wrap, palette, etc.
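
As a minimal sketch of the relational functions above, the following draws an axes-level scatter plot and a figure-level relplot(). The DataFrame is made-up stand-in data with tips-like column names (an assumption, so the example runs without downloading anything), and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up stand-in data with tips-like columns (not the real tips dataset)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, 60),
    "size": rng.integers(1, 6, 60),
    "time": np.tile(["Lunch", "Dinner"], 30),
})
df["tip"] = 0.15 * df["total_bill"] + rng.normal(0, 1, 60)

# Axes-level function: draws onto one Matplotlib Axes and returns it
ax = sns.scatterplot(data=df, x="total_bill", y="tip",
                     hue="time", size="size", alpha=0.7)
ax.set_title("Tip vs. total bill")

# Figure-level function: returns a FacetGrid with one column per "time" value
g = sns.relplot(data=df, x="total_bill", y="tip", kind="scatter", col="time")
```

The axes-level call is the better fit when you are composing your own Matplotlib figure; relplot() is more convenient when you want facets via col or row.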

b) Categorical Plots

  • barplot(): Creates a bar plot.
    • Properties: x, y, hue, errorbar (replaces ci, which is deprecated since Seaborn 0.12), palette, saturation, dodge, orient, width, etc.
  • countplot(): Displays the counts of observations in each categorical bin using bars.
    • Properties: x, y, hue, palette, saturation, dodge, orient, etc.
  • boxplot(): Draws a box plot to show distributions with respect to categories.
    • Properties: x, y, hue, palette, saturation, dodge, fliersize, width, etc.
  • violinplot(): Draws a combination of boxplot and kernel density estimate.
    • Properties: x, y, hue, split, palette, saturation, density_norm (called scale before Seaborn 0.13), inner, etc.
  • stripplot(): Draws a scatterplot where one variable is categorical.
    • Properties: x, y, hue, jitter, dodge, palette, size, marker, etc.
  • swarmplot(): Draws a categorical scatterplot with non-overlapping points.
    • Properties: x, y, hue, palette, size, marker, etc.
  • pointplot(): Shows point estimates and confidence intervals using scatter plot glyphs.
    • Properties: x, y, hue, palette, markers, linestyles, errorbar, scale (deprecated in Seaborn 0.13), etc.
  • catplot(): A general categorical plot that can draw all types of categorical plots (bar, box, violin, etc.).
    • Properties: x, y, hue, kind, col, row, palette, col_wrap, aspect, etc.
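
The sketch below shows how a few of the categorical functions above might be combined. The data is randomly generated with hypothetical column names (day, sex, total_bill), and version-specific parameters such as ci or errorbar are deliberately left at their defaults.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Randomly generated stand-in data (hypothetical columns)
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "day": np.repeat(["Thur", "Fri", "Sat", "Sun"], 20),
    "sex": np.tile(["Female", "Male"], 40),
    "total_bill": rng.gamma(4.0, 5.0, 80),
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))

# Bar height is the mean of total_bill per day; the error bar uses the default setting
sns.barplot(data=df, x="day", y="total_bill", ax=axes[0])

# Box plot of the same variable, split by a second categorical variable
sns.boxplot(data=df, x="day", y="total_bill", hue="sex", ax=axes[1])

# Figure-level interface: choose the plot type with kind, facet with col
g = sns.catplot(data=df, x="day", y="total_bill", kind="violin", col="sex")
```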

c) Distribution Plots

  • histplot(): Plots a univariate or bivariate histogram.
    • Properties: x, y, hue, weights, bins, binwidth, discrete, kde, log_scale, cbar, etc.
  • kdeplot(): Plots a kernel density estimate.
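    • Properties: x, y, hue, fill (replaces the deprecated shade), bw_adjust, cut, cumulative, etc. The kernel parameter is no longer supported; only Gaussian kernels are available.
  • distplot(): Deprecated function, now replaced by histplot() and the figure-level displot(), with kdeplot() covering density-only plots.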
  • ecdfplot(): Plots empirical cumulative distribution functions.
    • Properties: x, hue, stat, complementary, log_scale, etc.
  • rugplot(): Draws a rugplot, which is a plot of data points on an axis.
    • Properties: x, y, hue, height, expand_margins, etc.

d) Matrix Plots

  • heatmap(): Draws a heatmap of a matrix.
    • Properties: data, vmin, vmax, cmap, center, annot, fmt, linewidths, linecolor, cbar, etc.
  • clustermap(): Plots a matrix dataset as a hierarchically clustered heatmap.
    • Properties: data, pivot_kws, method, metric, z_score, standard_scale, cmap, dendrogram_ratio, col_cluster, etc.
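
One common use of heatmap() is to display a correlation matrix. The sketch below assumes a small random DataFrame with placeholder column names a to d; the parameters are the ones listed above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Random data with placeholder column names
rng = np.random.default_rng(3)
data = pd.DataFrame(rng.normal(size=(50, 4)), columns=["a", "b", "c", "d"])
corr = data.corr()  # 4 x 4 correlation matrix

# Diverging colormap centered on 0, with each cell annotated to two decimals
ax = sns.heatmap(corr, vmin=-1, vmax=1, center=0, cmap="vlag",
                 annot=True, fmt=".2f", linewidths=0.5)
ax.set_title("Correlation matrix")
```

Fixing vmin, vmax, and center keeps the color scale comparable across different matrices.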

e) Regression and Pairwise Plots

  • lmplot(): Plots data and regression model fits across a FacetGrid.
    • Properties: x, y, hue, col, row, palette, ci, scatter_kws, line_kws, col_wrap, etc.
  • regplot(): Fits and plots a univariate regression model.
    • Properties: x, y, data, x_estimator, order, robust, logistic, scatter_kws, line_kws, etc.
  • residplot(): Plots the residuals of a linear regression.
    • Properties: x, y, lowess, color, scatter_kws, line_kws, etc.
  • jointplot(): Draws a plot of two variables with bivariate and univariate graphs.
    • Properties: x, y, kind, hue, palette, height, ratio, space, marginal_kws, joint_kws, etc.
  • pairplot(): Plots pairwise relationships in a dataset.
    • Properties: data, hue, palette, kind, diag_kind, markers, corner, height, aspect, etc.

2. Global Seaborn Settings

Seaborn allows you to set global aesthetic parameters that affect all plots in your session.

a) Set Aesthetics

  • sns.set_theme(): Sets the aesthetic parameters for all subsequent plots. sns.set() is an older alias for the same function.
    • Parameters: style, context, palette, font, font_scale, color_codes, rc.

b) Themes

  • sns.set_style(): Sets the plotting style.
    • Styles: "darkgrid", "whitegrid", "dark", "white", "ticks".
  • sns.despine(): Removes or trims the spines of the plot.
    • Parameters: offset, trim, left, right, top, bottom.
  • sns.set_context(): Sets the context parameters.
    • Contexts: "paper", "notebook", "talk", "poster".
    • Parameters: rc (dictionary of parameters to override).

c) Color Palettes

  • sns.set_palette(): Sets the default color cycle.
    • Palettes: "deep", "muted", "bright", "pastel", "dark", "colorblind", or any custom list of colors.
  • sns.color_palette(): Returns a list of colors defining a color palette.
  • sns.palplot(): Visualizes a color palette.
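
A short sketch of the palette functions above. The two hex colors in the custom palette are arbitrary illustrative values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import seaborn as sns

# Fetch four colors from a named palette as RGB tuples
palette = sns.color_palette("colorblind", n_colors=4)
hex_codes = palette.as_hex()  # the same colors as hex strings

# Build a palette from a custom list of colors (arbitrary example values)
custom = sns.color_palette(["#1b9e77", "#d95f02"])

# Make the named palette the default color cycle for later plots
sns.set_palette(palette)

# Draw the palette as a row of color swatches
sns.palplot(palette)
```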

d) Scale and Combined Settings

  • sns.set_context(): Controls the scale of plot elements such as labels, lines, and ticks.
    • Parameters: context, font_scale, rc.
  • sns.set_theme() (alias: sns.set()): Sets style, context, palette, and fonts in one call.

3. Customization and Fine-Tuning

a) Annotating Plots

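  • ax.annotate(): A Matplotlib Axes method that adds an annotation, optionally with an arrow; useful for highlighting specific data points. Seaborn's axes-level functions return the Axes you call it on.
  • ax.text(): A Matplotlib Axes method that adds a text label at given data coordinates.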
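
Because Seaborn's axes-level functions return a Matplotlib Axes, annotation can be sketched as below. The four data points are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Four invented data points
df = pd.DataFrame({
    "total_bill": [10.0, 20.0, 30.0, 48.0],
    "tip": [1.5, 3.0, 4.0, 10.0],
})
ax = sns.scatterplot(data=df, x="total_bill", y="tip")

# Highlight the row with the largest tip using an arrow annotation
top = df.loc[df["tip"].idxmax()]
ax.annotate("largest tip",
            xy=(top["total_bill"], top["tip"]),  # point being annotated
            xytext=(30, 9),                      # where the label is placed
            arrowprops={"arrowstyle": "->"})

# Plain text label placed at data coordinates
ax.text(10, 2.0, "smallest bill", fontsize=9)
```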

b) Adjusting Plot Layouts

  • plt.subplots_adjust(): Adjusts subplot parameters for better spacing.
  • sns.FacetGrid(): Allows for easy creation of multi-plot grids.
    • Properties: col, row, hue, margin_titles, despine, sharex, sharey, etc.
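
A minimal sketch of a 2 x 2 FacetGrid whose spacing is then adjusted with plt.subplots_adjust(). The data is random, with hypothetical tips-like column names.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Random stand-in data (hypothetical tips-like columns)
rng = np.random.default_rng(5)
df = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, 80),
    "tip": rng.uniform(1, 8, 80),
    "time": np.tile(["Lunch", "Dinner"], 40),
    "smoker": np.repeat(["Yes", "No"], 40),
})

# One facet per (smoker, time) combination, with shared axes
g = sns.FacetGrid(df, row="smoker", col="time", margin_titles=True,
                  sharex=True, sharey=True, height=2.5)
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip")

# Leave room at the top for a figure title and tighten horizontal spacing
plt.subplots_adjust(top=0.9, wspace=0.1)
plt.suptitle("Tips by time and smoker status")
```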

c) Saving Plots

  • plt.savefig(): Saves the current figure to a file.
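    • Parameters: fname, dpi, format, bbox_inches, pad_inches, transparent, etc. The older quality parameter has been removed from recent Matplotlib releases; JPEG options are now passed through pil_kwargs.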

4. Seaborn Utilities

a) Loading Datasets

  • sns.load_dataset(): Loads one of Seaborn’s example datasets by name, downloading it from the online seaborn-data repository and caching it locally.
  • sns.get_dataset_names(): Returns a list of available dataset names.

b) Grid and Axis-Level Methods

  • FacetGrid.map(): Maps a function onto a grid.
  • FacetGrid.map_dataframe(): Maps a function that works with Pandas DataFrames onto a grid.
  • FacetGrid.add_legend(): Adds a legend to the grid.
  • FacetGrid.set_axis_labels(): Sets the axis labels of the grid.

c) Miscellaneous

  • sns.clustermap(): Useful for visualizing clustered data.
  • sns.cubehelix_palette(): Creates a sequential palette from the cubehelix system, whose brightness changes linearly, so the ordering of values is preserved when printed in grayscale.
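
A brief sketch of cubehelix_palette() in both of its forms, as a list of discrete colors and as a continuous colormap. The start and rot values are arbitrary illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# Six discrete colors; start and rot are arbitrary example values
pal = sns.cubehelix_palette(n_colors=6, start=0.5, rot=-0.75)

# The same system as a continuous Matplotlib colormap
cmap = sns.cubehelix_palette(as_cmap=True)

# Use the colormap for a heatmap of random values
rng = np.random.default_rng(7)
ax = sns.heatmap(rng.uniform(size=(5, 5)), cmap=cmap)
```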
