Module kitchen.plotting

Custom plotting functions for data in pd.DataFrame, anndata.AnnData, and decoupler formats

Functions

def boxplots_group(a, features, groupby, groupby_order=None, groupby_colordict=None, pairby=None, layer=None, log_scale=None, pseudocount=1.0, sig=True, bonferroni=False, ylabel=None, titles=None, legend=True, size=3, panelsize=(3, 3), ncols=6, outdir='./', save_prefix='', dpi=300)

Plot trends from a.obs metadata. Saves all plots as a grid in a single .png file.

Parameters

a : Union[anndata.AnnData, pd.DataFrame]
The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
features : list of str
List of genes, .obs columns, or DataFrame columns to plot (if a is pd.DataFrame) (y variable)
groupby : list of str
Columns from a or a.obs to group by (x variable)
groupby_order : list of str, optional (default=None)
List of values in a[groupby] or a.obs[groupby] specifying the order of groups on x-axis. If groupby is a list, groupby_order should also be a list with corresponding orders in each element.
groupby_colordict : dictionary, optional (default=None)
Dictionary of group, color pairs from groupby to color boxes and points by
pairby : str, optional (default=None)
Categorical .obs column identifying point pairings to draw lines between across groupby categories. Ignored if jitter==False.
layer : str, optional (default=None)
Key from layers attribute of adata if present
log_scale : int, optional (default=None)
Set axis scale(s) to log. Numeric values are interpreted as the desired base (e.g. 10). When None, plot defers to the existing Axes scale.
pseudocount : float, optional (default=1.0)
Pseudocount to add to values before log-transforming with base=log_scale
sig : bool, optional (default=True)
Perform significance testing (2-way t-test) between all groups and add significance bars to plot(s)
bonferroni : bool, optional (default=False)
Adjust significance p-values with simple Bonferroni correction
ylabel : str, optional (default=None)
Label for y axes. If None use colors.
titles : list of str, optional (default=None)
Titles for each set of axes. If None use features.
legend : bool, optional (default=True)
Add legend to plot
size : int, optional (default=3)
Size of the jitter points
panelsize : tuple of float, optional (default=(3, 3))
Size of each panel in output figure in inches
ncols : int, optional (default=6)
Number of columns in gridspec
outdir : str, optional (default="./")
Path to output directory for saving plots
save_prefix : str, optional (default="")
Prefix to add to filenames for saving
dpi : float, optional (default=300)
Resolution in dots per inch for saving figure. Ignored if save_prefix is None.

Returns

gs : gridspec.GridSpec
Return gridspec object if save==None. Otherwise, write to .png in outdir/.
sig_out : dict
Dictionary of t-test statistics if sig==True. Otherwise, write to .csv in outdir/.
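
The bonferroni option is described only as a "simple Bonferroni correction". A minimal sketch of that correction follows, assuming it multiplies each pairwise p-value by the number of comparisons and caps the result at 1. The helper name bonferroni_adjust and the p-values are hypothetical and not part of kitchen.plotting.

```python
from itertools import combinations


def bonferroni_adjust(pvalues):
    """Multiply each p-value by the number of tests, capping at 1.0."""
    n_tests = len(pvalues)
    return {pair: min(p * n_tests, 1.0) for pair, p in pvalues.items()}


# Hypothetical raw p-values for all pairwise comparisons among three groups
groups = ["ctrl", "low", "high"]
raw = dict(zip(combinations(groups, 2), [0.01, 0.04, 0.5]))
adjusted = bonferroni_adjust(raw)

for pair, p in adjusted.items():
    print(pair, round(p, 3))
```

With three groups there are three pairwise tests, so each raw p-value is tripled before it is compared against the significance threshold.
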
def build_gridspec(panels, ncols, panelsize=(3, 3))

Create gridspec.GridSpec object from a list of panels

Parameters

panels : list or str
List of panels in plot grid. If string, only one panel is made.
ncols : int
Number of columns in grid. Number of rows is calculated from len(panels) and ncols.
panelsize : tuple of float, optional (default=(3,3))
Size in inches of each panel within the plot grid.

Returns

gs : gridspec.GridSpec
GridSpec object
fig : matplotlib.Figure
Figure object
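
The row count is said to be calculated from len(panels) and ncols. One way this could work is sketched below, assuming a ceiling division and a figure size that scales with the grid. grid_shape is a hypothetical helper, not the library's code; the real function goes on to build the GridSpec and Figure objects.

```python
import math


def grid_shape(panels, ncols, panelsize=(3, 3)):
    """Return (nrows, ncols, figsize) for a list of panels."""
    # A single string is treated as one panel, as the docs describe
    if isinstance(panels, str):
        panels = [panels]
    # Assumption: enough rows to hold every panel, last row may be partial
    nrows = math.ceil(len(panels) / ncols)
    # Assumption: total figure size is panel size times grid dimensions
    figsize = (ncols * panelsize[0], nrows * panelsize[1])
    return nrows, ncols, figsize


nrows, ncols, figsize = grid_shape(["a", "b", "c", "d", "e", "f", "g"], ncols=3)
print(nrows, ncols, figsize)
```
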
def cluster_pie(adata, pie_by='batch', groupby='leiden', ncols=5, show=None, figsize=(5, 5))

Plots pie graphs showing makeup of cluster groups

Parameters

adata : anndata.AnnData
the data
pie_by : str, optional (default="batch")
adata.obs column to split pie charts by
groupby : str, optional (default="leiden")
adata.obs column to create pie charts for
ncols : int, optional (default=5)
number of columns in gridspec
show : bool, optional (default=None)
show figure or just return axes
figsize : tuple of float, optional (default=(5,5))
size of matplotlib figure

Returns

matplotlib gridspec with access to the axes
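
Each pie presumably shows the fraction of pie_by categories within one groupby cluster. The sketch below computes those fractions from two plain lists standing in for adata.obs columns. The function name composition and the sample labels are hypothetical.

```python
from collections import Counter, defaultdict


def composition(groups, categories):
    """Fraction of each category within each group."""
    counts = defaultdict(Counter)
    for group, category in zip(groups, categories):
        counts[group][category] += 1
    return {
        group: {cat: n / sum(cnt.values()) for cat, n in cnt.items()}
        for group, cnt in counts.items()
    }


# Stand-ins for adata.obs["leiden"] and adata.obs["batch"]
leiden = ["0", "0", "0", "1", "1"]
batch = ["A", "A", "B", "B", "B"]
fractions = composition(leiden, batch)
print(fractions)
```
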
 
def custom_heatmap(adata, groupby, features=None, vars_dict=None, cluster_obs=False, cluster_vars=False, groupby_order=None, groupby_colordict='native', layer=None, plot_type='dotplot', cmap='Greys', log_scale=None, linthresh=1.0, italicize_vars=True, colorbar_title='Mean expression\nin group', vars_axis='y', figsize=(5, 5), save=None, dpi=300, **kwargs)

Custom wrapper around sc.pl.dotplot, sc.pl.matrixplot, and sc.pl.stacked_violin

Parameters

adata : sc.AnnData
AnnData object to plot from
groupby : str
Categorical column of adata.obs to group dotplot by
features : list of str (default=None)
List of features from adata.obs.columns or adata.var_names to plot. If None, then vars_dict must be provided.
vars_dict : dict, optional (default=None)
Dictionary of groups of vars to highlight with brackets on dotplot. Keys are variable group names and values are list of variables found in features. If provided, features is ignored.
cluster_obs : bool, optional (default=False)
Hierarchically cluster groupby observations and show dendrogram
cluster_vars : bool, optional (default=False)
Hierarchically cluster features for a prettier dotplot. If True, return features in their new order (first return variable).
groupby_order : list, optional (default=None)
Explicit order for groups of observations from adata.obs[groupby]
groupby_colordict : dict, str, or None, optional (default='native')
Dictionary mapping groupby categories (keys) to colors (values) for the group axis text. If groupby_colordict == 'native', use the colors available in adata.uns[f'{groupby}_colors']. If None, don't color group text on plot.
layer : str
Key from adata.layers to use for plotting gene values
plot_type : str, optional (default="dotplot")
One of "dotplot", "matrixplot", "dotmatrix", "stacked_violin", or "heatmap"
cmap : str, optional (default="Greys")
matplotlib colormap for dots
log_scale : int, optional (default=None)
Set axis scale(s) to symlog. Numeric values are interpreted as the desired base (e.g. 10). When None, plot defers to the existing Axes scale.
linthresh : float, optional (default=1.0)
The range within which the color scale is linear (to avoid having the plot go to infinity around zero).
italicize_vars : bool, optional (default=True)
Whether or not to italicize variable names on plot
colorbar_title : str, optional (default="Mean expression\nin group")
Title for colorbar key
vars_axis : str, optional (default="y")
If "y", vars are shown on y-axis. If "x", vars are shown on x-axis.
figsize : tuple of float, optional (default=(5,5))
Size of the figure in inches
save : str or None, optional (default=None)
Path to file to save image to. If None, return figure object (second return variable if cluster_vars == True).
dpi : float, optional (default=300)
Resolution in dots per inch for saving figure. Ignored if save is None.
**kwargs
Keyword arguments to pass to sc.pl.dotplot, sc.pl.matrixplot, sc.pl.stacked_violin, or sc.pl.heatmap

Returns

features_ordered : list of str
If cluster_vars == True, reordered features based on hierarchical clustering
figure : sc._plotting object (DotPlot, matrixplot, stacked_violin)
If save == None, scanpy plotting object is returned
def decoupler_dotplot(df, x, y, c, s, largest_dot=50, cmap='coolwarm', title=None, figsize=(3, 5), ax=None, return_fig=False, save=None, dpi=200)

Plot results of decoupler enrichment analysis as dots.

Parameters

df : DataFrame
Results of enrichment analysis.
x : str
Column name of df to use as continuous value.
y : str
Column name of df to use as labels.
c : str
Column name of df to use for coloring.
s : str
Column name of df to use for dot size.
largest_dot : int
Parameter to control the size of the dots in points.
cmap : str
Colormap to use.
title : str, None
Text to write as title of the plot.
figsize : tuple
Figure size.
ax : Axes, None
A matplotlib axes object. If None returns new figure.
return_fig : bool
Whether to return a Figure object or not.
save : str, None
Path to where to save the plot. Infer the filetype if ending in {.pdf, .png, .svg}.
dpi : float, optional (default=200)
Resolution in dots per inch for saving figure. Ignored if save is None.

Returns

fig : matplotlib.Figure, None
If return_fig==True, returns figure object.
def decoupler_dotplot_facet(df, group_col='group', x='Combined score', y='Term', c='FDR p-value', s='Overlap ratio', top_n=None, ncols=4, figsize_scale=1.5, save=None, dpi=200, **kwargs)

Plot results of decoupler enrichment analysis as dots, faceted by group

Parameters

df : DataFrame
results of enrichment analysis.
group_col : str
column from df to facet by
x : str
column name of df to use as continuous value.
y : str
column name of df to use as labels.
c : str
column name of df to use for coloring.
s : str
column name of df to use for dot size.
top_n : int
number of top terms to plot per group. If None show all terms.
ncols : int
number of columns for faceting. If None use len(df[group_col].unique())
figsize_scale : float
scale size of matplotlib figure
save : str
path to file to save image to. If None, return axes objects
dpi : float, optional (default=200)
Resolution in dots per inch for saving figure. Ignored if save is None.
**kwargs
keyword arguments to pass to decoupler_dotplot() function

Returns

fig : matplotlib.Figure
Return figure object if save==None. Otherwise, write to save.
def jointgrid_boxplots_category(a, x, y, color, figheight=5, sig=True, bonferroni=False, cmap_dict=None, stripplot=True, outdir='./', save_prefix=None, dpi=300)

Jointplot with scatter between two variables and marginal boxplots showing distributions and stats across a third variable (color)

Parameters

a : Union[anndata.AnnData, pd.DataFrame]
The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
x : str
Column from a or a.obs to plot on x axis of jointgrid
y : str
Column from a or a.obs to plot on y axis of jointgrid
color : str
Column from a or a.obs containing categories for marginal boxplots and statistics in x and y.
figheight : float, optional (default=5)
Size of output figure in inches (it will be square)
sig : bool, optional (default=True)
Perform significance testing (2-way t-test) between all groups and add significance bars to marginal boxplots
bonferroni : bool, optional (default=False)
Adjust significance p-values with simple Bonferroni correction
cmap_dict : dictionary, optional (default=None)
Dictionary of group, color pairs from color to color boxes and points by
stripplot : bool, optional (default=True)
Plot stripplot with jittered points over marginal boxplots
outdir : str
Path to output directory for saving plots
save_prefix : str, optional (default=None)
Prefix to add to filenames for saving. If None, don't save anything.
dpi : float, optional (default=300)
Resolution in dots per inch for saving figure. Ignored if save_prefix is None.

Returns

If save_prefix!=None, saves plot as .png file to outdir and stats (if
sig==True) to .csv file.
g : sns.JointGrid
Plot object
sig_out : dict
Dictionary containing statistics (if sig==True)
def jointgrid_boxplots_threshold(a, x, y, color, x_thresh, figheight=5, sig=True, cmap_dict=None, stripplot=True, dodge_by_color=False, outdir='./', save_prefix=None, dpi=300)

Jointplot with scatter between two variables and marginal boxplots showing distributions and stats across a third variable (color, x values in y margin) and a threshold of x (x_thresh, y values in x margin)

Parameters

a : Union[anndata.AnnData, pd.DataFrame]
The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
x : str
Column from a or a.obs to plot on x axis of jointgrid
y : str
Column from a or a.obs to plot on y axis of jointgrid
color : str
Column from a or a.obs containing categories for marginal boxplots and statistics in x and y.
x_thresh : float
Threshold along x for which to split points by and plot marginal boxplots for y
figheight : float, optional (default=5)
Size of output figure in inches (it will be square)
sig : bool, optional (default=True)
Perform significance testing (2-way t-test) between all groups and add significance bars to marginal boxplots
cmap_dict : dictionary, optional (default=None)
Dictionary of group, color pairs from color to color boxes and points by
stripplot : bool, optional (default=True)
Plot stripplot with jittered points over marginal boxplots
dodge_by_color : bool, optional (default=False)
Dodge boxplots and stripplots in y-marginal axes by color. If False, only one boxplot per category ("-lo" and "-hi" based on x_thresh), with jitterplot points colored by color if stripplot==True.
outdir : str
Path to output directory for saving plots
save_prefix : str, optional (default=None)
Prefix to add to filenames for saving. If None, don't save anything.
dpi : float, optional (default=300)
Resolution in dots per inch for saving figure. Ignored if save_prefix is None.

Returns

If save_prefix!=None, saves plot as .png file to outdir and stats (if
sig==True) to .csv file.
g : sns.JointGrid
Plot object
sig_out : dict
Dictionary containing statistics (if sig==True)
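
The x_thresh split assigns each point to a "-lo" or "-hi" category. A minimal sketch of that labeling is given below, assuming values strictly above the threshold count as "-hi" and that the label is prefixed with a name. Both choices, and the helper threshold_labels, are assumptions rather than documented behavior.

```python
def threshold_labels(x_values, x_thresh, name="x"):
    """Label each value as '<name>-hi' or '<name>-lo' relative to x_thresh."""
    # Assumption: values equal to the threshold fall in the "-lo" category
    return [f"{name}-hi" if v > x_thresh else f"{name}-lo" for v in x_values]


labels = threshold_labels([0.2, 1.5, 1.0], x_thresh=1.0, name="score")
print(labels)
```
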
def list_union(lst1, lst2)

Combines two lists by the union of their values

Parameters

lst1, lst2 : list
lists to combine

Returns

final_list : list
union of values in lst1 and lst2
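
The implementation of list_union is not shown here. One plausible sketch keeps the first occurrence of each value in order. The original may instead use set(), which does not guarantee order, so list_union_sketch is an illustration only.

```python
def list_union_sketch(lst1, lst2):
    """Union of two lists, keeping first-seen order and dropping duplicates."""
    # dict.fromkeys preserves insertion order, so duplicates collapse in place
    return list(dict.fromkeys(list(lst1) + list(lst2)))


combined = list_union_sketch([1, 2, 3], [3, 4, 1])
print(combined)
```
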
def myexpm1(x, base=2, pseudocount=0.1, myround=False)

Custom expm1 function

def mylog1p(x, base=2, pseudocount=0.1)

Custom log1p function
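
Neither mylog1p nor myexpm1 is documented beyond its signature. A reasonable reading, sketched below under that assumption, is a log transform in an arbitrary base with a configurable pseudocount, and its inverse. The function names are hypothetical, and the myround argument of myexpm1 is left out.

```python
import math


def log_with_pseudocount(x, base=2, pseudocount=0.1):
    """Assumed form of mylog1p: log_base(x + pseudocount)."""
    return math.log(x + pseudocount, base)


def exp_with_pseudocount(x, base=2, pseudocount=0.1):
    """Assumed form of myexpm1: the inverse, base**x - pseudocount."""
    return base ** x - pseudocount


value = 5.0
roundtrip = exp_with_pseudocount(log_with_pseudocount(value))
print(roundtrip)
```

Under this reading the two functions are inverses, so a value transformed for plotting can be mapped back to the original scale, for example when labeling axis ticks.
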

def myround(x, n_precision=3)

Custom rounding function

def pie_from_col(df, col, title=None, figsize=(8, 8), save_to=None)

Create a pie chart from the values in a pd.DataFrame column

Parameters

df : pd.DataFrame
Dataframe from which to plot
col : str
Column in df to create pie chart for
title : str
Title of plot
figsize : tuple of float, optional (default=(8,8))
Size of figure
save_to : str, optional (default=None)
Path to image file to save plot to
def plot_embedding(adata, basis='X_umap', colors=None, show_clustering=True, ncols=5, n_cnmf_markers=7, figsize_scale=1.0, cmap='viridis', seed=18, save_to=None, verbose=True, **kwargs)

Plots reduced-dimension embeddings of single-cell dataset

Parameters

adata : anndata.AnnData
object containing preprocessed and dimension-reduced counts matrix
basis : str, optional (default="X_umap")
key from adata.obsm containing embedding coordinates
colors : list of str, optional (default=None)
colors to plot; can be genes or .obs columns
show_clustering : bool, optional (default=True)
plot PAGA graph and leiden clusters on first two axes
ncols : int, optional (default=5)
number of columns in gridspec
n_cnmf_markers : int, optional (default=7)
number of top genes to print on cNMF plots
figsize_scale : float, optional (default=1.0)
scaler for figure size. calculated using ncols to keep each panel square. values < 1.0 will compress figure, > 1.0 will expand.
cmap : str, optional (default="viridis")
valid color map for the plot
seed : int, optional (default=18)
random state for plotting PAGA
save_to : str, optional (default=None)
path to .png file for saving figure; default is plt.show()
verbose : bool, optional (default=True)
print updates to console
**kwargs : optional
args to pass to sc.pl.embedding (e.g. "size", "add_outline", etc.)

Returns

Plot of embedding with overlays from "colors" as a matplotlib gridspec object, unless save_to is not None.

def rank_genes_cnmf(adata, attr='varm', keys='cnmf_spectra', indices=None, labels=None, titles=None, color='black', n_points=20, ncols=5, log=False, show=None, figsize=(5, 5))

Plots rankings. [Adapted from scanpy.plotting._anndata.ranking]

See, for example, how this is used in pl.pca_ranking.

Parameters

adata : anndata.AnnData
the data
attr : str {'var', 'obs', 'uns', 'varm', 'obsm'}
the attribute of adata that contains the score
keys : str or list of str, optional (default="cnmf_spectra")
scores to look up an array from the attribute of adata
indices : list of int, optional (default=None)
the column indices of keys for which to plot (e.g. [0,1,2] for first three keys)
labels : list of str, optional (default=None)
Labels to use for features displayed as plt.text objects on the axes
titles : list of str, optional (default=None)
Labels for titles of each plot panel, in order
ncols : int, optional (default=5)
number of columns in gridspec
show : bool, optional (default=None)
show figure or just return axes
figsize : tuple of float, optional (default=(5,5))
size of matplotlib figure

Returns

matplotlib gridspec with access to the axes
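
A ranking plot of this kind shows the n_points highest-scoring features for each selected column. The selection step might look like the sketch below, assuming a simple descending sort on the score. top_features and the sample values are hypothetical.

```python
def top_features(scores, names, n_points=20):
    """Return the n_points highest-scoring (name, score) pairs, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(names[i], scores[i]) for i in order[:n_points]]


# Stand-in for one column of adata.varm["cnmf_spectra"]
ranked = top_features([0.1, 0.9, 0.5], ["g1", "g2", "g3"], n_points=2)
print(ranked)
```
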
 
def save_plot(fig, ax, save)
def significance_bar(start, end, height, displaystring, linewidth=1.2, markersize=8, boxpad=0.3, fontsize=15, color='k', ax=None, horizontal=False)

Draw significance bracket on matplotlib figure

def split_violin(a, features, groupby=None, groupby_order=None, splitby=None, splitby_order=None, pairby=None, points_colorby=None, layer=None, log_scale=None, pseudocount=1.0, scale='width', plot_type='violin', split=True, strip=True, jitter=True, size=1, panelsize=(3, 3), ncols=1, ylabel=None, titles=None, legend=True, save=None, dpi=300)

Plot genes grouped by one variable and split by another

Parameters

a : Union[anndata.AnnData, pd.DataFrame]
The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
features : list of str
List of genes, .obs columns, or DataFrame columns to plot (if a is pd.DataFrame).
groupby : str, optional (default=None)
Column from a or a.obs to group by (x variable)
groupby_order : list of str, optional (default=None)
List of values in a[groupby] or a.obs[groupby] specifying the order of groups on x-axis. If groupby is a list, groupby_order should also be a list with corresponding orders in each element.
splitby : str, optional (default=None)
Categorical .obs column to split violins by.
splitby_order : list of str, optional (default=None)
Order of categories in adata.obs[splitby].
pairby : str, optional (default=None)
Categorical .obs column identifying point pairings to draw lines between across groupby categories. Ignored if jitter==False.
points_colorby : str, optional (default=None)
Categorical .obs column to color stripplot points by.
layer : str, optional (default=None)
Key from layers attribute of adata if present
log_scale : int, optional (default=None)
Set axis scale(s) to log. Numeric values are interpreted as the desired base (e.g. 10). When None, plot defers to the existing Axes scale.
pseudocount : float, optional (default=1.0)
Pseudocount to add to values before log-transforming with base=log_scale
scale : str, optional (default="width")
See seaborn.violinplot.
plot_type : str, optional (default="violin")
"violin" for violinplot, "box" for boxplot
split : bool, optional (default=True)
Whether to split the violins or not.
strip : bool, optional (default=True)
Show a strip plot on top of the violin plot.
jitter : Union[int, float, bool], optional (default=True)
If set to 0, no points are drawn. See seaborn.stripplot.
size : int, optional (default=1)
Size of the jitter points
panelsize : tuple of int, optional (default=(3, 3))
Size of each panel in output figure in inches
ncols : int, optional (default=1)
Number of columns in gridspec. If None use len(features).
ylabel : str, optional (default=None)
Label for y axes. If None use "expression"
titles : list of str, optional (default=None)
Titles for each set of axes. If None use features.
legend : bool, optional (default=True)
Add legend to plot
save : str, optional (default=None)
Path to file to save image to. If None, return axes objects
dpi : float, optional (default=300)
Resolution in dots per inch for saving figure. Ignored if save is None.

Returns

fig : matplotlib.Figure
Return figure object if save==None. Otherwise, write to save.