Module kitchen.plotting
Custom plotting functions for data in pd.DataFrame, anndata.AnnData, and decoupler formats
Functions
def boxplots_group(a, features, groupby, groupby_order=None, groupby_colordict=None, pairby=None, layer=None, log_scale=None, pseudocount=1.0, sig=True, bonferroni=False, ylabel=None, titles=None, legend=True, size=3, panelsize=(3, 3), ncols=6, outdir='./', save_prefix='', dpi=300)
-
Plot trends from a.obs metadata. Save all plots as a grid in a single .png file.
Parameters
a : Union[anndata.AnnData, pd.DataFrame]
- The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
features : list of str
- List of genes, .obs columns, or DataFrame columns to plot (if a is pd.DataFrame) (y variable)
groupby : list of str
- Columns from a or a.obs to group by (x variable)
groupby_order : list of str, optional (default=None)
- List of values in a[groupby] or a.obs[groupby] specifying the order of groups on x-axis. If groupby is a list, groupby_order should also be a list with corresponding orders in each element.
groupby_colordict : dictionary, optional (default=None)
- Dictionary of group, color pairs from groupby to color boxes and points by
pairby : str, optional (default=None)
- Categorical .obs column identifying point pairings to draw lines between across groupby categories. Ignored if jitter==False.
layer : str, optional (default=None)
- Key from layers attribute of adata if present
log_scale : int, optional (default=None)
- Set axis scale(s) to log. Numeric values are interpreted as the desired base (e.g. 10). When None, plot defers to the existing Axes scale.
pseudocount : float, optional (default=1.0)
- Pseudocount to add to values before log-transforming with base=log_scale
sig : bool, optional (default=True)
- Perform significance testing (2-way t-test) between all groups and add significance bars to plot(s)
bonferroni : bool, optional (default=False)
- Adjust significance p-values with simple Bonferroni correction
ylabel : str, optional (default=None)
- Label for y axes. If None, use colors.
titles : list of str, optional (default=None)
- Titles for each set of axes. If None, use features.
legend : bool, optional (default=True)
- Add legend to plot
size : int, optional (default=3)
- Size of the jitter points
panelsize : tuple of float, optional (default=(3, 3))
- Size of each panel in output figure in inches
ncols : int, optional (default=6)
- Number of columns in gridspec
outdir : str, optional (default="./")
- Path to output directory for saving plots
save_prefix : str, optional (default="")
- Prefix to add to filenames for saving
dpi : float, optional (default=300)
- Resolution in dots per inch for saving figure. Ignored if save_prefix is None.
Returns
gs : gridspec.GridSpec
- Return gridspec object if save==None. Otherwise, write to .png in outdir/.
sig_out : dict
- Dictionary of t-test statistics if sig==True. Otherwise, write to .csv in outdir/.
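The bonferroni option is described only as a "simple Bonferroni correction". The sketch below shows what that adjustment usually means, assuming the textbook definition (multiply each p-value by the number of tests and cap the result at 1.0). The function name bonferroni_adjust is hypothetical and is not part of kitchen.plotting; the module's own implementation may differ.

```python
def bonferroni_adjust(pvalues):
    """Textbook Bonferroni: scale each p-value by the number of tests, cap at 1.0.

    Hypothetical helper for illustration; not taken from kitchen.plotting.
    """
    n_tests = len(pvalues)
    return [min(p * n_tests, 1.0) for p in pvalues]


# Three pairwise comparisons between groups
raw_pvalues = [0.01, 0.04, 0.5]
adjusted_pvalues = bonferroni_adjust(raw_pvalues)
```

With three tests, a raw p-value of 0.04 no longer passes a 0.05 cutoff after adjustment, which is why the number of significance bars can drop when bonferroni=True.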
def build_gridspec(panels, ncols, panelsize=(3, 3))
-
Create gridspec.GridSpec object from a list of panels
Parameters
panels : list or str
- List of panels in plot grid. If string, only one panel is made.
ncols : int
- Number of columns in grid. Number of rows is calculated from len(panels) and ncols.
panelsize : tuple of float, optional (default=(3, 3))
- Size in inches of each panel within the plot grid.
Returns
gs : gridspec.GridSpec
- GridSpec object
fig : matplotlib.Figure
- Figure object
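The row count is said to be calculated from len(panels) and ncols. A minimal sketch of that arithmetic follows, assuming a ceiling division and a figure size equal to the panel size times the grid dimensions. The helper name grid_layout and the figure-size formula are assumptions for illustration, not the module's confirmed behavior.

```python
import math


def grid_layout(panels, ncols, panelsize=(3, 3)):
    """Return (nrows, ncols, figsize) for a grid of equally sized panels.

    Hypothetical helper; the figsize formula is an assumption.
    """
    # A bare string counts as a single panel, as documented for `panels`
    if isinstance(panels, str):
        panels = [panels]
    # Enough rows to hold every panel; the last row may be partly empty
    nrows = math.ceil(len(panels) / ncols)
    # Assumed: total figure size scales with the number of rows and columns
    figsize = (ncols * panelsize[0], nrows * panelsize[1])
    return nrows, ncols, figsize


seven_panels = grid_layout(["g1", "g2", "g3", "g4", "g5", "g6", "g7"], ncols=3)
single_panel = grid_layout("g1", ncols=5)
```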
def cluster_pie(adata, pie_by='batch', groupby='leiden', ncols=5, show=None, figsize=(5, 5))
-
Plots pie graphs showing makeup of cluster groups
Parameters
adata : anndata.AnnData
- the data
pie_by : str, optional (default="batch")
- adata.obs column to split pie charts by
groupby : str, optional (default="leiden")
- adata.obs column to create pie charts for
ncols : int, optional (default=5)
- number of columns in gridspec
show : bool, optional (default=None)
- show figure or just return axes
figsize : tuple of float, optional (default=(5, 5))
- size of matplotlib figure
Returns
matplotlib gridspec with access to the axes
def custom_heatmap(adata, groupby, features=None, vars_dict=None, cluster_obs=False, cluster_vars=False, groupby_order=None, groupby_colordict='native', layer=None, plot_type='dotplot', cmap='Greys', log_scale=None, linthresh=1.0, italicize_vars=True, colorbar_title='Mean expression\nin group', vars_axis='y', figsize=(5, 5), save=None, dpi=300, **kwargs)
-
Custom wrapper around sc.pl.dotplot, sc.pl.matrixplot, and sc.pl.stacked_violin
Parameters
adata : sc.AnnData
- AnnData object to plot from
groupby : str
- Categorical column of adata.obs to group dotplot by
features : list of str (default=None)
- List of features from adata.obs.columns or adata.var_names to plot. If None, then vars_dict must be provided.
vars_dict : dict, optional (default=None)
- Dictionary of groups of vars to highlight with brackets on dotplot. Keys are variable group names and values are lists of variables found in features. If provided, features is ignored.
cluster_obs : bool, optional (default=False)
- Hierarchically cluster groupby observations and show dendrogram
cluster_vars : bool, optional (default=False)
- Hierarchically cluster features for a prettier dotplot. If True, return features in their new order (first return variable).
groupby_order : list, optional (default=None)
- Explicit order for groups of observations from adata.obs[groupby]
groupby_colordict : dict, str, or None, optional (default='native')
- Dictionary mapping groupby categories (keys) to colors (values) for the group axis text. If groupby_colordict == 'native', use the colors available in adata.uns[f'{groupby}_colors']. If None, don't color group text on plot.
layer : str
- Key from adata.layers to use for plotting gene values
plot_type : str, optional (default="dotplot")
- One of "dotplot", "matrixplot", "dotmatrix", "stacked_violin", or "heatmap"
cmap : str, optional (default="Greys")
- matplotlib colormap for dots
log_scale : int, optional (default=None)
- Set axis scale(s) to symlog. Numeric values are interpreted as the desired base (e.g. 10). When None, plot defers to the existing Axes scale.
linthresh : float, optional (default=1.0)
- The range within which the color scale is linear (to avoid having the plot go to infinity around zero).
italicize_vars : bool, optional (default=True)
- Whether or not to italicize variable names on plot
colorbar_title : str, optional (default="Mean expression\nin group")
- Title for colorbar key
vars_axis : str, optional (default="y")
- If "y", vars are shown on y-axis. If "x", vars are shown on x-axis.
figsize : tuple of float, optional (default=(5, 5))
- Size of the figure in inches
save : str or None, optional (default=None)
- Path to file to save image to. If None, return figure object (second return variable if cluster_vars == True).
dpi : float, optional (default=300)
- Resolution in dots per inch for saving figure. Ignored if save is None.
**kwargs
- Keyword arguments to pass to sc.pl.dotplot, sc.pl.matrixplot, sc.pl.stacked_violin, or sc.pl.heatmap
Returns
features_ordered : list of str
- If cluster_vars == True, reordered features based on hierarchical clustering
figure : sc._plotting object (DotPlot, matrixplot, stacked_violin)
- If save == None, scanpy plotting object is returned
def decoupler_dotplot(df, x, y, c, s, largest_dot=50, cmap='coolwarm', title=None, figsize=(3, 5), ax=None, return_fig=False, save=None, dpi=200)
-
Plot results of decoupler enrichment analysis as dots.
Parameters
df : DataFrame
- Results of enrichment analysis.
x : str
- Column name of df to use as continuous value.
y : str
- Column name of df to use as labels.
c : str
- Column name of df to use for coloring.
s : str
- Column name of df to use for dot size.
largest_dot : int
- Parameter to control the size of the dots in points.
cmap : str
- Colormap to use.
title : str, None
- Text to write as title of the plot.
figsize : tuple
- Figure size.
ax : Axes, None
- A matplotlib axes object. If None, returns new figure.
return_fig : bool
- Whether to return a Figure object or not.
save : str, None
- Path to where to save the plot. Infer the filetype if ending in {.pdf, .png, .svg}.
dpi : float, optional (default=200)
- Resolution in dots per inch for saving figure. Ignored if save is None.
Returns
fig : matplotlib.Figure, None
- If return_fig==True, returns figure object.
def decoupler_dotplot_facet(df, group_col='group', x='Combined score', y='Term', c='FDR p-value', s='Overlap ratio', top_n=None, ncols=4, figsize_scale=1.5, save=None, dpi=200, **kwargs)
-
Plot results of decoupler enrichment analysis as dots, faceted by group
Parameters
df : DataFrame
- results of enrichment analysis.
group_col : str
- column from df to facet by
x : str
- column name of df to use as continuous value.
y : str
- column name of df to use as labels.
c : str
- column name of df to use for coloring.
s : str
- column name of df to use for dot size.
top_n : int
- number of top terms to plot per group. If None, show all terms.
ncols : int
- number of columns for faceting. If None, use len(df[group_col].unique())
figsize_scale : float
- scale size of matplotlib figure
save : str
- path to file to save image to. If None, return axes objects
dpi : float, optional (default=200)
- Resolution in dots per inch for saving figure. Ignored if save is None.
**kwargs
- keyword arguments to pass to decoupler_dotplot() function
Returns
fig : matplotlib.Figure
- Return figure object if save==None. Otherwise, write to save.
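The top_n parameter limits each facet to its top terms. The sketch below illustrates one way such a per-group selection can work on plain dictionaries. It assumes that "top" means the highest value in the x column ("Combined score" by default), which the documentation above does not confirm; the helper name top_n_per_group is hypothetical.

```python
from collections import defaultdict


def top_n_per_group(rows, group_col="group", score_col="Combined score", top_n=None):
    """Keep the top_n highest-scoring rows within each group.

    Hypothetical helper; ranking by score_col (descending) is an assumption.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[group_col]].append(row)
    selected = {}
    for group, members in grouped.items():
        ranked = sorted(members, key=lambda r: r[score_col], reverse=True)
        # top_n=None keeps every term, matching the documented default
        selected[group] = ranked if top_n is None else ranked[:top_n]
    return selected


enrichment_rows = [
    {"group": "A", "Term": "t1", "Combined score": 5.0},
    {"group": "A", "Term": "t2", "Combined score": 9.0},
    {"group": "A", "Term": "t3", "Combined score": 1.0},
    {"group": "B", "Term": "t4", "Combined score": 2.0},
]
top_terms = top_n_per_group(enrichment_rows, top_n=2)
```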
def jointgrid_boxplots_category(a, x, y, color, figheight=5, sig=True, bonferroni=False, cmap_dict=None, stripplot=True, outdir='./', save_prefix=None, dpi=300)
-
Jointplot with scatter between two variables and marginal boxplots showing distributions and stats across a third variable (color)
Parameters
a : Union[anndata.AnnData, pd.DataFrame]
- The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
x : str
- Column from a or a.obs to plot on x axis of jointgrid
y : str
- Column from a or a.obs to plot on y axis of jointgrid
color : str
- Column from a or a.obs containing categories for marginal boxplots and statistics in x and y.
figheight : float, optional (default=5)
- Size of output figure in inches (it will be square)
sig : bool, optional (default=True)
- Perform significance testing (2-way t-test) between all groups and add significance bars to marginal boxplots
bonferroni : bool, optional (default=False)
- Adjust significance p-values with simple Bonferroni correction
cmap_dict : dictionary, optional (default=None)
- Dictionary of group, color pairs from color to color boxes and points by
stripplot : bool, optional (default=True)
- Plot stripplot with jittered points over marginal boxplots
outdir : str
- Path to output directory for saving plots
save_prefix : str, optional (default=None)
- Prefix to add to filenames for saving. If None, don't save anything.
dpi : float, optional (default=300)
- Resolution in dots per inch for saving figure. Ignored if save_prefix is None.
Returns
- If save_prefix!=None, saves plot as .png file to outdir and stats (if sig==True) to .csv file.
g : sns.JointGrid
- Plot object
sig_out : dict
- Dictionary containing statistics (if sig==True)
def jointgrid_boxplots_threshold(a, x, y, color, x_thresh, figheight=5, sig=True, cmap_dict=None, stripplot=True, dodge_by_color=False, outdir='./', save_prefix=None, dpi=300)
-
Jointplot with scatter between two variables and marginal boxplots showing distributions and stats across a third variable (color, x values in y margin) and a threshold of x (x_thresh, y values in x margin)
Parameters
a : Union[anndata.AnnData, pd.DataFrame]
- The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
x : str
- Column from a or a.obs to plot on x axis of jointgrid
y : str
- Column from a or a.obs to plot on y axis of jointgrid
color : str
- Column from a or a.obs containing categories for marginal boxplots and statistics in x and y.
x_thresh : float
- Threshold along x by which to split points and plot marginal boxplots for y
figheight : float, optional (default=5)
- Size of output figure in inches (it will be square)
sig : bool, optional (default=True)
- Perform significance testing (2-way t-test) between all groups and add significance bars to marginal boxplots
cmap_dict : dictionary, optional (default=None)
- Dictionary of group, color pairs from color to color boxes and points by
stripplot : bool, optional (default=True)
- Plot stripplot with jittered points over marginal boxplots
dodge_by_color : bool, optional (default=False)
- Dodge boxplots and stripplots in y-marginal axes by color. If False, only one boxplot per category ("-lo" and "-hi" based on x_thresh), with jitterplot points colored by color if stripplot==True.
outdir : str
- Path to output directory for saving plots
save_prefix : str, optional (default=None)
- Prefix to add to filenames for saving. If None, don't save anything.
dpi : float, optional (default=300)
- Resolution in dots per inch for saving figure. Ignored if save_prefix is None.
Returns
- If save_prefix!=None, saves plot as .png file to outdir and stats (if sig==True) to .csv file.
g : sns.JointGrid
- Plot object
sig_out : dict
- Dictionary containing statistics (if sig==True)
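The x_thresh parameter splits points into "-lo" and "-hi" categories along x. A minimal sketch of that labeling step follows. The helper name label_by_threshold, the use of the x column name as the label prefix, and the choice to treat values equal to the threshold as "-lo" are all assumptions, since the documentation above does not specify them.

```python
def label_by_threshold(values, x_thresh, name="x"):
    """Label each value as '<name>-hi' or '<name>-lo' relative to x_thresh.

    Hypothetical helper; values equal to the threshold are assumed to be 'lo'.
    """
    return [f"{name}-hi" if v > x_thresh else f"{name}-lo" for v in values]


x_values = [0.2, 1.5, 1.0, 3.7]
threshold_labels = label_by_threshold(x_values, x_thresh=1.0, name="CD8A")
```

The resulting two categories are what the marginal boxplots for y would be grouped by.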
def list_union(lst1, lst2)
-
Combines two lists by the union of their values
Parameters
lst1, lst2 : list
- lists to combine
Returns
final_list : list
- union of values in lst1 and lst2
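As an illustration of combining two lists by the union of their values, here is one possible approach that keeps first-seen order. Whether list_union itself preserves order is not stated above, so this is a sketch under that assumption; the name ordered_union is hypothetical.

```python
def ordered_union(lst1, lst2):
    """Union of two lists, keeping the order in which values first appear.

    Hypothetical helper; kitchen.plotting.list_union may order results differently.
    """
    # dict keys are unique and keep insertion order (Python 3.7+)
    return list(dict.fromkeys(list(lst1) + list(lst2)))


combined_genes = ordered_union(["CD3E", "CD8A"], ["CD8A", "CD4"])
```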
def myexpm1(x, base=2, pseudocount=0.1, myround=False)
-
Custom expm1 function
def mylog1p(x, base=2, pseudocount=0.1)
-
Custom log1p function
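myexpm1 and mylog1p are documented only as custom expm1 and log1p functions with base and pseudocount arguments. The sketch below shows one common reading of such a pair, assuming the forward transform is log_base(x + pseudocount) and the inverse is base**y - pseudocount. These formulas and the function names are assumptions, not the module's confirmed definitions, and the myround argument of myexpm1 is not modeled.

```python
import math


def log_with_pseudocount(x, base=2, pseudocount=0.1):
    """Assumed forward transform: log of (x + pseudocount) in the given base."""
    return math.log(x + pseudocount, base)


def exp_minus_pseudocount(y, base=2, pseudocount=0.1):
    """Assumed inverse transform: base**y minus the pseudocount."""
    return base ** y - pseudocount


# Round trip: the inverse should recover the original value
original_value = 5.0
transformed = log_with_pseudocount(original_value)
recovered = exp_minus_pseudocount(transformed)
```

The pseudocount keeps the transform defined at zero, where a plain logarithm would not be.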
def myround(x, n_precision=3)
-
Custom rounding function
def pie_from_col(df, col, title=None, figsize=(8, 8), save_to=None)
-
Create a pie chart from the values in a pd.DataFrame column
Parameters
df : pd.DataFrame
- Dataframe from which to plot
col : str
- Column in df to create pie chart for
title : str
- Title of plot
figsize : tuple of float, optional (default=(8, 8))
- Size of figure
save_to : str, optional (default=None)
- Path to image file to save plot to
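A pie chart of a column shows the share of each distinct value. The sketch below computes those shares from a plain list standing in for a DataFrame column, using only the standard library. The helper name column_fractions is hypothetical, and pie_from_col may compute its wedges differently.

```python
from collections import Counter


def column_fractions(values):
    """Return the fraction of entries taken by each distinct value.

    Hypothetical helper illustrating the summary a pie chart displays.
    """
    counts = Counter(values)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


batch_column = ["batch1", "batch1", "batch2", "batch3"]
batch_fractions = column_fractions(batch_column)
```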
def plot_embedding(adata, basis='X_umap', colors=None, show_clustering=True, ncols=5, n_cnmf_markers=7, figsize_scale=1.0, cmap='viridis', seed=18, save_to=None, verbose=True, **kwargs)
-
Plots reduced-dimension embeddings of single-cell dataset
Parameters
adata : anndata.AnnData
- object containing preprocessed and dimension-reduced counts matrix
basis : str, optional (default="X_umap")
- embedding to plot; key from adata.obsm containing embedding coordinates
colors : list of str, optional (default=None)
- colors to plot; can be genes or .obs columns
show_clustering : bool, optional (default=True)
- plot PAGA graph and leiden clusters on first two axes
ncols : int, optional (default=5)
- number of columns in gridspec
n_cnmf_markers : int, optional (default=7)
- number of top genes to print on cNMF plots
figsize_scale : float, optional (default=1.0)
- scaler for figure size. calculated using ncols to keep each panel square. values < 1.0 will compress figure, > 1.0 will expand.
cmap : str, optional (default="viridis")
- valid color map for the plot
seed : int, optional (default=18)
- random state for plotting PAGA
save_to : str, optional (default=None)
- path to .png file for saving figure; default is plt.show()
verbose : bool, optional (default=True)
- print updates to console
**kwargs : optional
- args to pass to sc.pl.embedding (e.g. "size", "add_outline", etc.)
Returns
plot of embedding with overlays from "colors" as matplotlib gridspec object, unless save_to is not None.
def rank_genes_cnmf(adata, attr='varm', keys='cnmf_spectra', indices=None, labels=None, titles=None, color='black', n_points=20, ncols=5, log=False, show=None, figsize=(5, 5))
-
Plots rankings. [Adapted from scanpy.plotting._anndata.ranking]
See, for example, how this is used in pl.pca_ranking.
Parameters
adata : anndata.AnnData
- the data
attr : str {'var', 'obs', 'uns', 'varm', 'obsm'}
- the attribute of adata that contains the score
keys : str or list of str, optional (default="cnmf_spectra")
- scores to look up an array from the attribute of adata
indices : list of int, optional (default=None)
- the column indices of keys for which to plot (e.g. [0,1,2] for first three keys)
labels : list of str, optional (default=None)
- Labels to use for features displayed as plt.txt objects on the axes
titles : list of str, optional (default=None)
- Labels for titles of each plot panel, in order
ncols : int, optional (default=5)
- number of columns in gridspec
show : bool, optional (default=None)
- show figure or just return axes
figsize : tuple of float, optional (default=(5, 5))
- size of matplotlib figure
Returns
matplotlib gridspec with access to the axes
def save_plot(fig, ax, save)
def significance_bar(start, end, height, displaystring, linewidth=1.2, markersize=8, boxpad=0.3, fontsize=15, color='k', ax=None, horizontal=False)
-
Draw significance bracket on matplotlib figure
def split_violin(a, features, groupby=None, groupby_order=None, splitby=None, splitby_order=None, pairby=None, points_colorby=None, layer=None, log_scale=None, pseudocount=1.0, scale='width', plot_type='violin', split=True, strip=True, jitter=True, size=1, panelsize=(3, 3), ncols=1, ylabel=None, titles=None, legend=True, save=None, dpi=300)
-
Plot genes grouped by one variable and split by another
Parameters
a : Union[anndata.AnnData, pd.DataFrame]
- The annotated data matrix of shape n_obs by n_vars. Rows correspond to samples and columns to genes. Can also be pd.DataFrame.
features : list of str
- List of genes, .obs columns, or DataFrame columns to plot (if a is pd.DataFrame).
groupby : str, optional (default=None)
- Column from a or a.obs to group by (x variable)
groupby_order : list of str, optional (default=None)
- List of values in a[groupby] or a.obs[groupby] specifying the order of groups on x-axis. If groupby is a list, groupby_order should also be a list with corresponding orders in each element.
splitby : str, optional (default=None)
- Categorical .obs column to split violins by.
splitby_order : list of str, optional (default=None)
- Order of categories in adata.obs[splitby].
pairby : str, optional (default=None)
- Categorical .obs column identifying point pairings to draw lines between across groupby categories. Ignored if jitter==False.
points_colorby : str, optional (default=None)
- Categorical .obs column to color stripplot points by.
layer : str, optional (default=None)
- Key from layers attribute of adata if present
log_scale : int, optional (default=None)
- Set axis scale(s) to log. Numeric values are interpreted as the desired base (e.g. 10). When None, plot defers to the existing Axes scale.
pseudocount : float, optional (default=1.0)
- Pseudocount to add to values before log-transforming with base=log_scale
scale : str, optional (default="width")
- See seaborn.violinplot.
plot_type : str, optional (default="violin")
- "violin" for violinplot, "box" for boxplot
split : bool, optional (default=True)
- Whether to split the violins or not.
strip : bool, optional (default=True)
- Show a strip plot on top of the violin plot.
jitter : Union[int, float, bool], optional (default=True)
- If set to 0, no points are drawn. See seaborn.stripplot.
size : int, optional (default=1)
- Size of the jitter points
panelsize : tuple of int, optional (default=(3, 3))
- Size of each panel in output figure in inches
ncols : int, optional (default=1)
- Number of columns in gridspec. If None, use len(features).
ylabel : str, optional (default=None)
- Label for y axes. If None, use "expression"
titles : list of str, optional (default=None)
- Titles for each set of axes. If None, use features.
legend : bool, optional (default=True)
- Add legend to plot
save : str, optional (default=None)
- Path to file to save image to. If None, return axes objects
dpi : float, optional (default=300)
- Resolution in dots per inch for saving figure. Ignored if save is None.
Returns
fig : matplotlib.Figure
- Return figure object if save==None. Otherwise, write to save.