PyComplexHeatmap.clustermap module

class PyComplexHeatmap.clustermap.ClusterMapPlotter(data, z_score=None, standard_scale=None, top_annotation=None, bottom_annotation=None, left_annotation=None, right_annotation=None, row_cluster=True, col_cluster=True, row_cluster_method='average', row_cluster_metric='correlation', col_cluster_method='average', col_cluster_metric='correlation', show_rownames=False, show_colnames=False, row_names_side='right', col_names_side='bottom', xticklabels_kws=None, yticklabels_kws=None, row_dendrogram=False, col_dendrogram=False, row_dendrogram_size=10, col_dendrogram_size=10, row_split=None, col_split=None, row_dendrogram_kws=None, col_dendrogram_kws=None, tree_kws=None, row_split_order=None, col_split_order=None, row_split_gap=0.5, col_split_gap=0.2, mask=None, subplot_gap=1, legend=True, legend_kws=None, plot=True, plot_legend=True, legend_anchor='auto', legend_gap=7, legend_width=None, legend_hpad=1, legend_vpad=5, legend_side='right', cmap='jet', label=None, xlabel=None, ylabel=None, xlabel_kws=None, ylabel_kws=None, xlabel_side='bottom', ylabel_side='left', xlabel_bbox_kws=None, ylabel_bbox_kws=None, rasterized='auto', legend_delta_x=None, verbose=1, **kwargs)[source]

Bases: object

Clustermap (Heatmap) plotter. Plot heatmap / clustermap with annotation and legends.

Parameters:
  • data (dataframe) – pandas dataframe or numpy array.

  • z_score (int) – whether to apply z-score scaling, either 0 for rows or 1 for columns; after scaling, each row or column has a mean of 0 and a variance of 1.

  • standard_scale (int) – either 0 for rows or 1 for columns; after scaling, values range from 0 to 1.

  • top_annotation (HeatmapAnnotation) – annotation placed at the top of the heatmap, an instance of class HeatmapAnnotation.

  • bottom_annotation (class AnnotationBase) – the same as top_annotation.

  • left_annotation (class AnnotationBase) – the same as top_annotation.

  • right_annotation (class AnnotationBase) – the same as top_annotation.

  • row_cluster (bool) – whether to perform clustering on rows.

  • col_cluster (bool) – whether to perform clustering on columns.

  • row_cluster_method (str) – linkage method for clustering rows, such as single, complete, average, weighted, centroid, median, ward. See scipy.cluster.hierarchy.linkage (https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html) for details.

  • row_cluster_metric (str) – metric for pairwise distances between observations in n-dimensional space when clustering rows, such as euclidean, minkowski, cityblock, seuclidean, cosine, correlation, hamming, jaccard, chebyshev, canberra, braycurtis, mahalanobis, kulsinski, etc. See scipy.spatial.distance.pdist (https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.pdist.html) for details. Try metric=’canberra’ if two columns have the same values.

  • col_cluster_method (str) – same as row_cluster_method

  • col_cluster_metric (str) – same as row_cluster_metric

  • show_rownames (bool) – True or False (default), whether to show row ticklabels.

  • show_colnames (bool) – True or False (default), whether to show column ticklabels.

  • row_names_side (str) – right or left.

  • col_names_side (str) – top or bottom.

  • row_dendrogram (bool) – True or False, whether to show dendrogram.

  • col_dendrogram (bool) – True or False, whether to show dendrogram.

  • row_dendrogram_size (int) – default is 10mm.

  • col_dendrogram_size (int) – default is 10mm.

  • row_split (int or pd.Series or pd.DataFrame) – number of clusters for hierarchical clustering (int), or a pd.Series or pd.DataFrame, used to split rows into subplots.

  • col_split (int or pd.Series or pd.DataFrame) – same as row_split, used to split columns into subplots.

  • row_dendrogram_kws (dict) – custom linkage could be passed to row_dendrogram_kws, for example: row_dendrogram_kws=dict(linkage=my_linkage); Other kws passed to hierarchy.dendrogram.

  • col_dendrogram_kws (dict) – custom linkage could be passed to col_dendrogram_kws, for example: col_dendrogram_kws=dict(linkage=my_linkage); Other kws passed to hierarchy.dendrogram.

  • tree_kws (dict) – kws passed to DendrogramPlotter.plot()

  • row_split_order (list or str) – a list specifying the order of row_split, or ‘cluster_between_groups’. If cluster_between_groups is specified, hierarchical clustering is performed on the mean values of each group and the clustered order is passed to row_split_order. For example, see https://dingwb.github.io/PyComplexHeatmap/build/html/notebooks/advanced_usage.html#Cluster-between-groups

  • col_split_order (list or str) – a list specifying the order of col_split, or ‘cluster_between_groups’. If cluster_between_groups is specified, hierarchical clustering is performed on the mean values of each group and the clustered order is passed to col_split_order.

  • row_split_gap (float) – gap between row splits, default is 0.5 mm.

  • col_split_gap (float) – gap between column splits, default is 0.2 mm.

  • mask (dataframe or array) – mask the data in the heatmap; cells with missing or infinite values will be masked automatically.

  • subplot_gap (float) – the gap between subplots, default is 1mm.

  • legend (bool) – True or False, whether to plot heatmap legend, determined by cmap.

  • legend_kws (dict) –

    vmax, vmin and other kws passed to plot legend, such as fontsize, labelcolor, numpoints, markerscale, markerfirst, frameon, shadow, facecolor, edgecolor, title, title_fontsize, labelspacing and so on (see ?plt.legend).

    Alternatively, we can also change the outline color and linewidth of cbar after plotting:

        cm = ClusterMapPlotter(…)
        for cbar in cm.cbars:
            if isinstance(cbar, matplotlib.colorbar.Colorbar):
                cbar.outline.set_color('white')
                cbar.outline.set_linewidth(2)
                cbar.dividers.set_color('red')
                cbar.dividers.set_linewidth(2)

  • plot (bool) – whether to plot or not.

  • plot_legend (bool) – True or False, whether to plot the legend; if False, legends can be plotted later with ClusterMapPlotter.plot_legends().

  • legend_anchor (str) – ax_heatmap or ax, the ax to which the legend is anchored; default is ‘auto’.

  • legend_gap (float) – the columns gap between different legends.

  • legend_width (float [mm]) – width of the legend, default is None (infer from data automatically)

  • legend_hpad (float) – Horizontal space between the heatmap and legend, default is 1 [mm].

  • legend_vpad (float) – Vertical space between the top of legend_anchor and legend, default is 5 [mm].

  • legend_side (str) – right or left.

  • cmap (str) – default is ‘jet’, the colormap for heatmap colorbar, see plt.colormaps().

  • label (str) – the title (label) that will be shown in heatmap colorbar legend.

  • xticklabels_kws (dict) – kws for the xticklabels, such as axis, which, direction, length, width, color, pad, labelsize, labelcolor, colors, zorder, bottom, top, left, right, labelbottom, labeltop, labelleft, labelright, labelrotation, grid_color, grid_linestyle and so on. For more information, see ?matplotlib.axes.Axes.tick_params or ?ax.tick_params.

  • yticklabels_kws (dict) – the same as xticklabels_kws.

  • xlabel (str) – default is None (no xlabel would be shown).

  • ylabel (str) – default is None (no ylabel would be shown).

  • xlabel_kws (dict) –

    alpha, color, fontfamily, fontname, fontproperties, fontsize, fontstyle, fontweight, label, rasterized, rotation, rotation_mode (default, anchor), visible, zorder, verticalalignment, horizontalalignment. See ax.xaxis.label.properties(), for example:

    cm=ClusterMapPlotter(***), print(cm.ax.xaxis.label.properties())

    or matplotlib.axis.XAxis.label.properties() for detail.

  • ylabel_kws (dict) – same as xlabel_kws.

  • xlabel_side (str) – bottom or top, default is bottom.

  • ylabel_side (str) – left or right, default is left.

  • xlabel_bbox_kws (dict) –

    alpha, clip_box, clip_on, edgecolor, facecolor, fill, height, in_layout, label, linestyle, linewidth, rasterized, visible, width. See ax.xaxis.label.get_bbox_patch().properties() for more information. For example:

    cm=ClusterMapPlotter(***), print(cm.ax.xaxis.label.get_bbox_patch().properties())

  • ylabel_bbox_kws (dict) – same as xlabel_bbox_kws

  • rasterized (bool or str) – default is ‘auto’: when the number of rows or the number of columns is greater than 5000, rasterized is automatically set to True to speed up plotting.

  • kwargs – kws passed to plot_heatmap, including vmin, vmax, center, robust, annot, annot_kws, fmt, mask, linewidths, linecolor, na_col, cbar, cbar_kws, xticklabels/yticklabels and so on (see ?PyComplexHeatmap.clustermap.plot_heatmap). If annot is True, the values of data will be plotted on top of the heatmap; if annot is a dataframe, the custom values will be plotted instead, and fmt should be set to None if the dtype of annot is str. For documentation of custom annot, see https://dingwb.github.io/PyComplexHeatmap/build/html/notebooks/advanced_usage.html#Custom-annotation. xticklabels/yticklabels are shown automatically; if the width/height is too small to display all ticklabels, only some of them are shown (to avoid overlap). To force display of all ticklabels, set xticklabels/yticklabels to True.

Return type:

Class ClusterMapPlotter.
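
A minimal usage sketch of the constructor, assuming pandas, matplotlib and PyComplexHeatmap are installed. The keyword names and values come from the signature above; the helper name `draw_clustermap`, the `CLUSTERMAP_KWS` dictionary and the figure size are illustrative assumptions, not part of the library.

```python
# Keyword arguments taken from the documented ClusterMapPlotter signature.
CLUSTERMAP_KWS = dict(
    z_score=0,                         # 0 = scale rows, 1 = scale columns
    row_cluster=True,
    col_cluster=True,
    row_cluster_method="average",
    row_cluster_metric="correlation",
    show_rownames=True,                # off by default
    show_colnames=True,                # off by default
    row_dendrogram=True,
    row_split_gap=0.5,                 # mm
    cmap="jet",
    label="z-score",                   # title of the colorbar legend
)


def draw_clustermap(df, row_groups=None, figsize=(6, 8)):
    """Draw a clustered heatmap of ``df`` (hypothetical helper).

    ``row_groups`` is forwarded to ``row_split``: an int (number of
    clusters) or a pandas Series/DataFrame indexed like ``df``.
    """
    # Imported lazily so this sketch can be loaded without the plotting stack.
    import matplotlib.pyplot as plt
    from PyComplexHeatmap.clustermap import ClusterMapPlotter

    plt.figure(figsize=figsize)
    return ClusterMapPlotter(data=df, row_split=row_groups, **CLUSTERMAP_KWS)
```

With a numeric DataFrame `df`, `cm = draw_clustermap(df, row_groups=3)` would build the plot on the current figure, which can then be shown or saved with the usual matplotlib calls.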

cal_cold_between_groups(col_clusters)[source]
cal_rowd_between_groups(row_clusters)[source]
calculate_col_dendrograms(data, sizes=None, use_linkage=True)[source]
calculate_row_dendrograms(data, sizes=None, use_linkage=True)[source]
collect_legends()[source]
format_data(data, mask=None, z_score=None, standard_scale=None)[source]
plot(ax=None, subplot_spec=None, row_order=None, col_order=None)[source]
plot_dendrograms(row_order, col_order)[source]
plot_legends(ax=None)[source]
plot_matrix(row_order, col_order)[source]
post_processing()[source]
set_axes_labels_kws()[source]
set_height(fig, height)[source]
set_width(fig, width)[source]
set_xy_labels()[source]
static standard_scale(data2d, axis=1)[source]

Divide the data by the difference between the max and min

Parameters:
  • data2d (pandas.DataFrame) – Data to normalize

  • axis (int) – Which axis to normalize across. If 0, normalize across rows, if 1, normalize across columns.

Returns:

standardized – Normalized data with values ranging from 0 to 1 across the specified axis.

Return type:

pandas.DataFrame
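
The arithmetic can be sketched without any dependencies. This is an illustration on plain lists, not the library's implementation (which works on a pandas.DataFrame); subtracting the minimum before dividing is inferred from the stated 0 to 1 range, and the function name `min_max_scale` is hypothetical. The axis convention (0 rescales each row, 1 rescales each column) follows the parameter description above.

```python
def min_max_scale(matrix, axis=1):
    """Rescale a list-of-lists matrix to [0, 1] (illustrative sketch).

    axis=0 rescales each row, axis=1 rescales each column.
    """
    def scale(values):
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant vector has span 0; NaN flags the undefined result.
        return [(v - lo) / span if span else float("nan") for v in values]

    if axis == 0:
        return [scale(row) for row in matrix]
    # Columns: transpose, scale each column, transpose back.
    scaled_cols = [scale(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*scaled_cols)]


data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
by_column = min_max_scale(data, axis=1)  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
by_row = min_max_scale(data, axis=0)     # every row becomes [0.0, 1.0]
```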

tight_layout(**tight_params)[source]
static z_score(data2d, axis=1)[source]

Standardize the mean and variance of the data along the given axis

Parameters:
  • data2d (pandas.DataFrame) – Data to normalize

  • axis (int) – Which axis to normalize across. If 0, normalize across rows, if 1, normalize across columns.

Returns:

normalized – Normalized data with a mean of 0 and variance of 1 across the specified axis.

Return type:

pandas.DataFrame
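
As a dependency-free sketch of the same idea on plain lists: each value has the mean subtracted and is divided by the standard deviation. The function name `zscore` is hypothetical, and the use of the sample standard deviation (ddof=1) is an assumption of this sketch, not a statement about the library's internals.

```python
import statistics


def zscore(matrix, axis=1):
    """Standardize a list-of-lists matrix (illustrative sketch).

    axis=0 standardizes each row, axis=1 standardizes each column.
    """
    def standardize(values):
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample standard deviation (assumed)
        return [(v - mean) / sd for v in values]

    if axis == 0:
        return [standardize(row) for row in matrix]
    # Columns: transpose, standardize each column, transpose back.
    cols = [standardize(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*cols)]


data = [[1.0, 2.0, 3.0], [2.0, 4.0, 12.0]]
by_row = zscore(data, axis=0)
# by_row[0] is [-1.0, 0.0, 1.0]; by_row[1] contains a value above 1,
# so z-scores are centered on 0 but not confined to [-1, 1].
```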

class PyComplexHeatmap.clustermap.DendrogramPlotter(data=None, linkage=None, metric='correlation', method='average', axis=0, label=True, rotate=False, sizes=None, dendrogram_kws=None)[source]

Bases: object

calculate_dendrogram()[source]
property calculated_linkage
check_array(data)[source]
get_coords(ax, gap_pixel=None, root_x=None)[source]
plot(ax, gap_pixel=None, root_x=None, tree_kws=None)[source]

Plots a dendrogram of the similarities between data on the axes.

Parameters:
  • ax (matplotlib.axes.Axes) – Axes object upon which the dendrogram is plotted.

property reordered_ind

Indices of the matrix, reordered by the dendrogram

PyComplexHeatmap.clustermap.composite(cmlist=None, main=0, ax=None, axis=1, row_gap=15, col_gap=15, legend_side='right', legend_gap=5, legend_y=0.8, legend_hpad=None, legend_width=None, width_ratios=None, height_ratios=None, verbose=1)[source]

Assemble multiple ClusterMapPlotter objects vertically or horizontally together.

Parameters:
  • cmlist (list) – a list of ClusterMapPlotter (with plot=False).

  • axis (int) – 1 for columns (align the cmlist horizontally), 0 for rows (vertically).

  • main (int) – index into cmlist of the ClusterMapPlotter to use as the main plot; it influences the row/col order.

  • row/col_gap (float) – the row or columns gap between subplots, unit is mm [15].

  • legend_side (str) – right,left [right].

  • legend_gap (float) – row gap between two legends, unit is mm.

  • legend_width (float) – default is None, will be estimated automatically

  • width_ratios (list) – a list of width, values can be either float or int.

  • height_ratios (list) – a list of height, values can be either float or int.

Returns:

ax,legend_axes

Return type:

tuple
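
A sketch of how two plotters might be combined, assuming matplotlib and PyComplexHeatmap are installed. The helper name `compose_side_by_side`, the labels and the figure size are illustrative assumptions; the keyword arguments are those listed in the signatures above.

```python
def compose_side_by_side(df_left, df_right, figsize=(10, 8)):
    """Place two heatmaps next to each other (hypothetical helper).

    Both frames are assumed to share the same row index, since the row
    order is taken from the main plotter (index 0 in ``cmlist``).
    """
    # Imported lazily so this sketch can be loaded without the plotting stack.
    import matplotlib.pyplot as plt
    from PyComplexHeatmap.clustermap import ClusterMapPlotter, composite

    # plot=False defers drawing until composite() lays the panels out.
    cm_left = ClusterMapPlotter(data=df_left, plot=False, label="left")
    cm_right = ClusterMapPlotter(data=df_right, plot=False, label="right")

    plt.figure(figsize=figsize)
    ax, legend_axes = composite(
        cmlist=[cm_left, cm_right],
        main=0,               # cm_left drives the row/col order
        axis=1,               # 1 = align the plotters horizontally
        col_gap=15,           # mm between the two panels
        legend_side="right",
    )
    return ax, legend_axes
```

Passing `axis=0` and `row_gap` instead would stack the plotters vertically, per the parameter descriptions above.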

PyComplexHeatmap.clustermap.heatmap(data, xlabel=None, ylabel=None, xlabel_side='bottom', ylabel_side='left', vmin=None, vmax=None, cmap=None, center=None, robust=False, cbar=True, cbar_kws=None, cbar_ax=None, square=False, xlabel_kws=None, ylabel_kws=None, xlabel_bbox_kws=None, ylabel_bbox_kws=None, xlabel_pad=None, ylabel_pad=None, xticklabels='auto', yticklabels='auto', xticklabels_side='bottom', yticklabels_side='left', xticklabels_kws=None, yticklabels_kws=None, mask=None, na_col='white', ax=None, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', **kwargs)[source]

Plot heatmap.

Parameters:
  • data (dataframe) – pandas dataframe

  • xlabel / ylabel – True, False, or list of xlabels

  • xlabel_side / ylabel_side – bottom or top for xlabel_side, left or right for ylabel_side

  • vmax (float) – the maximal and minimal values for cmap colorbar.

  • vmin (float) – the maximal and minimal values for cmap colorbar.

  • center – the same as seaborn.heatmap

  • robust – the same as seaborn.heatmap

  • xlabel_kws / ylabel_kws – parameter from matplotlib.axis.XAxis.label.properties()

class PyComplexHeatmap.clustermap.heatmapPlotter(data=None, vmin=None, vmax=None, cmap='bwr', center=None, robust=True, annot=None, fmt='.2g', annot_kws=None, cbar=True, cbar_kws=None, xlabel=None, ylabel=None, xticklabels=True, yticklabels=True, mask=None, na_col='white')[source]

Bases: object

plot(ax, cax, xlabel_kws, xlabel_bbox_kws, ylabel_kws, ylabel_bbox_kws, xlabel_side, ylabel_side, xlabel_pad, ylabel_pad, xticklabels_side, yticklabels_side, xticklabels_kws, yticklabels_kws, kws)[source]

Draw the heatmap on the provided Axes.

PyComplexHeatmap.clustermap.plot_heatmap(data, vmin=None, vmax=None, cmap=None, center=None, robust=False, annot=None, fmt='.2g', annot_kws=None, xticklabels=True, yticklabels=True, mask=None, na_col='white', ax=None, linewidths=0, linecolor='white', **kwargs)[source]

Plot heatmap.

Parameters:
  • data (dataframe) – pandas dataframe

  • vmax (float) – the maximal and minimal values for cmap colorbar.

  • vmin (float) – the maximal and minimal values for cmap colorbar.

  • center – the same as seaborn.heatmap

  • robust – the same as seaborn.heatmap

  • annot (bool) – whether to add annotation for values

  • fmt (str) – annotation format.

  • annot_kws (dict) – passed to ax.text

  • xticklabels (bool) – whether to show ticklabels

  • yticklabels (bool) – whether to show ticklabels
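
The default fmt='.2g' is a standard Python format specification. Assuming the annotation text is produced the way Python's built-in format() would produce it (an assumption about the internals, not something stated above), this sketch previews what cell labels would look like. The helper name `format_cell` is hypothetical.

```python
def format_cell(value, fmt=".2g"):
    """Format one heatmap value with a format spec (illustrative helper)."""
    return format(value, fmt)


# '.2g' keeps two significant digits and switches to scientific notation
# for large magnitudes.
labels = [format_cell(v) for v in (3.14159, 0.5, 1234.5)]
# labels == ['3.1', '0.5', '1.2e+03']

# An integer spec such as 'd' suits count data.
as_integers = [format_cell(v, "d") for v in (1, 20, 300)]
# as_integers == ['1', '20', '300']
```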