PyComplexHeatmap.clustermap module
- class PyComplexHeatmap.clustermap.ClusterMapPlotter(data, z_score=None, standard_scale=None, top_annotation=None, bottom_annotation=None, left_annotation=None, right_annotation=None, row_cluster=True, col_cluster=True, row_cluster_method='average', row_cluster_metric='correlation', col_cluster_method='average', col_cluster_metric='correlation', show_rownames=False, show_colnames=False, row_names_side='right', col_names_side='bottom', xticklabels_kws=None, yticklabels_kws=None, row_dendrogram=False, col_dendrogram=False, row_dendrogram_size=10, col_dendrogram_size=10, row_split=None, col_split=None, row_dendrogram_kws=None, col_dendrogram_kws=None, bezier=False, dotsize=1, tree_kws=None, row_split_order=None, col_split_order=None, row_split_gap=0.5, col_split_gap=0.2, mask=None, subplot_gap=1, legend=True, legend_kws=None, plot=True, plot_legend=True, legend_order='auto', legend_anchor='auto', legend_gap=7, legend_width=None, legend_hpad=1, legend_vpad=5, legend_side='right', cmap='jet', label=None, xlabel=None, ylabel=None, xlabel_kws=None, ylabel_kws=None, xlabel_side='bottom', ylabel_side='left', xlabel_bbox_kws=None, ylabel_bbox_kws=None, rasterized='auto', legend_delta_x=None, verbose=1, **kwargs)
Bases:
object
Clustermap (Heatmap) plotter. Plot heatmap / clustermap with annotation and legends.
- Parameters:
data (dataframe) – pandas dataframe or numpy array.
z_score (int) – whether to apply z-score scaling, either 0 for rows or 1 for columns; after scaling, the data have a mean of 0 and a variance of 1 along the chosen axis. Default is None (no scaling).
standard_scale (int) – either 0 for rows or 1 for columns; after scaling, the value range is from 0 to 1. Default is None (no scaling).
top_annotation (class HeatmapAnnotation) – annotation placed at the top of the heatmap.
bottom_annotation (class AnnotationBase) – the same as top_annotation.
left_annotation (class AnnotationBase) – the same as top_annotation.
right_annotation (class AnnotationBase) – the same as top_annotation.
row_cluster (bool) – whether to cluster rows. Setting it to False preserves the row order of the input data.
col_cluster (bool) – whether to cluster columns. Setting it to False preserves the column order of the input data.
row_cluster_method (str) – linkage method for clustering rows, such as single, complete, average, weighted, centroid, median, ward. See scipy.cluster.hierarchy.linkage (https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html) for detail.
row_cluster_metric (str) – distance metric used for pairwise distances between observations in n-dimensional space for rows, such as euclidean, minkowski, cityblock, seuclidean, cosine, correlation, hamming, jaccard, chebyshev, canberra, braycurtis, mahalanobis, kulsinski, etc. See scipy.cluster.hierarchy.linkage or scipy.spatial.distance.pdist (https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.pdist.html) for detail. Try metric='canberra' if two columns have identical values.
col_cluster_method (str) – same as row_cluster_method
col_cluster_metric (str) – same as row_cluster_metric
show_rownames (bool) – True or False (default), whether to show row ticklabels.
show_colnames (bool) – True or False (default), whether to show column ticklabels.
row_names_side (str) – right or left.
col_names_side (str) – top or bottom.
row_dendrogram (bool) – True or False, whether to show the row dendrogram.
col_dendrogram (bool) – True or False, whether to show the column dendrogram.
row_dendrogram_size (int) – size of the row dendrogram in mm, default is 10.
col_dendrogram_size (int) – size of the column dendrogram in mm, default is 10.
row_split (int or pd.Series or pd.DataFrame) – number of clusters for hierarchical clustering, or a pd.Series or pd.DataFrame, used to split rows into subplots.
col_split (int or pd.Series or pd.DataFrame) – number of clusters for hierarchical clustering, or a pd.Series or pd.DataFrame, used to split columns into subplots.
row_dendrogram_kws (dict) – custom linkage could be passed to row_dendrogram_kws, for example: row_dendrogram_kws=dict(linkage=my_linkage); Other kws passed to hierarchy.dendrogram.
col_dendrogram_kws (dict) – custom linkage could be passed to col_dendrogram_kws, for example: col_dendrogram_kws=dict(linkage=my_linkage); Other kws passed to hierarchy.dendrogram.
tree_kws (dict) – in addition to parameter colors, all other kws passed to DendrogramPlotter.plot()
row_split_order (list or str) – a list specifying the order of row_split, or 'cluster_between_groups'. If 'cluster_between_groups' is specified, hierarchical clustering is performed on the mean values of each group and the clustered order is used as row_split_order. For example, see https://dingwb.github.io/PyComplexHeatmap/build/html/notebooks/advanced_usage.html#Cluster-between-groups
col_split_order (list or str) – a list specifying the order of col_split, or 'cluster_between_groups'. If 'cluster_between_groups' is specified, hierarchical clustering is performed on the mean values of each group and the clustered order is used as col_split_order.
row_split_gap (float) – gap between row splits in mm, default is 0.5.
col_split_gap (float) – gap between column splits in mm, default is 0.2.
mask (dataframe or array) – mask the data in the heatmap; cells with missing or infinite values are masked automatically.
subplot_gap (float) – the gap between subplots, default is 1mm.
legend (bool) – True or False, whether to plot heatmap legend, determined by cmap.
legend_kws (dict) – vmax, vmin and other kws passed to plot legend, such as fontsize, labelcolor, numpoints, markerscale, markerfirst, frameon, shadow, facecolor, edgecolor, title, title_fontsize, labelspacing and so on (see ?plt.legend). Alternatively, the outline color and linewidth of the colorbar can be changed after plotting:
```
import matplotlib.colorbar

cm = ClusterMapPlotter(…)
for cbar in cm.cbars:
    # Only Colorbar objects are restyled; other entries are skipped.
    if isinstance(cbar, matplotlib.colorbar.Colorbar):
        cbar.outline.set_color('white')
        cbar.outline.set_linewidth(2)
        cbar.dividers.set_color('red')
        cbar.dividers.set_linewidth(2)
```
plot (bool) – whether to plot or not.
plot_legend (bool) – True or False, whether to plot the legend. If False, legends can be plotted later with ClusterMapPlotter.plot_legends().
legend_anchor (str) – ax_heatmap or ax, the axes to which the legend is anchored; default is 'auto'.
legend_order (str, bool or list) – controls the order of legends. Default is 'auto', which sorts legends by length. Can also be True/False or a list (or tuple); if a list or tuple is provided, its values should be the label (title) of each legend.
legend_gap (float) – the column gap between different legends.
legend_width (float [mm]) – width of the legend, default is None (inferred from the data automatically).
legend_hpad (float) – horizontal space between the heatmap and legend, default is 1 [mm].
legend_vpad (float) – vertical space between the top of legend_anchor and legend, default is 5 [mm].
legend_side (str) – right or left.
cmap (str) – default is ‘jet’, the colormap for heatmap colorbar, see plt.colormaps().
label (str) – the title (label) that will be shown in heatmap colorbar legend.
xticklabels_kws (dict) – xticklabels kws, such as axis, which, direction, length, width, color, pad, labelsize, labelcolor, colors, zorder, bottom, top, left, right, labelbottom, labeltop, labelleft, labelright, labelrotation, grid_color, grid_linestyle and so on. For more information, see ?matplotlib.axes.Axes.tick_params or ?ax.tick_params.
yticklabels_kws (dict) – the same as xticklabels_kws.
xlabel (str) – default is None (no xlabel would be shown).
ylabel (str) – default is None (no ylabel would be shown).
xlabel_kws (dict) – alpha, color, fontfamily, fontname, fontproperties, fontsize, fontstyle, fontweight, label, rasterized, rotation, rotation_mode (default, anchor), visible, zorder, verticalalignment, horizontalalignment. See ax.xaxis.label.properties() or matplotlib.axis.XAxis.label.properties() for detail, for example: `cm=ClusterMapPlotter(***); print(cm.ax.xaxis.label.properties())`
ylabel_kws (dict) – same as xlabel_kws.
xlabel_side (str) – bottom or top, default is bottom.
ylabel_side (str) – left or right, default is left
xlabel_bbox_kws (dict) – alpha, clip_box, clip_on, edgecolor, facecolor, fill, height, in_layout, label, linestyle, linewidth, rasterized, visible, width. See ax.xaxis.label.get_bbox_patch().properties() for more information, for example: `cm=ClusterMapPlotter(***); print(cm.ax.xaxis.label.get_bbox_patch().properties())`
ylabel_bbox_kws (dict) – same as xlabel_bbox_kws
rasterized (bool or str) – default is 'auto': when the number of rows or columns exceeds 5000, rasterized is automatically set to True to speed up plotting.
kwargs – kws passed to plot_heatmap, including vmin, vmax, center, robust, annot, annot_kws, fmt, mask, linewidths, linecolor, na_col, cbar, cbar_kws, xticklabels/yticklabels and so on (see ?PyComplexHeatmap.clustermap.plot_heatmap). If annot is True, the values of data are plotted on top of the heatmap; if annot is a dataframe, its custom values are plotted instead, and fmt should be set to None if the dtype of annot is str. For documentation of custom annot, see https://dingwb.github.io/PyComplexHeatmap/build/html/notebooks/advanced_usage.html#Custom-annotation xticklabels/yticklabels are shown automatically; if the width/height is too small to display all ticklabels, only some of them are shown (to avoid overlap). To force all ticklabels to be displayed, set xticklabels/yticklabels to True.
- Return type:
Class ClusterMapPlotter.
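The 'cluster_between_groups' option of row_split_order / col_split_order is described above as clustering the mean values of each group and using the resulting order. A minimal sketch of that idea with pandas and scipy follows; it is not the library's own code. The toy data, the `groups` series and the use of the euclidean metric are assumptions made for illustration (the class defaults to the correlation metric).

```python
import pandas as pd
from scipy.cluster import hierarchy

# Toy data: 6 rows x 4 columns; the values are made up for illustration.
data = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0],
     [1.1, 2.1, 2.9, 4.2],
     [9.0, 7.0, 8.0, 6.0],
     [9.2, 7.1, 8.1, 5.9],
     [1.0, 2.2, 3.1, 4.1],
     [5.0, 1.0, 6.0, 2.0]],
    index=[f"r{i}" for i in range(6)],
    columns=[f"s{j}" for j in range(4)],
)

# Hypothetical row grouping, of the kind that could be passed as row_split.
groups = pd.Series(["A", "A", "B", "B", "C", "C"], index=data.index)

# 1. Mean profile of each group (one row per group).
group_means = data.groupby(groups).mean()

# 2. Hierarchical clustering of the group means.
Z = hierarchy.linkage(group_means.values, method="average", metric="euclidean")

# 3. The leaf order of the tree gives an ordering of the groups.
leaves = hierarchy.leaves_list(Z)
group_order = group_means.index[leaves].tolist()
print(group_order)
```

A list such as `group_order` has the same shape as what row_split_order accepts when an explicit order is given.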
- static standard_scale(data2d, axis=1)
Divide the data by the difference between the max and min
- Parameters:
data2d (pandas.DataFrame) – Data to normalize
axis (int) – Which axis to normalize across. If 0, normalize across rows, if 1, normalize across columns.
- Returns:
standardized – Data rescaled to the range 0 to 1 along the specified axis.
- Return type:
pandas.DataFrame
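As a rough illustration of min-max scaling, the sketch below rescales each column of a DataFrame to the range 0 to 1. It assumes the minimum is subtracted before dividing by the difference between the max and min; `min_max_scale` is a hypothetical helper, not part of the library, and how its per-column behaviour maps onto the axis argument is not verified here.

```python
import pandas as pd


def min_max_scale(df: pd.DataFrame) -> pd.DataFrame:
    """Rescale every column of df to [0, 1] (hypothetical helper)."""
    col_min = df.min()                 # per-column minimum
    col_range = df.max() - col_min     # per-column (max - min)
    return (df - col_min) / col_range


df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 40.0]})
scaled = min_max_scale(df)
print(scaled)
```

A column with identical values has a range of 0, so this sketch would produce NaN for it; real data may need that case handled separately.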
- static z_score(data2d, axis=1)
Standardize the mean and variance of the data along the given axis.
- Parameters:
data2d (pandas.DataFrame) – Data to normalize
axis (int) – Which axis to normalize across. If 0, normalize across rows, if 1, normalize across columns.
- Returns:
normalized – Normalized data with a mean of 0 and variance of 1 across the specified axis.
- Return type:
pandas.DataFrame
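The following sketch shows z-score scaling per column with pandas: subtract the mean and divide by the standard deviation. `z_score_columns` is a hypothetical helper, and the use of the sample standard deviation (the pandas default, ddof=1) is an assumption. The result has mean 0 and variance 1 per column, but individual values are not confined to a fixed interval.

```python
import pandas as pd


def z_score_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Z-score every column of df (hypothetical helper)."""
    # df.std() uses the sample standard deviation (ddof=1) by default.
    return (df - df.mean()) / df.std()


df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 10.0], "b": [5.0, 5.5, 6.0, 6.5]})
z = z_score_columns(df)

print(z.mean())      # close to 0 for every column
print(z.std())       # close to 1 for every column
print(z["a"].max())  # larger than 1: z-scores are unbounded
```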
- class PyComplexHeatmap.clustermap.DendrogramPlotter(data=None, linkage=None, metric='correlation', method='average', sizes=None, dendrogram_kws=None)
Bases:
object
- property calculated_linkage
- plot(ax, gap_pixel=None, root_x=None, tree_kws=None, bezier=False, dotsize=1, root_dot=True, axis=1, label=False, orientation=None, invert=True)
Plots a dendrogram of the similarities between data on the axes.
- Parameters:
ax (matplotlib.axes.Axes) – Axes object upon which the dendrogram is plotted.
axis (int) – 0 for rows and 1 for columns (default).
label (bool) –
rotate – If axis==0 and one would like to plot the row dendrogram, rotate should be True.
- property reordered_ind
Indices of the matrix, reordered by the dendrogram
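To illustrate what a linkage and dendrogram-ordered indices look like, the sketch below uses scipy directly with the same method and metric as the class defaults. It is an illustration of the concept, not the class's implementation; the random data and the variable names are assumptions.

```python
import numpy as np
from scipy.cluster import hierarchy

# Made-up data: 5 observations with 8 features each.
rng = np.random.default_rng(0)
data = rng.normal(size=(5, 8))

# A linkage matrix like this could also be supplied as a precomputed linkage.
my_linkage = hierarchy.linkage(data, method="average", metric="correlation")

# no_plot=True computes the tree layout without drawing anything.
dendro = hierarchy.dendrogram(my_linkage, no_plot=True)

# 'leaves' lists the original row indices in dendrogram order.
reordered_ind = dendro["leaves"]
reordered = data[reordered_ind]
print(reordered_ind)
```

Indexing the matrix with `reordered_ind` places similar observations next to each other, which is the order a clustered heatmap displays.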
- PyComplexHeatmap.clustermap.composite(cmlist=None, main=0, ax=None, axis=1, row_gap=15, col_gap=15, legend_side='right', legend_gap=5, legend_y=0.8, legend_hpad=None, legend_width=None, width_ratios=None, height_ratios=None, verbose=1)
Assemble multiple ClusterMapPlotter objects vertically or horizontally together.
- Parameters:
cmlist (list) – a list of ClusterMapPlotter (with plot=False).
axis (int) – 1 for columns (align the cmlist horizontally), 0 for rows (vertically).
main (int) – index into cmlist of the ClusterMapPlotter to use as the main one; it influences the row/column order.
row_gap / col_gap (float) – the row or column gap between subplots, in mm; default is 15.
legend_side (str) – right or left; default is right.
legend_gap (float) – row gap between two legends, in mm.
legend_width (float) – default is None (estimated automatically).
width_ratios (list) – a list of width, values can be either float or int.
height_ratios (list) – a list of height, values can be either float or int.
- Returns:
ax, legend_axes
- Return type:
tuple
- PyComplexHeatmap.clustermap.heatmap(data, xlabel=None, ylabel=None, xlabel_side='bottom', ylabel_side='left', vmin=None, vmax=None, cmap=None, center=None, robust=False, cbar=True, cbar_kws=None, cbar_ax=None, square=False, xlabel_kws=None, ylabel_kws=None, xlabel_bbox_kws=None, ylabel_bbox_kws=None, xlabel_pad=None, ylabel_pad=None, xticklabels='auto', yticklabels='auto', xticklabels_side='bottom', yticklabels_side='left', xticklabels_kws=None, yticklabels_kws=None, mask=None, na_col='white', ax=None, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', **kwargs)
Plot heatmap.
- Parameters:
data (dataframe) – pandas dataframe
xlabel / ylabel – True, False, or list of xlabels
xlabel_side / ylabel_side – bottom or top
vmax (float) – the maximal value for the cmap colorbar.
vmin (float) – the minimal value for the cmap colorbar.
center – the same as seaborn.heatmap
robust – the same as seaborn.heatmap
xlabel_kws / ylabel_kws – parameters from matplotlib.axis.XAxis.label.properties()
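Since robust is documented as behaving the same as in seaborn.heatmap, the sketch below shows the general idea of robust color limits: taking vmin and vmax from percentiles instead of the extreme values, so that a single outlier does not stretch the color scale. The 2nd and 98th percentiles are an assumption based on seaborn's behaviour, and `robust_limits` is a hypothetical helper.

```python
import numpy as np


def robust_limits(values, low=2, high=98):
    """Percentile-based color limits that ignore NaN (hypothetical helper)."""
    vmin = np.nanpercentile(values, low)
    vmax = np.nanpercentile(values, high)
    return float(vmin), float(vmax)


# Values 0..99 plus one extreme outlier.
values = np.arange(101, dtype=float)
values[100] = 1e6

full = (float(np.nanmin(values)), float(np.nanmax(values)))
robust = robust_limits(values)
print(full)    # range dominated by the outlier
print(robust)  # range driven by the bulk of the data
```

Explicit vmin and vmax values can be passed instead when fixed limits are needed.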
- class PyComplexHeatmap.clustermap.heatmapPlotter(data=None, vmin=None, vmax=None, cmap='bwr', center=None, robust=True, annot=None, fmt='.2g', annot_kws=None, cbar=True, cbar_kws=None, xlabel=None, ylabel=None, xticklabels=True, yticklabels=True, mask=None, na_col='white')
Bases:
object
- PyComplexHeatmap.clustermap.plot_heatmap(data, vmin=None, vmax=None, cmap=None, center=None, robust=False, annot=None, fmt='.2g', annot_kws=None, xticklabels=True, yticklabels=True, mask=None, na_col='white', ax=None, linewidths=0, linecolor='white', **kwargs)
Plot heatmap.
- Parameters:
data (dataframe) – pandas dataframe
vmax (float) – the maximal value for the cmap colorbar.
vmin (float) – the minimal value for the cmap colorbar.
center – the same as seaborn.heatmap
robust – the same as seaborn.heatmap
annot (bool) – whether to add annotation for values
fmt (str) – annotation format.
annot_kws (dict) – passed to ax.text
xticklabels (bool) – whether to show ticklabels
yticklabels (bool) – whether to show ticklabels
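The default fmt='.2g' is a standard Python format specification. The sketch below shows how such a spec turns cell values into annotation text, and why string annotations need fmt=None. `format_annotations` is a hypothetical helper written for illustration, not a function of the library.

```python
def format_annotations(values, fmt=".2g"):
    """Format a 2D list of cell values as text (hypothetical helper)."""
    formatted = []
    for row in values:
        # With fmt=None the values are used unchanged, e.g. custom strings.
        formatted.append([v if fmt is None else format(v, fmt) for v in row])
    return formatted


numeric = format_annotations([[0.12345, 1234.5], [3.0, 0.5]])
print(numeric)  # [['0.12', '1.2e+03'], ['3', '0.5']]

custom = format_annotations([["up", "down"], ["ns", "up"]], fmt=None)
print(custom)   # [['up', 'down'], ['ns', 'up']]
```

Applying a numeric spec such as '.2g' to a string raises a ValueError in Python, so text annotations are passed through unformatted.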