cima.utils package

Submodules

cima.utils.matrices_comparison module

class cima.utils.matrices_comparison.SCCComputer

Bases: object

getDiagValidCounts(m1: ndarray, m2: ndarray) ndarray

Get the number of valid (finite) entries in each diagonal of the upper triangle of the input matrices m1 and m2.

Parameters:
  • m1 (np.ndarray) – First matrix.

  • m2 (np.ndarray) – Second matrix.

Returns:

An array containing the number of valid entries in each diagonal of the upper triangle of the input matrices.

Return type:

np.ndarray
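
The counting described above can be sketched in pure Python. This is only an illustration, not the cima implementation: the helper name diag_valid_counts is hypothetical, and it assumes that an entry is "valid" when it is finite in both matrices and that index 0 refers to the main diagonal.

```python
import math

def diag_valid_counts(m1, m2):
    """Count, per upper-triangle diagonal, the entries finite in both matrices."""
    n = len(m1)
    counts = [0] * n  # counts[k] belongs to the k-th diagonal (k = j - i)
    for i in range(n):
        for j in range(i, n):
            if math.isfinite(m1[i][j]) and math.isfinite(m2[i][j]):
                counts[j - i] += 1
    return counts

nan = float("nan")
a = [[1.0, 2.0, nan],
     [2.0, 1.0, 3.0],
     [nan, 3.0, 1.0]]
b = [[1.0, nan, 4.0],
     [nan, 1.0, 5.0],
     [4.0, 5.0, 1.0]]
counts = diag_valid_counts(a, b)
print(counts)
```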

sccByDiag2(m1: ndarray, m2: ndarray) float

Compute the diagonal-wise SCC score for the two input matrices m1 and m2. The SCC score for each diagonal is the Pearson correlation coefficient between the corresponding diagonals of m1 and m2. The final SCC score is a weighted average of the diagonal-wise scores, with weights that are a function of the number of valid entries in each diagonal.

Parameters:
  • m1 (np.ndarray) – First matrix.

  • m2 (np.ndarray) – Second matrix.

Returns:

Diagonal-wise SCC score.

Return type:

float
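
A rough sketch of a diagonal-wise weighted correlation follows. It is not the cima code. The weight n * (1 + 1/n) / 12 (number of valid entries times the rank variance documented for varVstran), the choice to start at diagonal 1, and the rule of skipping diagonals with fewer than 3 valid pairs or constant values are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def scc_by_diag_sketch(m1, m2):
    """Weighted average of per-diagonal Pearson correlations (assumed weights)."""
    n = len(m1)
    num = den = 0.0
    for k in range(1, n):
        pairs = [(m1[i][i + k], m2[i][i + k]) for i in range(n - k)
                 if math.isfinite(m1[i][i + k]) and math.isfinite(m2[i][i + k])]
        if len(pairs) < 3:
            continue  # too few valid entries for a meaningful correlation
        x, y = zip(*pairs)
        if len(set(x)) < 2 or len(set(y)) < 2:
            continue  # constant diagonal: correlation undefined
        weight = len(pairs) * (1 + 1.0 / len(pairs)) / 12
        num += weight * pearson(x, y)
        den += weight
    return num / den if den else float("nan")

size = 5
m1 = [[float(i * j + i) for j in range(size)] for i in range(size)]
m2 = [[2 * v + 1 for v in row] for row in m1]  # exact linear function of m1
score = scc_by_diag_sketch(m1, m2)
print(score)
```

Because m2 is a linear function of m1, every usable diagonal correlates perfectly and the weighted average is 1 up to rounding.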

sccByDiagOriginal(m1pre: coo_matrix, m2pre: coo_matrix, nDiags: int = 0) float

Compute the diagonal-wise hicrep SCC score for the two input matrices, up to nDiags diagonals.

Parameters:
  • m1pre (sp.coo_matrix) – Input contact matrix 1

  • m2pre (sp.coo_matrix) – Input contact matrix 2

  • nDiags (int, optional) – Compute SCC scores for diagonals whose index is in the range of [1, nDiags). If nDiags is 0, compute for all diagonals (default is 0)

Returns:

Hicrep SCC scores

Return type:

float

upperDiagCsr(m: coo_matrix, nDiags: int)

Convert an input sp.coo_matrix into a sp.csr_matrix where each row in the output corresponds to one diagonal of the upper triangle of the input.

Parameters:
  • m (sp.coo_matrix) – Input matrix

  • nDiags (int) – Output diagonals with index in the range [1, nDiags) as rows of the output matrix

Returns:

A matrix whose rows are the diagonals of the input

Return type:

sp.csr_matrix

varVstran(n: int | ndarray) int | ndarray

Calculate the variance of variance-stabilizing transformed (or vstran() in the original R implementation) data. The vstran() turns the input data into ranks, whose variance is only a function of the input size:

` var(1/n, 2/n, ..., n/n) = (1 - 1/(n^2))/12 `

or with Bessel’s correction:

` var(1/n, 2/n, ..., n/n, ddof=1) = (1 + 1.0/n)/12 `

See section “Variance stabilized weights” in reference for more detail: https://genome.cshlp.org/content/early/2017/10/06/gr.220640.117

Parameters:

n (int or np.ndarray) – Size of the input data

Returns:

Variance of the ranked input data with Bessel’s correction

Return type:

Union[int, np.ndarray]
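
The two closed forms above can be checked numerically with the standard library. The function name var_vstran_sketch is hypothetical and only mirrors the Bessel-corrected formula quoted in the description.

```python
from statistics import pvariance, variance

def var_vstran_sketch(n):
    """Closed-form variance of the ranks 1/n, ..., n/n with Bessel's correction."""
    return (1 + 1.0 / n) / 12

n = 8
ranks = [k / n for k in range(1, n + 1)]
population_var = pvariance(ranks)  # equivalent to ddof=0
sample_var = variance(ranks)       # equivalent to ddof=1

print(population_var, (1 - 1 / n**2) / 12)
print(sample_var, var_vstran_sketch(n))
```

Both printed pairs agree, which shows that the variance depends only on n and not on the data that produced the ranks.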

cima.utils.matrices_comparison.compute_rmsd(A: ndarray, B: ndarray) float

Computes the Root Mean Square Deviation (RMSD) between two 2D matrices, accounting for NaN entries.

Parameters:
  • A (np.ndarray) – First matrix.

  • B (np.ndarray) – Second matrix.

Returns:

The RMSD value.

Return type:

float
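
A minimal sketch of an RMSD that skips NaN entries is shown below. It assumes that "accounting for NaN entries" means ignoring every position where either matrix is NaN; the function name rmsd_ignoring_nan is hypothetical and the cima implementation may differ.

```python
import math

def rmsd_ignoring_nan(a, b):
    """RMSD over the positions where both matrices hold a number."""
    squared = [
        (x - y) ** 2
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
        if not (math.isnan(x) or math.isnan(y))
    ]
    if not squared:
        return float("nan")  # no comparable entries
    return math.sqrt(sum(squared) / len(squared))

nan = float("nan")
A = [[0.0, 0.0], [nan, 1.0]]
B = [[1.0, 7.0], [2.0, nan]]
value = rmsd_ignoring_nan(A, B)  # only the two top-row entries are compared
print(value)
```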

cima.utils.matrices_comparison.getLogLinearRegression(vals1: ndarray, vals2: ndarray) tuple[float, float]

Get the linear regression parameters (intercept and slope) for the log10 transformed input values. The linear regression is only performed on the values where both log10 transformed values are finite.

Parameters:
  • vals1 (array-like) – First set of values.

  • vals2 (array-like) – Second set of values.

Returns:

A tuple containing the intercept and slope of the linear regression.

Return type:

tuple
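
The regression on log10-transformed values can be sketched with ordinary least squares. This is an illustration under assumptions: log10(vals2) is regressed on log10(vals1), and pairs with a non-positive, NaN, or infinite value are dropped because their logarithm is not finite. The function name is hypothetical.

```python
import math

def log_linear_regression_sketch(vals1, vals2):
    """Return (intercept, slope) of log10(vals2) against log10(vals1)."""
    points = [
        (math.log10(x), math.log10(y))
        for x, y in zip(vals1, vals2)
        if x > 0 and y > 0 and math.isfinite(x) and math.isfinite(y)
    ]
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    slope = (sum((px - mean_x) * (py - mean_y) for px, py in points)
             / sum((px - mean_x) ** 2 for px, _ in points))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# y = 5 * x**2 on the usable pairs; the pairs with 0 and NaN are filtered out
vals1 = [1.0, 10.0, 100.0, 0.0, 1000.0]
vals2 = [5.0, 500.0, 50000.0, 7.0, float("nan")]
intercept, slope = log_linear_regression_sketch(vals1, vals2)
print(intercept, slope)
```

In log space the power law becomes a straight line, so the slope recovers the exponent (2) and the intercept recovers log10 of the coefficient (log10(5)).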

cima.utils.matrices_comparison.makeAxesLabelsLog()

Make the axes labels in log scale with base 10, with the format of $10^{x}$.

cima.utils.matrices_comparison.pearson_corr(m1: ndarray, m2: ndarray) float

Compute the Pearson correlation coefficient between two matrices m1 and m2, only considering the upper triangle of the matrices and ignoring NaN values.

Parameters:
  • m1 (np.ndarray) – First matrix.

  • m2 (np.ndarray) – Second matrix.

Returns:

The Pearson correlation coefficient between the two matrices.

Return type:

float

cima.utils.matrices_comparison.plotLine(intercept: float, slope: float, xlim0: float, xlim1: float, num: int = 10, **kwargs)

Plot a line with the given intercept and slope within the specified x limits.

Parameters:
  • intercept (float) – The intercept of the line.

  • slope (float) – The slope of the line.

  • xlim0 (float) – The lower limit of the x-axis.

  • xlim1 (float) – The upper limit of the x-axis.

  • num (int, optional) – The number of points to generate for plotting the line (default is 10).

  • **kwargs (dict) – Additional keyword arguments to pass to the plt.plot() function.

cima.utils.matrices_comparison.plotLogWithKdeColors(arr1: ndarray, arr2: ndarray, log: bool = True, **kwargs) tuple[float, float]

Plot a scatter plot of the two input arrays with log10 transformation and color the points by their density estimated by KDE. The linear regression line is also plotted on the scatter plot.

Parameters:
  • arr1 (array-like) – First set of values.

  • arr2 (array-like) – Second set of values.

  • log (bool, optional) – Whether to apply log10 transformation (default is True).

  • **kwargs (dict) – Additional keyword arguments to pass to the plt.plot() function.

Returns:

A tuple containing the Pearson correlation coefficient and the slope of the linear regression.

Return type:

tuple

cima.utils.matrices_comparison.scc(m1: ndarray, m2: ndarray) float

Compute the stratum adjusted correlation coefficient (SCC) between two matrices m1 and m2.

Parameters:
  • m1 (np.ndarray) – First matrix.

  • m2 (np.ndarray) – Second matrix.

Returns:

The SCC value between the two matrices.

Return type:

float

cima.utils.matrices_comparison.spearman_corr(m1: ndarray, m2: ndarray) float

Compute the Spearman correlation coefficient between two matrices m1 and m2, only considering the upper triangle of the matrices and ignoring NaN values.

Parameters:
  • m1 (np.ndarray) – First matrix.

  • m2 (np.ndarray) – Second matrix.

Returns:

The Spearman correlation coefficient between the two matrices.

Return type:

float

cima.utils.misc module

cima.utils.misc.calculate_rmsd_matrix(matrices: list[ndarray]) ndarray

Computes a matrix of RMSD values between a list of 2D matrices.

Parameters:

matrices (list[np.ndarray]) – List of 2D matrices.

Returns:

A 2D array where the entry at (i, j) is the RMSD between matrices[i] and matrices[j].

Return type:

np.ndarray

cima.utils.misc.compute_rmsd(A: ndarray, B: ndarray) float

Computes the Root Mean Square Deviation (RMSD) between two 2D matrices, accounting for NaN entries.

Parameters:
  • A (np.ndarray) – First matrix.

  • B (np.ndarray) – Second matrix.

Returns:

The RMSD value.

Return type:

float

cima.utils.misc.find_column_names(file_path)

Find the column names in a file.

Parameters:

file_path (str) – The path to the file to search.

Returns:

A list of column names found in the file.

Return type:

list

cima.utils.misc.fof_bs_ct_reader(fof_core_file: Path, fof_trace_file: Path) list

Convert ball and stick FOF_CT files to CIMA format. The function does not write any file; it only returns a list of CIMA objects.

Parameters:
  • fof_core_file (Path) – Path to the ball and stick FOF_CT core file.

  • fof_trace_file (Path) – Path to the ball and stick FOF_CT trace file.

Returns:

List of CIMA objects created from the FOF_CT files.

Return type:

list

cima.utils.misc.fof_volumetric_ct_reader(fof_core_file: Path, fof_trace_file: Path, fof_spot_file: Path) list

Convert volumetric FOF_CT files to CIMA format. The function does not write any file; it only returns a list of CIMA objects.

Parameters:
  • fof_core_file (Path) – Path to the volumetric FOF_CT core file.

  • fof_trace_file (Path) – Path to the volumetric FOF_CT trace file.

  • fof_spot_file (Path) – Path to the volumetric FOF_CT spot file.

Returns:

List of CIMA objects created from the FOF_CT files.

Return type:

list

cima.utils.misc.fof_volumetric_ct_writer(table_folder: Path, data2write: list, fof_metadata: dict, reg_dict: dict, fluor: dict = {}, cima_metadata: dict = {'expID': 'experimentID', 'homolog': 'homolog', 'location': 'locationID', 'nucleus': 'nucleusID'}, file_suffix: str = '', volumetric_format: bool = True) None

Write FOF_CT files. It will write all the CIMA objects of the list to a single set of FOF_CT files.

Parameters:
  • table_folder (Path) – Path to the folder where the FOF_CT files will be written.

  • data2write (list) – List of CIMA objects to write to the FOF_CT files.

  • fof_metadata (dict) – Metadata to include in the FOF_CT files.

  • reg_dict (dict) – Dictionary of regions to include in the FOF_CT files. The keys are the timepoints (as in the CSV), and the values are tuples with the chromosome, start, and end positions.

  • fluor (dict) – Dictionary to link the fluorophore that was used at every timepoint. Example {1: “Alexa488”, 2: “Cy3”, 3: “Cy5”}.

  • cima_metadata (dict) – Metadata to include in the FOF_CT files, gathered from the CIMA objects. This is a translation dictionary to get the metadata from the CIMA objects and put it in the FOF_CT files. The keys are the names of the metadata in the FOF_CT files, and the values are the names of the metadata in the CIMA objects.

  • file_suffix (str) – Suffix to add to the FOF_CT file names. This is useful for telling apart several sets of FOF_CT files.

  • volumetric_format (bool) – If True, the function will write the FOF_CT files in the volumetric format, where each spot is associated with a region. If False, the function will write the FOF_CT files in the ball and stick format, where each spot is associated with a trace. The default is True.
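
The dictionary arguments described above might look as follows. The regions and timepoints are made-up illustrative values (the chromosome is assumed to be given as a string); fluor reuses the example from the parameter description and cima_metadata is the default from the signature. No cima function is called here.

```python
# Timepoint -> (chromosome, start, end). Values are illustrative only.
reg_dict = {
    1: ("chr1", 1_000_000, 1_050_000),
    2: ("chr1", 1_050_000, 1_100_000),
    3: ("chr1", 1_100_000, 1_150_000),
}

# Timepoint -> fluorophore used at that timepoint.
fluor = {1: "Alexa488", 2: "Cy3", 3: "Cy5"}

# FOF_CT metadata name -> metadata name in the CIMA objects (signature default).
cima_metadata = {
    "expID": "experimentID",
    "homolog": "homolog",
    "location": "locationID",
    "nucleus": "nucleusID",
}

for timepoint, (chrom, start, end) in sorted(reg_dict.items()):
    print(timepoint, chrom, start, end, fluor[timepoint])
```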

cima.utils.misc.get_dtype_dict(columns_list: list, dict_dtype: dict) dict

Function to get a dictionary to facilitate the change of data types of pandas columns. This results in saving memory overall.

Parameters:
  • columns_list (list) – List of the columns in the DataFrame (you can use df_name.columns)

  • dict_dtype (dict) – Dictionary with the desired data types for specific columns.

Returns:

Dictionary with the desired data types for specific columns.

Return type:

dict
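
One plausible reading of this helper is that it keeps only the requested dtypes whose columns are actually present. The sketch below illustrates that reading; it is an assumption about the behaviour, not the cima code, and the function name and column names are hypothetical.

```python
def dtype_dict_sketch(columns_list, dict_dtype):
    """Keep the dtype entries whose column exists in columns_list."""
    return {col: dict_dtype[col] for col in columns_list if col in dict_dtype}

columns = ["x", "y", "z", "clusterID"]
wanted = {"x": "float32", "y": "float32", "intensity": "float32", "clusterID": "int16"}
result = dtype_dict_sketch(columns, wanted)
print(result)
```

The resulting dictionary has the shape accepted by pandas' DataFrame.astype, which is where the memory saving would come from.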

cima.utils.stats_function module

cima.utils.stats_function.Create_Random_Hom_Ratio(featuresel_list: list, seed: int = 42, samplesize: int = 1000) list | None

Create a list of random ratios between the values of two randomly selected features from the input list.

Parameters:
  • featuresel_list (list) – A list of features, where each feature is a tuple containing a name and a value.

  • seed (int, optional) – The seed for the random number generator (default is 42).

  • samplesize (int, optional) – The number of random ratios to generate (default is 1000).

Returns:

A list of random ratios between the values of two randomly selected features. If the number of features is greater than the number of generated ratios, a message is printed and None is returned.

Return type:

list | None

cima.utils.stats_function.Create_Random_Hom_Ratio_Paternal_Origin(listhomfeature: list, seed1: int = 42, seed2: int = 24, samplesize: int = 100, parentalnorm: bool = False) list | None
cima.utils.stats_function.Create_Random_Hom_Ratio_largeroversmaller(featuresel_list: list, seed: int = 42, samplesize: int = 1000)

Create a list of random ratios between the values of two randomly selected features from the input list, where the larger value is divided by the smaller value.

Parameters:
  • featuresel_list (list) – A list of features, where each feature is a tuple containing a name and a value.

  • seed (int, optional) – The seed for the random number generator (default is 42).

  • samplesize (int, optional) – The number of random ratios to generate (default is 1000).

Returns:

A list of random ratios between the values of two randomly selected features, where the larger value is divided by the smaller value. If the number of features is greater than the number of generated ratios, a message is printed and None is returned.

Return type:

list
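
The sampling idea can be sketched as below. It assumes that the two features of each pair are distinct and that all values are positive, and it leaves out the check that returns None. The function name and the feature names are hypothetical.

```python
import random

def random_ratio_larger_over_smaller(features, seed=42, samplesize=1000):
    """Draw pairs of distinct features and divide the larger value by the smaller."""
    rng = random.Random(seed)  # local generator keeps the result reproducible
    ratios = []
    for _ in range(samplesize):
        (_name1, value1), (_name2, value2) = rng.sample(features, 2)
        ratios.append(max(value1, value2) / min(value1, value2))
    return ratios

features = [("locusA", 2.0), ("locusB", 4.0), ("locusC", 8.0)]
ratios = random_ratio_larger_over_smaller(features, samplesize=5)
print(ratios)
```

Every ratio is at least 1 by construction, and calling the function again with the same seed returns the same list.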

cima.utils.stats_function.Create_Random_Hom_Ratio_smalloverlarger(featuresel_list, seed: int = 1, samplesize: int = 1000) list | None

Create a list of random ratios between the values of two randomly selected features from the input list, where the smaller value is divided by the larger value.

Parameters:
  • featuresel_list (list) – A list of features, where each feature is a tuple containing a name and a value.

  • seed (int, optional) – The seed for the random number generator (default is 1).

  • samplesize (int, optional) – The number of random ratios to generate (default is 1000).

Returns:

A list of random ratios between the values of two randomly selected features, where the smaller value is divided by the larger value. If the number of features is greater than the number of generated ratios, a message is printed and None is returned.

Return type:

list | None

cima.utils.stats_function.func(x, a, b, c, d)
cima.utils.stats_function.func2(x: float, a: float, b: float, c: float) float
cima.utils.stats_function.func_inverse1(x, m, c, c0)
cima.utils.stats_function.func_powerlaw(x: float, a: float, b: float, c0: float) float

Power law function with an offset.

Parameters:
  • x (float) – The input value.

  • a (float) – The coefficient.

  • b (float) – The exponent.

  • c0 (float) – The offset.

Returns:

The result of the power law function with an offset.

Return type:

float
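
Assuming the usual form a * x**b + c0 (the exact expression is not stated above), the function can be sketched as:

```python
def powerlaw_with_offset(x, a, b, c0):
    """Assumed form of a power law with an additive offset."""
    return a * x**b + c0

y = powerlaw_with_offset(2.0, a=3.0, b=2.0, c0=1.0)
print(y)
```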

cima.utils.stats_function.func_sqrt(x: float, a: float, b: float, c0: float) float

Root function with an offset.

Parameters:
  • x (float) – The input value.

  • a (float) – The coefficient.

  • b (float) – The exponent of the root.

  • c0 (float) – The offset.

Returns:

The result of the root function with an offset.

Return type:

float

cima.utils.stats_function.getLogLinearRegression(vals1: ndarray, vals2: ndarray) tuple[float, float]

Get the linear regression parameters (intercept and slope) for the log10 transformed input values. The linear regression is only performed on the values where both log10 transformed values are finite.

Parameters:
  • vals1 (array-like) – First set of values.

  • vals2 (array-like) – Second set of values.

Returns:

A tuple containing the intercept and slope of the linear regression.

Return type:

tuple

cima.utils.stats_function.makeAxesLabelsLog()

Make the axes labels in log scale with base 10, with the format of $10^{x}$.

cima.utils.stats_function.plotLine(intercept: float, slope: float, xlim0: float, xlim1: float, num: int = 10, **kwargs)

Plot a line with the given intercept and slope within the specified x limits.

Parameters:
  • intercept (float) – The intercept of the line.

  • slope (float) – The slope of the line.

  • xlim0 (float) – The lower limit of the x-axis.

  • xlim1 (float) – The upper limit of the x-axis.

  • num (int, optional) – The number of points to generate for plotting the line (default is 10).

  • **kwargs (dict) – Additional keyword arguments to pass to the plt.plot() function.

cima.utils.stats_function.plotLogWithKdeColors(arr1: ndarray, arr2: ndarray, log: bool = True, **kwargs)

Plot a scatter plot of the two input arrays with log10 transformation and color the points by their density estimated by KDE. The linear regression line is also plotted on the scatter plot.

Parameters:
  • arr1 (array-like) – First set of values.

  • arr2 (array-like) – Second set of values.

  • log (bool, optional) – Whether to apply log10 transformation (default is True).

  • **kwargs (dict) – Additional keyword arguments to pass to the plt.plot() function.

Returns:

A tuple containing the Pearson correlation coefficient and the slope of the linear regression.

Return type:

tuple

cima.utils.stats_function.power_law(x: float, a: float, b: float) float

Power law function.

Parameters:
  • x (float) – The input value.

  • a (float) – The coefficient.

  • b (float) – The exponent.

Returns:

The result of the power law function.

Return type:

float

cima.utils.stats_function.statConvert(s: float) str

Convert a decimal p-value into stars. The thresholds are as follows:
  • p <= 0.0001: ****

  • p <= 0.001: ***

  • p <= 0.01: **

  • p <= 0.05: *

  • p > 0.05: the p-value in scientific notation with 2 decimal places

Parameters:

s (float) – The p-value to convert.

Returns:

The corresponding star representation or the p-value in scientific notation.

Return type:

str
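
The documented thresholds translate into a short chain of comparisons. The sketch below follows them literally; the function name is hypothetical and the ".2e" format is one way to obtain scientific notation with 2 decimal places.

```python
def stat_convert_sketch(p):
    """Map a p-value to stars, or to scientific notation above 0.05."""
    if p <= 0.0001:
        return "****"
    if p <= 0.001:
        return "***"
    if p <= 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return f"{p:.2e}"

for p in (0.00005, 0.0005, 0.005, 0.03, 0.2):
    print(p, stat_convert_sketch(p))
```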

cima.utils.stats_function.welch_ttest(x, y)

cima.utils.vector module

class cima.utils.vector.Vector(x: float, y: float, z: float)

Bases: object

A class representing Cartesian 3-dimensional vectors.

arg(vector: Vector) float

Return the angle between this vector and another vector in radians.

Parameters:

vector (Vector) – The vector to calculate the angle to.

Returns:

The angle between the two vectors in radians.

Return type:

float
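
The angle between two vectors follows from the dot product, cos(theta) = (u · v) / (|u| |v|). The sketch below uses plain tuples instead of Vector instances and a hypothetical function name; it is not the cima implementation.

```python
import math

def angle_between(u, v):
    """Angle in radians between two 3D vectors given as (x, y, z) tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine)

right_angle = angle_between((1.0, 0.0, 0.0), (0.0, 2.0, 0.0))
print(right_angle)
```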

copy() Vector
Return type:

A copy of the Vector instance

cross(vector: Vector) Vector

Calculate the cross product of this and another vector specified as input parameter.

Parameters:

vector (Vector) – The vector to calculate the cross product with.

Return type:

A Vector instance of the cross product of this and another vector specified as input parameter

dist(vector: Vector) float

Return the distance between this and another vector.

Parameters:

vector (Vector) – The vector to calculate the distance to.

Returns:

The distance between this and another vector.

Return type:

float

dot(vector: Vector) float

Calculate the dot product of this and another vector specified as input parameter.

Parameters:

vector (Vector) – The vector to calculate the dot product with.

Returns:

The result of the dot product.

Return type:

float
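
The cross product, distance, and dot product documented above reduce to a few lines of arithmetic. The sketch uses plain tuples and hypothetical free functions rather than the Vector class itself.

```python
import math

def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two 3D vectors."""
    return sum(a * b for a, b in zip(u, v))

def dist(u, v):
    """Euclidean distance between the end points of two vectors."""
    return math.dist(u, v)

x_axis, y_axis = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(cross(x_axis, y_axis))                   # perpendicular to both inputs
print(dot((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)))
print(dist((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
```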

matrix_transform(rot_mat)

Transform the vector using a transformation matrix.

Parameters:

rot_mat (matrix) – A 3x3 Python matrix instance.

Returns:

A Vector instance representing the transformed vector.

Return type:

Vector

mod() float
Returns:

The modulus (length) of the vector.

Return type:

float

reverse() Vector

Flip the direction of a Vector instance.

Return type:

A Vector instance.

times(factor: float) Vector

Multiplies a Vector instance by a scalar factor.

Parameters:

factor (float) – The scalar factor to multiply the vector by.

Returns:

A Vector instance

Return type:

Vector

to_atom()

Create an Atom instance based on Vector instance.

Returns:

An Atom instance based on the Vector instance.

Return type:

BioPyAtom

translate(x: float, y: float, z: float) Vector

Translate a Vector instance.

Parameters:
  • x (float) – Distance in Angstroms to translate the vector in the x direction.

  • y (float) – Distance in Angstroms to translate the vector in the y direction.

  • z (float) – Distance in Angstroms to translate the vector in the z direction.

Returns:

A Vector instance representing the translated vector.

Return type:

Vector

unit() Vector

Returns a Vector instance of a unit vector.

cima.utils.vector.align_2seqs(seq1, seq2)
cima.utils.vector.altTorsion(a: Vector, b: Vector, c: Vector) float

An alternate and better way to find the torsion angle between planes ab and bc.

Parameters:
  • a (Vector) – Vector instances.

  • b (Vector) – Vector instances.

  • c (Vector) – Vector instances.

Return type:

The torsion angle in radians

cima.utils.vector.axis_angle_to_euler(x: float, y: float, z: float, turn: float, rad: bool = False) tuple

Converts the axis angle rotation to an Euler form.

Parameters:
  • x (float) – axis of rotation (does not need to be normalised).

  • y (float) – axis of rotation (does not need to be normalised).

  • z (float) – axis of rotation (does not need to be normalised).

  • turn (float) – angle of rotation, in radians if rad=True, else in degrees.

  • rad (bool) – if True, the angle is in radians, otherwise in degrees.

Returns:

A 3-tuple (x,y,z) containing the Euler angles.

Return type:

tuple

cima.utils.vector.axis_angle_to_matrix(x: float, y: float, z: float, turn: float, rad: bool = False) ndarray

Converts the axis angle rotation to a matrix form.

Parameters:
  • x – axis of rotation (does not need to be normalised).

  • y – axis of rotation (does not need to be normalised).

  • z – axis of rotation (does not need to be normalised).

  • turn (float) – angle of rotation.

  • rad (bool) – if True, the angle is in radians, otherwise in degrees.

Return type:

A 3X3 transformation matrix.
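
An axis-angle rotation can be turned into a matrix with Rodrigues' rotation formula. The sketch below assumes a right-handed, active rotation and returns nested lists instead of an ndarray; the convention used by cima is not stated above and may differ. The function name is hypothetical.

```python
import math

def axis_angle_to_matrix_sketch(x, y, z, turn, rad=False):
    """Rotation matrix for a rotation of `turn` about the axis (x, y, z)."""
    if not rad:
        turn = math.radians(turn)
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm  # the axis need not be normalised
    c, s = math.cos(turn), math.sin(turn)
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]

# 90 degrees about the z axis maps the x axis onto the y axis.
matrix = axis_angle_to_matrix_sketch(0.0, 0.0, 2.0, 90.0)
for row in matrix:
    print([round(value, 6) for value in row])
```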

cima.utils.vector.calcMtrx(arr: list) list[list]

Calculate 3 x 4 transformation matrix from Euler angles and offset.

Parameters:

arr (list) – [psi,theta,phi,offsetx,offsety,offsetz].

Returns:

3 x 4 transformation matrix

Return type:

list of lists

cima.utils.vector.cps(mat_1, mat_2)

Find rotation and translation difference between two transformations.

Parameters:
  • mat_1 – Transformation matrices.

  • mat_2 – Transformation matrices.

Returns:

The translation and rotation differences

cima.utils.vector.euler_to_matrix(x_turn: float, y_turn: float, z_turn: float, rad: bool = False) ndarray

Converts an Euler rotation to a matrix form.

Parameters:
  • x_turn (float) – Rotation angles around respective axis.

  • y_turn (float) – Rotation angles around respective axis.

  • z_turn (float) – Rotation angles around respective axis.

  • rad (bool) – if True, the angles are in radians, otherwise in degrees.

Return type:

A 3X3 transformation matrix.

cima.utils.vector.random_vector(min_v: float, max_v: float) Vector

Generate a random vector. The values for the vector component x, y, and z are randomly sampled between minimum and maximum values specified.

Parameters:
  • min_v (float) – Minimum and maximum values for the vector components.

  • max_v (float) – Minimum and maximum values for the vector components.

Returns:

A Vector instance with randomly generated components.

Return type:

Vector

cima.utils.vector.random_vector2(ul_list)

Generate a random vector. The values for the vector component x, y, and z are randomly sampled between minimum and maximum values specified in the list.

Parameters:

ul_list (list) – A list containing the minimum and maximum values for the vector components.

Returns:

A vector with randomly generated components.

Return type:

Vector

cima.utils.vector.torsion(a: Vector, b: Vector, c: Vector) float

Find the torsion angle between planes ab and bc.

Parameters:
  • a (Vector) – Vector instances.

  • b (Vector) – Vector instances.

  • c (Vector) – Vector instances.

Returns:

The torsion angle in radians

Return type:

float
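
A common way to compute a torsion (dihedral) angle from three consecutive vectors is the atan2 form shown below, which measures the signed angle between the planes spanned by (a, b) and (b, c). The sign convention and the use of plain tuples are assumptions; the cima function may use a different formula or convention.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def torsion_sketch(a, b, c):
    """Signed angle in radians between the planes (a, b) and (b, c)."""
    n1 = cross(a, b)  # normal of the first plane
    n2 = cross(b, c)  # normal of the second plane
    b_length = math.sqrt(dot(b, b))
    return math.atan2(b_length * dot(a, n2), dot(n1, n2))

perpendicular = torsion_sketch((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
coplanar = torsion_sketch((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0))
print(perpendicular, coplanar)
```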

cima.utils.visualization module

cima.utils.visualization.addLabelsOnPlot(ax, xs, ys, names)
cima.utils.visualization.getCoM(points)

Returns the center of mass of the points
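
For equally weighted points, the center of mass is the mean of the coordinates along each axis. The sketch below assumes unweighted 3D points given as tuples; the function name is hypothetical.

```python
def center_of_mass(points):
    """Mean x, y, z of a non-empty sequence of 3D points."""
    n = len(points)
    return tuple(sum(p[axis] for p in points) / n for axis in range(3))

com = center_of_mass([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
print(com)
```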

cima.utils.visualization.getCustomColormap()

Return a colormap based on gist_ncar from which the upper 5% has been removed

cima.utils.visualization.plot3DConvexHull(seg: Segment, scalar: float = 0, smooth_shading: bool = True, plotter: Plotter | None = None) Plotter

Uses a pyvista.Plotter object to plot the polygon obtained from seg via convex hull, then returns the plotter

Parameters:
  • seg (Segment) – SegmentXYZ object to plot

  • scalar (float, optional) – Scalar assigned to the object. This allows to color different objects with different colors, according to cmap ‘viridis’ (default is 0)

  • smooth_shading (bool, optional) – Whether to smooth the surface of the shown object (default is True)

  • plotter (pyvista.Plotter, optional) – Pyvista.Plotter object to use for plotting. If None a new one will be created and returned (default is None)

Returns:

plotter – Pyvista.Plotter object on which the object was plotted

Return type:

pyvista.Plotter

cima.utils.visualization.plot3DMapAsCubes(m: Map, threshold: float = 0.5, resolution_multiplier: float = 0.5, plotter: Plotter | None = None, color=None) Plotter

Uses a pyvista.Plotter object to plot the set of voxels with value above threshold as cubes, then returns the plotter

Parameters:
  • m (Map) – Map object to plot

  • threshold (float, optional) – Density threshold used to define the contours of the map (default is 0.5)

  • resolution_multiplier (float, optional) – Multiplied by map resolution to define the side of the cubes. Values smaller than 1 make the object partially transparent allowing the user to see its inside volume (default is 0.5)

  • plotter (pyvista.Plotter, optional) – Pyvista.Plotter object to use for plotting. If None a new one will be created and returned (default is None)

  • color (str or tuple, optional) – Color to use for the cubes. If None, the colormap ‘viridis’ will be used (default is None)

Returns:

plotter – Pyvista.Plotter object on which the object was plotted

Return type:

pyvista.Plotter

cima.utils.visualization.plot3DMapAsPoints(m: Map, threshold: float = 0.5, scalar: float = 0, plotter: Plotter | None = None) Plotter

Uses a pyvista.Plotter object to plot the centers of voxels with value above the threshold, then returns the plotter.

Parameters:
  • m (Map) – Map object to plot

  • threshold (float, optional) – Density threshold used to define the contours of the map (default is 0.5)

  • scalar (float, optional) – Scalar assigned to the object. This allows to color different objects with different colors, according to cmap ‘viridis’ (default is 0)

  • plotter (pyvista.Plotter, optional) – Pyvista.Plotter object to use for plotting. If None, a new one will be created and returned (default is None)

Returns:

plotter – Pyvista.Plotter object on which the object was plotted

Return type:

pyvista.Plotter

cima.utils.visualization.plot3DMapMarchingCubes(m: Map, threshold: float = 0.5, scalar: float = 0, smooth_shading: bool = True, cmap2use: str = 'viridis', plotter: Plotter | None = None, **kwargs) Plotter

Uses a pyvista.Plotter object to plot the polygon obtained from m via marching cubes, then returns the plotter

Parameters:
  • m (Map) – Map object to plot

  • threshold (float, optional) – Density threshold used to define the contours of the map (default is 0.5)

  • scalar (float, optional) – Scalar assigned to the object. This allows to color different objects with different colors, according to the specified cmap (default is 0)

  • smooth_shading (bool, optional) – Whether to smooth the surface of the shown object (default is True)

  • cmap2use (str, optional) – The colormap used to color the Map (default is ‘viridis’)

  • plotter (pyvista.Plotter, optional) – Pyvista.Plotter object to use for plotting. If None a new one will be created and returned (default is None)

Returns:

plotter – Pyvista.Plotter object on which the object was plotted

Return type:

pyvista.Plotter

cima.utils.visualization.plot3DMultipleMapsMarchingCubes(m: Map, threshold: float = 0.5, scalars: list[float] | None = None, smooth_shading: bool = True, cmap2use: str = 'viridis', plotter: Plotter | None = None, labels: list[str] | None = None, labels_font: int = 20) Plotter

Uses a pyvista.Plotter object to plot the polygon obtained from m via marching cubes, then returns the plotter

Parameters:
  • m (Map) – Map object to plot

  • threshold (float, optional) – Density threshold used to define the contours of the map (default is 0.5)

  • scalars (list of float, optional) – Scalars assigned to the objects. This allows coloring different objects with different colors, according to the specified cmap (default is None)

  • smooth_shading (bool, optional) – Whether to smooth the surface of the shown object (default is True)

  • cmap2use (str, optional) – The colormap used to color the Map (default is ‘viridis’)

  • plotter (pyvista.Plotter, optional) – Pyvista.Plotter object to use for plotting. If None a new one will be created and returned (default is None)

  • labels (list of str, optional) – Labels to add to the plot (default is None)

  • labels_font (int, optional) – Font size for the labels (default is 20)

Returns:

plotter – Pyvista.Plotter object on which the object was plotted

Return type:

pyvista.Plotter

cima.utils.visualization.plotClustering2DProjections(coords: ndarray, labels: ndarray | None = None, background_coords: ndarray | None = None, show_noise: bool = False, annot: bool = False, cmap2use: str = 'gist_ncar', point_size_front: float = 0.1, point_size_back: float = 0.1, colorbar: bool = False, legend: bool = False, alpha: float = 0.1) Figure

Creates 3 scatter plots representing the 2d projections.

Parameters:
  • coords (np.ndarray) – 3d coordinates of the points

  • labels (np.ndarray | None, optional) – labels by which to color points. Those with label=-1 are considered noise

  • show_noise (bool, optional) – if false, noise points are not displayed

  • background_coords (np.ndarray | None, optional) – a set of coords to be displayed behind all the others and colored in gray

  • annot (bool, optional) – if true the center of mass of each cluster (the set of points with the same label value) is annotated with the cluster number

  • cmap2use (str, optional) – the cmap for coloring the points

  • point_size_front (float, optional) – size of the points in the foreground

  • point_size_back (float, optional) – size of the points in the background

  • colorbar (bool, optional) – if true show a colorbar near each subplot

  • legend (bool, optional) – whether to show the legend of colors

  • alpha (float, optional) – the transparency of points colors

Returns:

a figure containing the 3 subplots

Return type:

matplotlib.figure.Figure

cima.utils.visualization.plotClustering3D(coords: ndarray, true_labels: ndarray | None = None, single_noise_color: bool = True, show_noise: bool = False, plotter: Plotter | None = None, cmap2use: str = 'gist_ncar', opacity: ndarray | None = None, custom_clim: tuple | None = None, points_size: int | None = None, **kwargs) Plotter

Creates a 3-d pyvista scatter plot of the provided coordinates. The result is not displayed automatically. You need to call the show() method of the returned object to display the result.

Parameters:
  • coords (np.ndarray) – 3d coordinates of the points

  • true_labels (np.ndarray | None) – labels by which to color points. Those with label<0 are considered noise

  • single_noise_color (bool) – whether to display all points with label<0 in grey or to apply cmap to them as well. If show_noise is false this parameter has no effect

  • show_noise (bool) – if false, noise points are not displayed

  • plotter (pv.Plotter | None) – if not None it will be used to plot the points

  • cmap2use (str) – the cmap for coloring the points

  • **kwargs – arguments that will be passed to pv.Plotter.add_points

Return type:

a pyvista Plotter object with the coordinates plotted in it

cima.utils.visualization.plotDecodingTrace(decoding_seg: Segment, labels_col: str = 'locusID', plotter: Plotter | None = None, show_labels: bool = False) Plotter

Plots a segmented line connecting the centers of mass of the decoded loci. Plots in 3D using pyvista

Parameters:
  • decoding_seg (Segment) – Segment containing x,y,z and clusterID columns. Each cluster should represent a genomic region

  • labels_col (str, optional) – Column name to use for labels (default is ‘locusID’)

  • plotter (pyvista.Plotter, optional) – If provided, the output is plotted on it

  • show_labels (bool, optional) – Whether to show labels on clusters’ centers of mass

Returns:

plotter – Pyvista.Plotter object on which the decoding trace was plotted

Return type:

pyvista.Plotter

cima.utils.visualization.plotDecodingWithSpecifiedColors(decoding_seg: Segment, def_df: DataFrame, plotter: Plotter | None = None, cmap: str = 'bwr') Plotter

Displays localizations colored according to the value present in the column ‘value’ of def_df. Plots in 3D using pyvista

Parameters:
  • decoding_seg (Segment) – Segment containing x,y,z and clusterID columns. Each cluster should represent a genomic region

  • def_df (pandas.DataFrame) – DataFrame with columns name and value. name should contain strings of the form m%i, where %i is one of the clusterIDs of decoding_seg. value should be a float in the range [0.0, 1.0] representing the value used to color that genomic region

  • plotter (pyvista.Plotter, optional) – If provided, the output is plotted on it

  • cmap (str, optional) – Colormap to use for the coloring (default is ‘bwr’)

Returns:

plotter – pyvista.Plotter object on which the colored localizations were plotted

Return type:

pyvista.Plotter
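
A def_df in the documented shape can be built as in the sketch below. The helper build_def_df and the min-max rescaling are illustrative assumptions; the only requirements taken from the description are the m%i naming and values in [0.0, 1.0].

```python
import numpy as np
import pandas as pd

def build_def_df(values_by_cluster):
    """Map {clusterID: raw value} to a name/value table (hypothetical helper)."""
    ids = sorted(values_by_cluster)
    vals = np.array([values_by_cluster[i] for i in ids], dtype=float)
    span = vals.max() - vals.min()
    # Rescale to [0, 1]; a constant input maps to all zeros.
    scaled = (vals - vals.min()) / span if span > 0 else np.zeros_like(vals)
    return pd.DataFrame({"name": ["m%i" % i for i in ids], "value": scaled})

def_df = build_def_df({3: 10.0, 1: 0.0, 2: 5.0})
```

Here cluster 1 gets value 0.0, cluster 2 gets 0.5 and cluster 3 gets 1.0, so with the default ‘bwr’ colormap they would span the two ends and the middle of the color scale.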

cima.utils.walk_features module

cima.utils.walk_features.MeanMatrix(arraylist, cmap='Greens', vmin=0, vmax=0.6, plot=True)

Given a list of matrices, this function computes the mean and the variation of the matrices in the list, and plots them as heatmaps.

Parameters:
  • arraylist (list of np.ndarray) – List of matrices to be analyzed.

  • cmap (str, optional) – Colormap to be used for the mean heatmap. Default is “Greens”.

  • vmin (float, optional) – Minimum value for the colormap. Default is 0.

  • vmax (float, optional) – Maximum value for the colormap. Default is 0.6.

  • plot (bool, optional) – Whether to plot the heatmaps. Default is True.

Returns:

A tuple containing the mean matrix, the variation matrix, and the collection matrix.

Return type:

tuple
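
The numerical part of this function can be sketched as follows. The description does not define "variation", so the sketch assumes it is the element-wise standard deviation, and it assumes the "collection matrix" is the stacked input; both are guesses, and mean_and_spread is a hypothetical name.

```python
import numpy as np

def mean_and_spread(arraylist):
    """Element-wise mean and standard deviation of equally shaped matrices."""
    stack = np.stack([np.asarray(a, dtype=float) for a in arraylist])
    mean = np.nanmean(stack, axis=0)   # ignores NaN entries
    spread = np.nanstd(stack, axis=0)  # assumed meaning of "variation"
    return mean, spread, stack

a = np.array([[0.0, 1.0], [1.0, 0.0]])
b = np.array([[0.0, 3.0], [3.0, 0.0]])
mean, spread, stack = mean_and_spread([a, b])
```

For these two inputs the off-diagonal mean is 2.0 with a spread of 1.0, and stack has shape (2, 2, 2).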

cima.utils.walk_features.convertToFreq(pairs_features_df: DataFrame, value: str, filename: str, cluss: ndarray = array([5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]), cut0ff: float = 500.0) ndarray

Converts from long to wide format, filling missing pairs with NaN and copying relation values to mirrored loci pairs. Pivots on columns ‘segment1_timepoint’ and ‘segment2_timepoint’, but only for the pairs of segments with the specified filename in the ‘flag’ column. Finally, it converts the values to frequencies by applying a cutoff.

Parameters:
  • pairs_features_df (pd.DataFrame) – DataFrame in long format to be converted

  • value (str) – Column name of the values to be pivoted

  • filename (str) – Value in the ‘flag’ column to filter the pairs of segments to be included in the final matrix

  • cluss (np.ndarray, optional) – Array of all the timepoints to be included in the final matrix. By default it is set to np.arange(5, 21).

  • cut0ff (float, optional) – Value to use as cutoff for converting values to frequencies. By default it is set to 500.

Returns:

Array with the frequencies obtained by applying the cutoff to the values in the matrix obtained from pairs_features_df.

Return type:

np.ndarray
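
How the cutoff turns values into frequencies is not spelled out above. One plausible reading, shown here purely as an assumption, is that a value below the cutoff counts as a contact, and that averaging such indicator matrices over several inputs gives a contact frequency. The helper contact_indicator is hypothetical.

```python
import numpy as np

def contact_indicator(values, cutoff=500.0):
    """1.0 where value < cutoff, 0.0 otherwise, NaN where the input is NaN."""
    values = np.asarray(values, dtype=float)
    contact = (values < cutoff).astype(float)
    contact[np.isnan(values)] = np.nan  # keep missing pairs missing
    return contact

cell_a = np.array([[0.0, 300.0], [300.0, 0.0]])
cell_b = np.array([[0.0, 800.0], [800.0, 0.0]])
frequency = np.mean([contact_indicator(cell_a), contact_indicator(cell_b)], axis=0)
```

With a cutoff of 500, the off-diagonal pair is in contact in one of the two inputs, so its frequency is 0.5.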

cima.utils.walk_features.convertToMatrix(pairs_features_df: DataFrame, value: str, cluss: ndarray = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])) DataFrame

Converts from long to wide format, filling missing pairs with NaN and copying relation values to mirrored loci pairs. Pivots on columns ‘segment1_timepoint’ and ‘segment2_timepoint’.

Parameters:
  • pairs_features_df (pd.DataFrame) – DataFrame in long format to be converted

  • value (str) – Column name of the values to be pivoted

  • cluss (np.ndarray, optional) – Array of all the timepoints to be included in the final matrix. By default it is set to np.arange(1,15).

Returns:

DataFrame in wide format with index and columns given by timepoints and values given by the specified value column. The matrix is symmetric and missing pairs are filled with NaN.

Return type:

pd.DataFrame
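
The long-to-wide conversion can be sketched with plain pandas. This is an approximation of the behaviour described above, not the library's code: long_to_symmetric_wide is a hypothetical helper, and it assumes each (segment1_timepoint, segment2_timepoint) pair appears at most once.

```python
import numpy as np
import pandas as pd

def long_to_symmetric_wide(df, value, timepoints):
    """Pivot to a timepoint x timepoint table and mirror known values."""
    wide = df.pivot(index="segment1_timepoint",
                    columns="segment2_timepoint",
                    values=value)
    # Add rows/columns for timepoints with no data; they are filled with NaN.
    wide = wide.reindex(index=timepoints, columns=timepoints)
    # Where (i, j) is missing but (j, i) is known, copy the mirrored value.
    return wide.combine_first(wide.T)

pairs = pd.DataFrame({
    "segment1_timepoint": [1, 2],
    "segment2_timepoint": [2, 3],
    "distance": [10.0, 20.0],
})
matrix = long_to_symmetric_wide(pairs, "distance", np.arange(1, 5))
```

The result is a 4 x 4 table in which matrix.loc[1, 2] and matrix.loc[2, 1] are both 10.0, while every entry involving timepoint 4 is NaN.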

cima.utils.walk_features.convertToMatrix2(pairs_features_df: DataFrame, value: str, filename: str, cluss: ndarray = array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])) DataFrame

Converts from long to wide format, filling missing pairs with NaN and copying relation values to mirrored loci pairs. Pivots on columns ‘segment1_timepoint’ and ‘segment2_timepoint’, but only for the pairs of segments with the specified filename in the ‘flag’ column.

Parameters:
  • pairs_features_df (pd.DataFrame) – DataFrame in long format to be converted

  • value (str) – Column name of the values to be pivoted

  • filename (str) – Value in the ‘flag’ column to filter the pairs of segments to be included in the final matrix

  • cluss (np.ndarray, optional) – Array of all the timepoints to be included in the final matrix. By default it is set to np.arange(1,15).

Returns:

DataFrame in wide format with index and columns given by timepoints and values given by the specified value column. The matrix is symmetric and missing pairs are filled with NaN.

Return type:

pd.DataFrame

cima.utils.walk_features.getEigenvecsFromContactMat(mat: ndarray, num_corr: int = 1, show: bool = False, eigenvector_number: int = 0) ndarray

Given a contact matrix, this function returns the specified eigenvector of the correlation matrix obtained by applying num_corr times the correlation operation on mat or on the previous correlation matrix.

Parameters:
  • mat (np.ndarray) – Square matrix to apply the procedure to (contact frequency matrix). All values in it must be finite

  • num_corr (int, optional) – Number of times the correlation matrix is built from mat or from the previous correlation matrix, by default 1

  • show (bool, optional) – Whether to show the matrix from which the eigenvector is extracted, by default False

  • eigenvector_number (int, optional) – Which eigenvector to return (0: first, 1: second, 2: third, …), by default 0

Returns:

The specified eigenvector of the correlation matrix.

Return type:

np.ndarray
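
The procedure can be sketched with NumPy as below. The sketch assumes that "first" means the eigenvector with the largest eigenvalue and that the correlation is taken between rows with np.corrcoef; the source does not confirm either detail, and eigvec_from_contacts is a hypothetical name.

```python
import numpy as np

def eigvec_from_contacts(mat, num_corr=1, eigenvector_number=0):
    """Eigenvector of the (repeated) row-correlation matrix of mat."""
    corr = np.asarray(mat, dtype=float)
    for _ in range(num_corr):
        corr = np.corrcoef(corr)  # correlation between rows
    # The correlation matrix is symmetric, so eigh applies.
    vals, vecs = np.linalg.eigh(corr)
    order = np.argsort(vals)[::-1]  # largest eigenvalue first
    return vecs[:, order[eigenvector_number]]

# Toy contact matrix with two blocks: loci {0, 1} and loci {2, 3}.
contacts = np.array([
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.9],
    [0.2, 0.1, 0.9, 1.0],
])
v = eigvec_from_contacts(contacts)
```

For this toy matrix the leading eigenvector separates the two blocks by sign. The overall sign of an eigenvector is arbitrary, so only the relative signs are meaningful. A row with constant values makes np.corrcoef return NaN, which is consistent with the requirement that the input be finite and suggests such rows should be removed beforehand.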

cima.utils.walk_features.getMergedMorphologicalAndSpatialSegmentsFeatures(single_features_df: DataFrame, pairs_features_df: DataFrame, metadata: DataFrame | None = None) DataFrame

Merges single segment features with features computed on pairs of segments. The dataframes this function requires can be obtained by using functions getSegmentsFeatures, getSpatialFeaturesIntraWalk and getSegmentsMetadata, respectively.

Parameters:
  • single_features_df (pd.DataFrame) – DataFrame with columns indicating feature values and index indicating identifiers of segments

  • pairs_features_df (pd.DataFrame) – DataFrame with columns ‘segment1’ and ‘segment2’ indicating the pair of segments, and the remaining columns indicating feature values

  • metadata (pd.DataFrame | None) – DataFrame with columns indicating metadata of segments and index indicating identifiers of segments, or None if no metadata is provided

Returns:

DataFrame with segment identifiers on the index and one column per morphological or spatial feature. If metadata is provided, its columns are placed at the beginning of the DataFrame

Return type:

pd.DataFrame

cima.utils.walk_features.getSegmentFeatures(seg: Segment, map: Map, features: str | list = 'all', threshold: float = 0.5, verbose: bool = False) dict

Returns a dictionary with the specified features computed on seg or map at the specified threshold

Parameters:
  • seg (Segment) – The segment object on which to compute features.

  • map (Map) – The map that is expected to be computed from seg.

  • features (str or list, optional) – Which features to compute. Use string ‘all’ to compute all the available ones.

  • threshold (float, optional) – Used to specify the threshold.

  • verbose (bool, optional) – If True, prints progress messages.

Returns:

A dictionary with the computed features.

Return type:

dict

cima.utils.walk_features.getSegmentMetadata(seg: Segment) dict

Gets the segment metadata and stores it into a dictionary.

Parameters:

seg (Segment) – The segment object from which to extract metadata.

Returns:

A dictionary with all the metadata of seg.

Return type:

dict

cima.utils.walk_features.getSegmentsFeatures(segments, features: str | list = 'all', prec_mean: float = 45, factor: float = 1.0, threshold: float = 0.5, verbose: bool = False, n_jobs: int = 1) DataFrame

Returns a DataFrame with the specified features computed on segments at the specified threshold

Parameters:
  • segments (list) – list of Segment objects on which to compute features

  • features (str | list) – which features to compute. Use string ‘all’ to compute all the available ones

  • prec_mean (float) – resolution of the maps that are built from segments to compute features

  • factor (float) – used to specify the threshold. For each map the threshold is computed using DensityProprieties.calculate_map_threshold_SR with this factor

  • threshold (float) – used to specify the threshold. If factor is None, this is used for all the maps

  • n_jobs (int) – number of cpus to use for the processing

Returns:

a DataFrame with each row containing the features for a segment

Return type:

pd.DataFrame

cima.utils.walk_features.getSegmentsMetadata(segs: list) DataFrame

Gets the metadata for a list of segments and returns it as a DataFrame.

Parameters:

segs (list) – A list of Segment objects.

Returns:

A DataFrame with each row containing metadata for a segment.

Return type:

pd.DataFrame

cima.utils.walk_features.getSegmentsPairFeatures(maps: list, features: str | list = 'all', threshold1: float = 0.5, threshold2: float = 0.5) dict

Returns a dictionary with the specified features computed on maps at the specified threshold. The features available are ‘coms_distance’, ‘surface_distance’ and ‘entanglement’.

Parameters:
  • maps (list) – list of two maps on which to compute features

  • features (str | list) – which features to compute. Use string ‘all’ to compute all the available ones

  • threshold1 (float) – the threshold to use on the first map for the computation

  • threshold2 (float) – the threshold to use on the second map for the computation

cima.utils.walk_features.getSpatialFeaturesInterWalk(segments1: list[Segment], segments2: list[Segment], features: str | list = 'all', prec_mean: float = 45, factor: float = 1.0, threshold: float = 0.5, n_jobs: int = 1, verbose: bool = False, fill_missing_timepoints: bool = False, max_timepoint1: int | None = None, max_timepoint2: int | None = None) DataFrame

Returns a DataFrame with the specified features computed on pairs of segments at the specified threshold

Parameters:
  • segments1 (list) – list of Segment objects coming from the first walk

  • segments2 (list) – list of Segment objects coming from the second walk

  • features (str or list) – which features to compute. Use string ‘all’ to compute all the available ones

  • prec_mean (float) – resolution of the maps that are built from segments to compute features

  • factor (float) – used to specify the threshold. For each map the threshold is computed using DensityProprieties.calculate_map_threshold_SR with this factor

  • threshold (float) – used to specify the threshold. If factor is None, this is used for all the maps

  • n_jobs (int) – number of cpus to use for the processing

  • fill_missing_timepoints (bool) – whether to add rows, with NaN values, representing comparisons between missing timepoints

  • max_timepoint1 (int or None) – if fill_missing_timepoints is True, this is used to determine the missing timepoints

  • max_timepoint2 (int or None) – if fill_missing_timepoints is True, this is used to determine the missing timepoints

Returns:

a DataFrame with each row containing the features for a pair of segments

Return type:

pd.DataFrame

cima.utils.walk_features.getSpatialFeaturesIntraWalk(segments: list, features: str | list = 'all', prec_mean: int = 45, factor: float = 1.0, threshold: float = 0.5, n_jobs: int = 1, verbose: bool = False, fill_missing_timepoints: bool = False, max_timepoint: int | None = None) DataFrame

Returns a DataFrame with the specified features computed on pairs of segments at the specified threshold

Parameters:
  • segments (list) – list of Segment objects on which to compute features

  • features (str | list) – which features to compute. Use string ‘all’ to compute all the available ones

  • prec_mean (int) – resolution of the maps that are built from segments to compute features

  • factor (float) – used to specify the threshold. For each map the threshold is computed using DensityProprieties.calculate_map_threshold_SR with this factor

  • threshold (float) – used to specify the threshold. If factor is None, this is used for all the maps

  • n_jobs (int) – number of cpus to use for the processing

  • fill_missing_timepoints (bool) – whether to add rows, with NaN values, representing comparisons between missing timepoints

  • max_timepoint (int) – if fill_missing_timepoints is True, this is used to determine the missing timepoints

Returns:

a DataFrame with each row containing the features for a pair of segments

Return type:

pd.DataFrame

cima.utils.write_pdb module

cima.utils.write_pdb.writeCOM_toPDB(list_coord, writeout=True, filenameout='', factor=1.0, same_residue_id=False)

Returns a PDB file for ball-and-stick representation

Parameters:
  • list_coord – list of COM coordinates

  • writeout – save the pdb file

  • filenameout – location and name of the output file

  • factor – divides coords by this number before writing. Necessary to avoid crossing width limits set by pdb format. If None a valid factor is automatically computed and used.

  • same_residue_id – whether to use 0 as the residue id for all the atoms, or to use instead the index included in list_coord

Return type:

pdb file format
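
The width limit mentioned for factor comes from the fixed-column PDB layout, where x, y and z each occupy eight characters with three decimals (columns 31-38, 39-46 and 47-54 of an ATOM record). The sketch below shows one way to pick a factor and format a record. Both helpers are hypothetical, the power-of-ten rule is an assumption about how an automatic factor could be chosen, and the atom name, residue name and chain are placeholders.

```python
def choose_factor(coords, limit=1000.0):
    """Smallest power of ten that brings every |coordinate| below limit."""
    largest = max(abs(c) for xyz in coords for c in xyz)
    factor = 1.0
    while largest / factor >= limit:
        factor *= 10.0
    return factor

def pdb_atom_line(serial, xyz, res_id=0):
    """Format one ATOM record with x, y, z in columns 31-54."""
    x, y, z = xyz
    return ("ATOM  {:5d} {:^4s} {:3s} {:1s}{:4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}").format(
                serial, "CA", "ALA", "A", res_id, x, y, z, 1.0, 0.0)

coms = [(12345.0, 0.0, 0.0), (1.5, -2.0, 3.25)]
factor = choose_factor(coms)
lines = [pdb_atom_line(i + 1, tuple(c / factor for c in xyz), res_id=i)
         for i, xyz in enumerate(coms)]
```

Here the largest coordinate, 12345.0, needs a factor of 100 before it fits in an eight-character field alongside the negative values, which is the kind of rescaling the factor parameter is described as providing.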