cima.assessment package

Subset of modules that contain methods to layout the Hi-C information. For now, only Kamada-Kawai is implemented.

Submodules

cima.assessment.images_assessment module

Functions for assessing structures: saturation, precision, cycle assessment and half-dataset CCC. Parameters are exposed so that each assessment can be tuned to the use case. By default each function returns its results as a pandas DataFrame, which can be exported to a file or used for further analysis; an option returns the results as a list of lists instead. Together these functions cover the quality and characteristics of genomic super-resolution localisation data.

cima.assessment.images_assessment.Assessment_Chrom_Precision(StructureObj: Segment, chromatin_id: str, nuclei: str = '', date: str = '', hom: str = '', pathin: str | Path = '', writeout: bool = False, return_as_list: bool = False) → DataFrame | list[list[object]]

Assesses chromatin structure precision over time and outputs summary data.

Parameters:

StructureObj (Segment) – The structure object containing chromatin data.
chromatin_id (str) – Identifier for the chromatin input structure.
nuclei (str, optional) – Label for the nuclei being assessed. Defaults to an empty string.
date (str, optional) – Label for the date of the experiment. Defaults to an empty string.
hom (str, optional) – Label for homology data in the experiment. Defaults to an empty string.
pathin (str or Path, optional) – Directory path for saving the output file. Defaults to the folder where the input file is located.
writeout (bool, optional) – If True, writes the assessment data to a file. Defaults to False.
return_as_list (bool, optional) – If True, returns the assessment data as a list of lists instead of a DataFrame. Defaults to False.

Returns:

A DataFrame containing assessment data, or a list of lists if return_as_list is True. Each row includes:

"chromatin_id": Chromatin ID
"timepoint": Timepoint label
"nuclei": Nuclei label
"date": Date of experiment
"hom": Homology identifier
"mean_precision_x": Mean precision along the x axis
"mean_precision_y": Mean precision along the y axis
"mean_precision_z": Mean precision along the z axis
"numerosity_timepoint": Number of localizations at the timepoint
"total_timepoints": Total number of timepoints

Return type:

pd.DataFrame or list of lists
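
The row layout above can be illustrated with a minimal sketch that does not call cima. The helper name summarise_precision and the input records (dicts with "timepoint", "prec_x", "prec_y", "prec_z") are hypothetical; the real function reads these values from a Segment, and its internals may differ.

```python
from collections import defaultdict
from statistics import mean


def summarise_precision(records, chromatin_id, nuclei="", date="", hom=""):
    """Build one summary row per timepoint (hypothetical helper, not cima API).

    records: iterable of dicts with keys "timepoint", "prec_x", "prec_y", "prec_z".
    """
    # Group the localizations by their timepoint label.
    by_timepoint = defaultdict(list)
    for rec in records:
        by_timepoint[rec["timepoint"]].append(rec)

    total_timepoints = len(by_timepoint)
    rows = []
    for timepoint in sorted(by_timepoint):
        group = by_timepoint[timepoint]
        rows.append([
            chromatin_id, timepoint, nuclei, date, hom,
            mean(r["prec_x"] for r in group),  # mean_precision_x
            mean(r["prec_y"] for r in group),  # mean_precision_y
            mean(r["prec_z"] for r in group),  # mean_precision_z
            len(group),                        # numerosity_timepoint
            total_timepoints,                  # total_timepoints
        ])
    return rows


# Made-up values, for illustration only.
records = [
    {"timepoint": 0, "prec_x": 10.0, "prec_y": 12.0, "prec_z": 30.0},
    {"timepoint": 0, "prec_x": 14.0, "prec_y": 16.0, "prec_z": 34.0},
    {"timepoint": 1, "prec_x": 20.0, "prec_y": 22.0, "prec_z": 40.0},
]
rows = summarise_precision(records, chromatin_id="chr_demo")
```

Each inner list follows the column order given above, which is the shape one would expect when return_as_list is True.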

cima.assessment.images_assessment.Cycle_assessment(endstructure: Segment, prec_mean: float = 50.0, ccc_mode: int = 1, return_as_list: bool = False) → DataFrame | list[list[object]]

Performs cycle-based assessment of a structure using cross-correlation coefficient (CCC) calculations. This function evaluates the similarity between each cycle in the input structure (endstructure) and the overall structure by calculating the CCC. Each cycle is processed individually to compute the CCC score relative to the full structure, allowing for an assessment of structural changes over cycles.

Parameters:

endstructure (Segment) – The reference structure used for CCC calculations.
prec_mean (float, optional) – Precision value used in Gaussian blurring. Defaults to 50.0.
mode (str, optional) – Mode for Gaussian blurring (‘unit’ or other available modes). Defaults to “unit”.
ccc_mode (int, optional) – Mode for calculating CCC between reference and cycle structures. Defaults to 1. Mode options:
- 1: Calculation on the complete maps.
- 2: Calculation on the contoured maps.
- 3: Calculation on the masked maps.
return_as_list (bool, optional) – If True, returns the results as a list of lists instead of a DataFrame. Defaults to False.

Returns:

A DataFrame, or a list of lists if return_as_list is True, containing the following columns:

"Cycle": Cycle number (int)
"CCC": CCC score (float)
"Nloc": Number of atoms in the cycle (int)

Return type:

pd.DataFrame or list of lists
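
The per-cycle comparison can be sketched without cima as follows. This is an illustration under stated assumptions, not the library's implementation: points are binned into a coarse 2D histogram instead of being Gaussian-blurred, and the CCC is taken as the normalised cross-correlation sum(a*b) / sqrt(sum(a^2) * sum(b^2)) over all grid cells, which corresponds only loosely to the "complete maps" mode. The helper names density_map, ccc and cycle_rows are hypothetical.

```python
from math import sqrt


def density_map(points, bins=4, lo=0.0, hi=100.0):
    """Bin (x, y) points into a flat bins*bins histogram (no blurring applied)."""
    grid = [0.0] * (bins * bins)
    width = (hi - lo) / bins
    for x, y in points:
        # Clamp indices so points on or outside the border land in an edge cell.
        ix = max(0, min(int((x - lo) / width), bins - 1))
        iy = max(0, min(int((y - lo) / width), bins - 1))
        grid[ix * bins + iy] += 1.0
    return grid


def ccc(map_a, map_b):
    """Normalised cross-correlation over all cells (assumed CCC definition)."""
    num = sum(a * b for a, b in zip(map_a, map_b))
    den = sqrt(sum(a * a for a in map_a) * sum(b * b for b in map_b))
    return num / den if den else 0.0


def cycle_rows(localizations):
    """localizations: list of (cycle, x, y). Returns [cycle, ccc, nloc] rows."""
    full = density_map([(x, y) for _, x, y in localizations])
    rows = []
    for cycle in sorted({c for c, _, _ in localizations}):
        pts = [(x, y) for c, x, y in localizations if c == cycle]
        rows.append([cycle, ccc(density_map(pts), full), len(pts)])
    return rows


# Made-up coordinates, for illustration only.
locs = [(1, 10.0, 10.0), (1, 60.0, 60.0), (2, 12.0, 11.0), (2, 90.0, 20.0)]
rows = cycle_rows(locs)
```

A cycle whose map matches the full structure scores close to 1, while a cycle that covers only part of it scores lower.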

cima.assessment.images_assessment.Saturation_by_experiment_pseudotime(endstructure: Segment, numberoflocations: int = 100, prec_mean: float = 50.0, ccc_mode: int = 1, getcom: bool = False, return_as_list: bool = False) → DataFrame | list[list[object]]

Calculates saturation over experimental pseudotime by evaluating the cross-correlation coefficient (CCC) between the reference structure and cumulative subsets of frames, allowing for an assessment of structural stability over pseudotime.

Pseudotime is defined as the cumulative number of localizations in the structure, independent of the cycle and zstep.

Parameters:

endstructure (Segment) – The reference structure used for CCC calculations.
numberoflocations (int, optional) – The number of locations to use in each chunk for CCC calculation. Defaults to 100.
prec_mean (float, optional) – Precision value used in Gaussian blurring. Defaults to 50.0.
ccc_mode (int, optional) – Mode for calculating CCC between reference and partial structures. Defaults to 1.
getcom (bool, optional) – If True, includes the center of mass (COM) of partial structures in the output. Defaults to False.
return_as_list (bool, optional) – If True, returns the results as a list of lists instead of a DataFrame. Defaults to False.

Returns:

A DataFrame, or a list of lists if return_as_list is True, where each entry contains:
- Pseudotime step (int)
- CCC score (float)
- Number of localizations in the chunk (int)
- (Optional) COM of the partial structure (if getcom is True)

Return type:

pd.DataFrame or list of lists
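
The notion of pseudotime as a cumulative localization count can be sketched as below. The chunking rule (each step extends the subset by numberoflocations, with the last step truncated to the total) is an assumption made for illustration, and pseudotime_chunks is a hypothetical helper rather than part of cima.

```python
def pseudotime_chunks(n_localizations, numberoflocations=100):
    """Yield (step, end_index) for cumulative subsets of the localization list.

    Step k covers localizations [0, end_index), regardless of cycle or z-step.
    """
    if numberoflocations <= 0:
        raise ValueError("numberoflocations must be positive")
    step = 0
    stop = n_localizations + numberoflocations
    for end in range(numberoflocations, stop, numberoflocations):
        step += 1
        # Truncate the final chunk to the number of localizations available.
        yield step, min(end, n_localizations)


chunks = list(pseudotime_chunks(250, numberoflocations=100))
```

Each partial structure built from localizations [0, end_index) would then be compared with the full reference structure to obtain the CCC for that step.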

cima.assessment.images_assessment.Saturation_to_final_structure(endstructure: Segment, prec_mean: float = 50.0, maxframes: int = 250, interval: int = 5, ccc_mode: int = 1, getcom: bool = False, selection_mode: str = 'frames', random_groups_num: int = 50, return_as_list: bool = False) → DataFrame | list[list[object]]

Calculates the saturation plot by evaluating the cross-correlation coefficient (CCC) between a reference structure and cumulative subsets of its frames, random groups or cycles, allowing the assessment of structural saturation over time or random groups.

Parameters:

endstructure (Segment) – The reference structure used for CCC calculations.
prec_mean (float, optional) – Precision value used in Gaussian blurring. Defaults to 50.0.
maxframes (int, optional) – Maximum number of frames to process, when in ‘frames’ mode. Defaults to 250.
interval (int, optional) – Number of frames in each chunk for CCC calculation, when in ‘frames’ mode. Defaults to 5.
ccc_mode (int, optional) – Mode for calculating CCC between reference and partial structures. Defaults to 1.
getcom (bool, optional) – If True, includes the center of mass (COM) of partial structures in the output. Defaults to False.
selection_mode (str, optional) – Mode of selection for frames (‘frames’, ‘random’ or ‘cycle’). Defaults to ‘frames’.
random_groups_num (int, optional) – Number of random groups used if selection_mode is ‘random’. Defaults to 50.
return_as_list (bool, optional) – If True, returns the results as a list of lists instead of a DataFrame. Defaults to False.

Returns:

pandas.DataFrame –
A DataFrame containing:
- "Chunk": Chunk number (int)
- "CCC": Cross-correlation coefficient (float)
- "Nloc": Number of localizations in the chunk (int)
- (Optional) "COM": Center of mass of the partial structure (if getcom is True)
or
list of lists –
A list of lists (if return_as_list is True), where each inner list contains:
- Chunk number (int)
- CCC score (float)
- Number of localizations in the chunk (int)
- (Optional) COM of the partial structure (if getcom is True)

Raises:

ValueError – If selection_mode is not ‘frames’, ‘random’ or ‘cycle’.
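
How maxframes, interval and selection_mode might interact in 'frames' mode can be sketched as follows. Both helpers are hypothetical and only mirror the documented behaviour (cumulative frame cutoffs every interval frames up to maxframes, and a ValueError for an unknown mode); the 'random' and 'cycle' modes are not modelled here.

```python
VALID_MODES = ("frames", "random", "cycle")


def check_selection_mode(selection_mode):
    """Reject any mode other than the three documented ones."""
    if selection_mode not in VALID_MODES:
        raise ValueError(
            f"selection_mode must be one of {VALID_MODES}, got {selection_mode!r}"
        )
    return selection_mode


def frame_cutoffs(maxframes=250, interval=5):
    """Cumulative frame cutoffs: frames [0, cutoff) form each partial structure."""
    return list(range(interval, maxframes + 1, interval))


check_selection_mode("frames")
cutoffs = frame_cutoffs()  # documented defaults: maxframes=250, interval=5
```

Under these assumptions the default arguments give one chunk per 5 frames, ending at frame 250.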