Functions

statistical_analysis_method

Description

The fastccc.core.statistical_analysis_method function performs statistical analysis on cell-cell communication, supporting various distribution calculation methods. It offers options for filtering candidate LRIs and constructing communication scores, allowing users to customize analysis parameters and optionally save the results.

Function Signature

def statistical_analysis_method(
    database_file_path,
    celltype_file_path,
    counts_file_path,
    convert_type = 'hgnc_symbol',
    single_unit_summary = 'Mean',
    complex_aggregation = 'Minimum',
    LR_combination = 'Arithmetic',
    min_percentile = 0.1,
    style = None,
    meta_key = None, 
    select_list = [], 
    filter_ = False,
    use_DEG = False,
    save_path = None
)

Parameters

Parameter Type Default Value Description
database_file_path str   Path to the database directory containing the candidate LRIs.
celltype_file_path str   Path to the cell type annotation file. If the h5ad count file already contains cell type labels, this can be set to None, and the meta_key parameter should be specified instead.
counts_file_path str   Path to the normalized log1p-transformed matrix file in h5ad format.
convert_type str 'hgnc_symbol' Type of gene identifier used in your data, such as 'hgnc_symbol' or 'ensembl'.
single_unit_summary str 'Mean' Method for calculating single-unit expression summaries, options include 'Mean', 'Median' or 'Q2', 'Q3', 'Quantile_x', etc.
complex_aggregation str 'Minimum' Method for calculating multi-unit complex summaries, options include 'Minimum', 'Average'.
LR_combination str 'Arithmetic' Method for combining ligand and receptor score to calculate \(CS\), options include 'Arithmetic' and 'Geometric' average.
min_percentile float 0.1 Minimum non-zero expression percentile threshold for filtering genes within each cell type cluster. If 'Quantile_x' is selected, percentile will be set as max(min_percentile, 1-x).
meta_key str or None None Metadata key specifying the column in adata.obs that contains the cell type labels.
select_list list [] List of specific set of LRIs to select for analysis.
filter_ bool False Whether to enable data filtering. If set to True, empty cells and genes will be removed.
use_DEG bool False Whether to use differentially expressed genes (DEGs) for further filtering.
save_path str or None None Path to save the analysis results; if None, results are saved to the default path.

Returns

Return Value Type Description
interactions_strength pandas.DataFrame A dataframe containing the calculated \(CS\) between different LRIs between sender and receiver cell types.
pvals pandas.DataFrame A dataframe with \(p\)-values indicating the statistical significance of the interactions.
percents_analysis pandas.DataFrame A dataframe summarizing the percentage anaylsis of the interactions.

All these results will be also saved to the save_path folder.


Cauchy_combination_of_statistical_analysis_methods

Function Signature

def Cauchy_combination_of_statistical_analysis_methods(
    database_file_path,
    celltype_file_path,
    counts_file_path,
    convert_type = 'hgnc_symbol',
    single_unit_summary_list = ['Mean', 'Median', 'Q3', 'Quantile_0.9'],
    complex_aggregation_list = ['Minimum', 'Average'],
    LR_combination_list = ['Arithmetic', 'Geometric'],
    min_percentile = 0.1,
    save_path = None,
    meta_key = None, 
    select_list = [], 
    filter_ = False,
    use_DEG = False
)

Description

The fastccc.core.Cauchy_combination_of_statistical_analysis_methods function performs an advanced statistical analysis by combining multiple single-unit summary, complex aggregation, and ligand-receptor integration methods. It processes biological data to assess CCC using different statistical approaches and combines the results using the Cauchy combination method. The function offers flexibility by allowing users to specify various distribution methods and filtering options, making it suitable for comprehensive CCC analysis in scRNA-seq datasets.

Parameters

Parameter Type Default Value Description
database_file_path str   Path to the database directory containing the candidate LRIs.
celltype_file_path str   Path to the cell type annotation file. If the h5ad count file already contains cell type labels, this can be set to None, and the meta_key parameter should be specified instead.
counts_file_path str   Path to the normalized log1p-transformed matrix file in h5ad format.
convert_type str 'hgnc_symbol' Type of gene identifier used in your data, such as 'hgnc_symbol' or 'ensembl'.
single_unit_summary_list list[str] ['Mean', 'Median', 'Q3', 'Quantile_0.9'] List of methods for calculating single-unit summaries.
complex_aggregation_list list[str] ['Minimum', 'Average'] List of methods for calculating multi-unit complex summaries.
LR_combination_list list[str] ['Arithmetic', 'Geometric'] List of methods for ligand-receptor combination functions.
min_percentile float 0.1 Minimum non-zero expression percentile threshold for filtering genes within each cell type cluster. For each scoring method, if 'Quantile_x' is selected, percentile will be set as max(min_percentile, 1-x).
save_path str or None None Path to save analysis results; if None, results are saved to the default path.
meta_key str or None None Metadata key specifying the column in adata.obs that contains the cell type labels.
select_list list [] List of specific set of LRIs to select for analysis.
filter_ bool False Whether to enable data filtering. If set to True, empty cells and genes will be removed.
use_DEG bool False Whether to use differentially expressed genes (DEGs) in the analysis.

Returns

No variables will be returned directly. Instead, all scoring-specific results, along with the final combined results, will be saved to the user-specified folder.


Version Information

  • Author: Siyu Hou
  • Version: early access
  • Last Updated: 2025-02-25