Switch Search
safe_sum
def safe_sum(
X
):
build_isoform_metrics_table
def build_isoform_metrics_table(
adata:ad.AnnData, group_columns:Sequence[str] | str, comparisons:Sequence[Tuple[str, str]] | None=None,
group_vs_rest:bool=False, epsilon:float=1e-06, n_jobs:int=-1
)->pd.DataFrame:
delta_pi_one_gene_signed
def delta_pi_one_gene_signed(
dpsi:pd.Series
)->float:
Signed Tilgner Δπ (keeps ± sign of the stronger direction).
delta_pi_one_gene
def delta_pi_one_gene(
dpsi:pd.Series
)->float:
Unsigned Tilgner Δπ (always ≥ 0).
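The difference between the signed and unsigned variants can be illustrated with a small standalone sketch. It assumes Δπ is the summed ΔΨ of the two isoforms that change most in one direction, which is how the metric is commonly described; the helper name `sketch_delta_pi` and the plain-dict input are hypothetical simplifications, and the library's own functions (which take a `pd.Series`) may differ in detail.

```python
def sketch_delta_pi(dpsi, signed=False):
    """Illustrative Δπ for one gene (assumed definition, not the library code).

    dpsi maps isoform id -> ΔΨ (isoform usage in group 1 minus group 2).
    """
    values = list(dpsi.values())
    # Sum of the two largest positive ΔΨ values (0 if there are none).
    pos = sum(sorted((v for v in values if v > 0), reverse=True)[:2])
    # Sum of the two most negative ΔΨ values (a number <= 0).
    neg = sum(sorted(v for v in values if v < 0)[:2])
    if signed:
        # Keep the sign of whichever direction is stronger.
        return pos if pos >= abs(neg) else neg
    # Unsigned variant: magnitude of the stronger direction, always >= 0.
    return max(pos, abs(neg))


# Hypothetical ΔΨ values for a five-isoform gene (they sum to zero).
example_dpsi = {"tx1": 0.40, "tx2": 0.05, "tx3": -0.20, "tx4": -0.15, "tx5": -0.10}
flipped_dpsi = {tx: -v for tx, v in example_dpsi.items()}

unsigned_value = sketch_delta_pi(example_dpsi)
signed_value = sketch_delta_pi(flipped_dpsi, signed=True)
```

With the flipped values the stronger direction is negative, so the signed variant returns a negative number while the unsigned variant returns the same magnitude.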
volcano_grid
def volcano_grid(
df_all, n_cols:int=3, # How many columns in the grid.
figsize_per_panel:float=3.0, # inches
save:NoneType=None, # "all_volcanos.pdf" / ".svg" / None
show:bool=True, # Whether to display the figure inline.
**panel_kw, # Extra kwargs for `draw_volcano_panel` (cut-offs, sizes, etc.).
):
Draw every (group_1, group_2) combo in df_all on one figure grid.
draw_volcano_panel
def draw_volcano_panel(
ax, df_comp, # dataframe for ONE (group1, group2)
effect_col:str='effect_size', pval_col:str='adj_pval', fdr_cutoff:float=0.05, eff_cutoff:float=0.1, top_n:int=6,
bg_size:int=10, # marker areas (pt²)
sig_size:int=25, font_size:int=7
):
Draws a Δπ vs. FDR volcano into ax (no fig.show/save here).
filter_sample_replicated_transcripts
def filter_sample_replicated_transcripts(
adata, sample_col:str='sample', min_samples:int=2, min_umi:int=1
):
Removes transcripts that fail to reach a minimum UMI count in at least a specified number of samples (replicates).
Parameters:
- adata: AnnData object containing transcript-level counts.
- sample_col: Column in adata.obs that identifies the replicate/sample.
- min_samples: Minimum number of samples in which the transcript must be expressed.
- min_umi: Minimum UMI count in a sample for the transcript to be considered “expressed”.
Returns: A new AnnData object containing only transcripts that replicate across samples.
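The replication rule can be sketched without AnnData. The version below assumes UMI counts are summed over the cells of each sample before being compared with `min_umi`; the helper name `sketch_replicated_transcripts` and the list-based inputs are hypothetical and only stand in for the count matrix and `adata.obs[sample_col]`.

```python
from collections import defaultdict


def sketch_replicated_transcripts(counts, samples, transcripts,
                                  min_samples=2, min_umi=1):
    """Return transcripts with >= min_umi UMIs in >= min_samples samples.

    counts: one row per cell, each a list of UMI counts per transcript.
    samples: sample label of each cell, in the same order as counts.
    """
    # Sum counts over the cells of each sample (assumed aggregation).
    per_sample = defaultdict(lambda: [0] * len(transcripts))
    for row, sample in zip(counts, samples):
        totals = per_sample[sample]
        for j, count in enumerate(row):
            totals[j] += count

    kept = []
    for j, tx in enumerate(transcripts):
        # Number of samples in which this transcript counts as expressed.
        n_expressed = sum(1 for totals in per_sample.values()
                          if totals[j] >= min_umi)
        if n_expressed >= min_samples:
            kept.append(tx)
    return kept


# Four cells from two samples, three transcripts (made-up numbers).
toy_counts = [[1, 0, 2], [0, 0, 1], [3, 0, 0], [0, 1, 0]]
toy_samples = ["s1", "s1", "s2", "s2"]
toy_transcripts = ["txA", "txB", "txC"]

replicated = sketch_replicated_transcripts(toy_counts, toy_samples, toy_transcripts)
```

Only `txA` is seen in both samples here, so it is the only transcript that survives the default `min_samples=2`.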
API reference
SwitchSearch
def SwitchSearch(
anndata_obj:ad.AnnData, group_columns:Sequence[str] | str=('cell_type',), n_jobs:int=-1, fast_mode:bool=True,
precompute_metrics:bool=False
):
χ²-based isoform-switch screen supporting nested and 1-vs-rest designs, with auto-managed transcript-metrics caching.
FP-control additions:
- min_expected_count: χ² validity guard (default 5.0; set None to disable)
- min_gene_total: require gene total counts per group (default None/off)
- min_isoforms_gene: require >= this many isoforms for a gene (default 2)
- min_present_isoforms: require >= this many isoforms pass count threshold (optional)
New (optional):
- min_prevalence_pct: require >= this % of spots/cells express an isoform (in either group)
- min_prevalent_isoforms: require >= this many isoforms meet prevalence criterion (default 2)
- prevalence_count_thr: isoform considered expressed in a cell/spot if count >= thr (default 1)
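To make the `min_expected_count` guard concrete, here is a minimal standalone sketch of a χ² test of independence on an isoform-by-group count table. It assumes the guard requires every expected cell count to reach the threshold; the helper names are hypothetical, and the class may apply the rule differently.

```python
def expected_counts(table):
    """Expected counts under independence for an isoform x group table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]


def chi2_statistic(table):
    """Pearson χ² statistic; assumes no row or column total is zero."""
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, exp)
               for o, e in zip(obs_row, exp_row))


def passes_expected_guard(table, min_expected_count=5.0):
    """Assumed guard: every expected count must be >= the threshold."""
    if min_expected_count is None:  # None disables the guard
        return True
    return all(e >= min_expected_count
               for row in expected_counts(table) for e in row)


# Rows are isoforms, columns are the two groups (made-up read counts).
well_covered = [[40, 10], [10, 40]]
sparse = [[3, 1], [1, 3]]

stat = chi2_statistic(well_covered)
sparse_ok = passes_expected_guard(sparse)
```

The sparse table shows the same usage pattern as the well-covered one, but its expected counts are all 2, so the χ² approximation would be unreliable and the guard rejects it.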
SwitchSearch.find_switches_chi2
def find_switches_chi2(
primary_col:Optional[str]=None, secondary_col:Optional[str]=None, within:str='primary', group_vs_rest:bool=False,
targets:Optional[Sequence[str]]=None, min_reads:int=30, # existing thresholds
fdr:float=0.05, min_expected_count:Optional[float]=5.0, # set None to disable
min_gene_total:Optional[int]=None, # per group; set None to disable
min_isoforms_gene:int=2, min_present_isoforms:Optional[int]=None, # set None to disable
present_count_thr:int=1, min_prevalence_pct:Optional[float]=None, # e.g. 1.0 or 5.0; set None to disable
min_prevalent_isoforms:int=2, # require >= N isoforms meet prevalence in either group
prevalence_count_thr:int=1, # isoform is "expressed" in cell/spot if count>=thr
return_transcript_metrics:bool=False, # outputs
calc_effect_size:bool=False, effect_size_mode:str='abs', # 'abs' or 'signed'
drop_unmatched_metrics:bool=True, n_jobs:Optional[int]=None
)->pd.DataFrame:
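The prevalence parameters (`min_prevalence_pct`, `min_prevalent_isoforms`, `prevalence_count_thr`) can be read as the following filter. This is a sketch under the assumption that prevalence is the percentage of cells or spots in a group with a count at or above the threshold, and that an isoform qualifies if it reaches the cutoff in either group; the helper names and dict-based inputs are hypothetical.

```python
def prevalence_pct(cell_counts, count_thr=1):
    """Percent of cells/spots whose count for one isoform is >= count_thr."""
    if not cell_counts:
        return 0.0
    n_expressing = sum(1 for c in cell_counts if c >= count_thr)
    return 100.0 * n_expressing / len(cell_counts)


def passes_prevalence(counts_g1, counts_g2, min_prevalence_pct,
                      min_prevalent_isoforms=2, prevalence_count_thr=1):
    """counts_g1 / counts_g2: isoform id -> per-cell counts in each group."""
    if min_prevalence_pct is None:  # None disables the filter
        return True
    n_prevalent = 0
    for iso in counts_g1:
        # An isoform qualifies if it is prevalent in either group.
        best = max(prevalence_pct(counts_g1[iso], prevalence_count_thr),
                   prevalence_pct(counts_g2[iso], prevalence_count_thr))
        if best >= min_prevalence_pct:
            n_prevalent += 1
    return n_prevalent >= min_prevalent_isoforms


# One gene with two isoforms, four cells per group (made-up counts).
group1 = {"tx1": [1, 0, 0, 2], "tx2": [0, 0, 0, 0]}
group2 = {"tx1": [0, 0, 0, 0], "tx2": [0, 1, 0, 0]}

strict = passes_prevalence(group1, group2, min_prevalence_pct=30.0)
lenient = passes_prevalence(group1, group2, min_prevalence_pct=20.0)
```

Here `tx1` is expressed in 50% of group 1 cells and `tx2` in 25% of group 2 cells, so the gene passes at a 20% cutoff but not at 30%.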
pseudobulk_diff_splice
def pseudobulk_diff_splice(
adata, group_col:str='cell_type', replicate_col:str='batch', layer:NoneType=None, # None = use .X
gene_col:str='geneId', comparisons:NoneType=None, # list of (g1, g2); None = all pairwise
group_vs_rest:bool=False, min_cells:int=10, # drop pseudobulk samples with fewer cells
min_transcript_counts:int=10, # min total counts across all pseudobulk samples
min_isoform_fraction:float=0.01, # min fraction of gene total (filters minor isoforms)
min_samples_expressed:int=1, # min pseudobulk samples with >0 counts
covariates:NoneType=None, # str or list of obs column names
output_level:str='gene', # "gene" or "transcript"
fdr:float=0.05, return_all:bool=False, # if True, skip FDR filter and return everything
prior_count:float=0.125, n_jobs:int=1
): # gene_id, n_transcripts, group_1, group_2, stat, p_value, simes_p_value, adj_pval
Pseudobulk differential splicing test using edgePython’s diff_splice (QL F-test).
Counts are summed per (group_col, replicate_col), a quasi-likelihood GLM is fitted per comparison pair, and diff_splice tests whether each gene/transcript shows differential isoform usage beyond its overall expression change.
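The aggregation step described above can be sketched in plain Python. It covers only the summing of counts per (group, replicate) and the `min_cells` filter, not the GLM fit or the splicing test; the helper name `sketch_pseudobulk` and the list-based inputs are hypothetical.

```python
from collections import defaultdict


def sketch_pseudobulk(counts, groups, replicates, min_cells=10):
    """Sum per-cell counts into one profile per (group, replicate).

    counts: one row per cell, each a list of counts per transcript.
    Pseudobulk samples built from fewer than min_cells cells are dropped.
    """
    sums = {}
    n_cells = defaultdict(int)
    for row, group, rep in zip(counts, groups, replicates):
        key = (group, rep)
        n_cells[key] += 1
        if key not in sums:
            sums[key] = list(row)
        else:
            sums[key] = [a + b for a, b in zip(sums[key], row)]
    return {key: total for key, total in sums.items()
            if n_cells[key] >= min_cells}


# Four cells, two transcripts (made-up counts and labels).
toy_counts = [[1, 0], [2, 1], [0, 3], [4, 4]]
toy_groups = ["A", "A", "B", "B"]
toy_reps = ["r1", "r1", "r1", "r2"]

pseudobulk = sketch_pseudobulk(toy_counts, toy_groups, toy_reps, min_cells=2)
```

With `min_cells=2` only the ("A", "r1") sample survives, because both group B samples contain a single cell each.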