Readers
get_resource_path
def get_resource_path(
    filename
):
Find the correct path to the resources/ directory based on execution context.
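How `get_resource_path` resolves the directory is not described here. As a rough illustration only, one common approach is to walk upward from the current working directory until a `resources/` folder containing the file is found. The helper name `find_resource` and its `start` and `dirname` parameters are hypothetical and not part of this library.

```python
from pathlib import Path


def find_resource(filename, start=None, dirname="resources"):
    """Hypothetical helper: search `start` and its parents for dirname/filename."""
    start = Path(start) if start is not None else Path.cwd()
    # Check the starting directory first, then each ancestor in turn.
    for base in (start, *start.parents):
        candidate = base / dirname / filename
        if candidate.exists():
            return candidate
    raise FileNotFoundError(
        f"{filename!r} not found in any {dirname}/ directory at or above {start}"
    )
```

With this sketch, the same call works whether the code runs from the project root or from a nested notebook directory, which is one plausible reading of "based on execution context".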
download_test_data
def download_test_data(
    url:str='https://ftp.ncbi.nlm.nih.gov/geo/samples/GSM3748nnn/GSM3748087/suppl/GSM3748087%5F190c.isoforms.matrix.txt.gz', # URL to download the data from.
    output_filename:str=None, # Name of the file to save the data as (default: name from the URL).
    decompress:bool=True, # Whether to decompress the file if it is a gzip archive (default True).
)->str: # Path to the downloaded or decompressed file.
Download test data into the correct directory for the current execution context. When decompress is True and the downloaded file is a gzip archive, the file is also decompressed.
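The decompression step can be sketched with the standard library. This is an illustration under assumptions, not the function's actual implementation: the helper name `maybe_decompress` is hypothetical, and detecting gzip by its two magic bytes (rather than by file extension) is a choice made for this sketch.

```python
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(path):
    """Hypothetical helper: gunzip `path` if it looks like a gzip archive."""
    path = Path(path)
    with path.open("rb") as fh:
        is_gzip = fh.read(2) == GZIP_MAGIC
    if not is_gzip:
        return path  # nothing to do, return the file unchanged
    # Drop a trailing ".gz"; otherwise append ".out" so the input is not overwritten.
    if path.suffix == ".gz":
        target = path.with_suffix("")
    else:
        target = path.with_name(path.name + ".out")
    with gzip.open(path, "rb") as src, target.open("wb") as dst:
        shutil.copyfileobj(src, dst)  # stream, so large matrices are not held in memory
    return target
```

Returning the path of whichever file is usable afterwards mirrors the documented return value ("Path to the downloaded or decompressed file").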
iso_concat
def iso_concat(
    data_inputs, batch_info:NoneType=None, batch_type:str='path'
):
Concatenates a list of AnnData objects, or paths to AnnData objects, on the union of their transcriptIds, while preserving geneId information, which may be non-unique per transcriptId. Missing values are filled with zeros. Adds a batch column to .obs derived from the file path, the obs_names, or a numeric sequence.
Parameters:
data_inputs (list of str or AnnData): List of paths to AnnData objects or AnnData objects to concatenate.
batch_info (list of str, optional): List of batch identifiers for each AnnData object in data_inputs. If not provided, batch identifiers are extracted from file paths, obs_names, or a numeric sequence.
batch_type (str, optional): Specifies which type of batch information to use. One of ['path', 'obs_names', 'numeric']. Defaults to 'path'.
Returns: AnnData: A single concatenated AnnData object with harmonized features, geneId annotations, and batch info.
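The union-and-zero-fill behaviour can be shown on plain Python data. This is a simplified sketch, not the code behind `iso_concat`: the function name `concat_on_union` and the input layout (a dict of batch label to transcript IDs plus per-cell counts) are assumptions made for the example, and geneId handling is left out.

```python
def concat_on_union(batches):
    """Hypothetical sketch of an outer join on transcript IDs.

    batches: {label: (transcript_ids, {cell: [counts aligned with transcript_ids]})}
    Returns (union_of_transcript_ids, [(cell, batch_label), ...], matrix).
    """
    # Build the union of transcript IDs, keeping first-seen order.
    union, seen = [], set()
    for transcript_ids, _ in batches.values():
        for t in transcript_ids:
            if t not in seen:
                seen.add(t)
                union.append(t)
    column = {t: j for j, t in enumerate(union)}

    matrix, obs = [], []
    for label, (transcript_ids, cells) in batches.items():
        for cell, counts in cells.items():
            row = [0] * len(union)  # transcripts absent from this batch stay 0
            for t, value in zip(transcript_ids, counts):
                row[column[t]] = value
            matrix.append(row)
            obs.append((cell, label))  # record which batch each cell came from
    return union, obs, matrix
```

For AnnData objects, a comparable outer join is available through `anndata.concat` with `join="outer"` and `fill_value=0`; whether `iso_concat` relies on it is not stated in this reference.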
API reference
read_sicelore_isomatrix
def read_sicelore_isomatrix(
    file_path:str, # Path to the isomatrix file (tab-delimited).
    gene_id_label:str='geneId', # Row/column label used for gene IDs (default "geneId").
    transcript_id_label:str='transcriptId', # Row/column label used for transcript IDs (default "transcriptId").
    remove_undef:bool=True, # Whether to remove rows with transcriptId="undef" (default True).
    sparse:bool=False, # Whether to store the matrix in sparse format (default False).
)->AnnData: # An AnnData object containing numeric data in `.X` and metadata in `.var`.
Read a SiCeLoRe isomatrix file (tab-delimited) and convert it into a scanpy-compatible AnnData object.
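To make the parsing step concrete, here is a minimal standard-library sketch of reading such a table. It is not the library's implementation: the function name `parse_isomatrix` is hypothetical, it returns plain lists instead of an AnnData object, and it assumes every column other than the two ID columns is a cell barcode (real files may carry extra metadata columns that would need separate handling).

```python
import csv


def parse_isomatrix(handle, gene_id_label="geneId",
                    transcript_id_label="transcriptId", remove_undef=True):
    """Hypothetical sketch: parse a tab-delimited isoform-by-cell table."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    gene_col = header.index(gene_id_label)
    tx_col = header.index(transcript_id_label)
    # Assumption: all remaining columns are cell barcodes.
    cell_cols = [i for i in range(len(header)) if i not in (gene_col, tx_col)]

    var, per_transcript = [], []
    for row in reader:
        if remove_undef and row[tx_col] == "undef":
            continue  # skip rows without an assigned transcript
        var.append({"geneId": row[gene_col], "transcriptId": row[tx_col]})
        per_transcript.append([float(row[i]) for i in cell_cols])

    # Transpose so rows are cells and columns are transcripts (the AnnData convention).
    X = [list(cell) for cell in zip(*per_transcript)]
    barcodes = [header[i] for i in cell_cols]
    return barcodes, var, X
```

The transpose matters because the input stores one transcript per row, while AnnData expects observations (cells) as rows of `.X` and features (transcripts) described in `.var`.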
process_mouse_data
def process_mouse_data(
):
Downloads test data, reads two mouse isoform count matrices, and merges them into a single AnnData object. It also reads a CSV file containing barcode-to-cell_type mappings, merges this information into the AnnData object’s obs DataFrame, and filters out entries with no cell_type assigned.
Returns: combined_mouse_data (AnnData): The merged and annotated AnnData object.
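The annotation and filtering step can be illustrated on its own. This sketch is not the body of `process_mouse_data`: the function name `annotate_and_filter` and the CSV column names `barcode` and `cell_type` are assumptions, and barcodes are handled as a plain list rather than an `.obs` DataFrame.

```python
import csv


def annotate_and_filter(barcodes, mapping_csv):
    """Hypothetical sketch: attach cell types to barcodes, drop unassigned ones.

    mapping_csv: open text handle of a CSV with `barcode` and `cell_type` columns
    (column names are assumed for this example).
    """
    lookup = {row["barcode"]: row["cell_type"] for row in csv.DictReader(mapping_csv)}
    # Left join: every barcode is kept at first, with "" when no cell type is known.
    annotated = [(barcode, lookup.get(barcode, "")) for barcode in barcodes]
    # Filter out barcodes that are missing from the mapping or have an empty cell type.
    return [(barcode, cell_type) for barcode, cell_type in annotated if cell_type]
```

Joining first and filtering second keeps the two concerns separate, so the number of dropped cells can be checked by comparing the lengths of `barcodes` and the returned list.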