Cell Type Annotation

Overview

Automatically annotates cell types in single-cell RNA sequencing data using pre-trained CellTypist models. This block takes a raw gene expression count matrix as input and assigns a cell type label to each cell based on learned gene expression patterns. As output, it generates a cell type annotation and a confidence score for each cell.
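
To illustrate what a per-cell label with a confidence score looks like, the sketch below picks the most probable cell type from a table of per-class probabilities. It is a minimal sketch, not the block's implementation: it assumes the confidence score is simply the highest per-class probability, and the function name `annotate_cell` and the example probabilities are hypothetical.

```python
from typing import Dict, Tuple


def annotate_cell(probabilities: Dict[str, float]) -> Tuple[str, float]:
    """Return (cell type, confidence) for one cell.

    Assumption: the label is the class with the highest probability and
    the confidence score is that probability.
    """
    if not probabilities:
        raise ValueError("probabilities must not be empty")
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Hypothetical per-cell class probabilities keyed by (sampleId, cellId).
cells = {
    ("S1", "AAAC"): {"T cell": 0.91, "B cell": 0.05, "Monocyte": 0.04},
    ("S1", "AAAG"): {"T cell": 0.10, "B cell": 0.72, "Monocyte": 0.18},
}

# One (label, score) pair per cell, mirroring the two exported values.
annotations = {key: annotate_cell(p) for key, p in cells.items()}
for (sample_id, cell_id), (cell_type, score) in annotations.items():
    print(sample_id, cell_id, cell_type, score)
```

Each cell ends up with exactly one label and one score, which is the shape of the annotations described in this overview.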

Pipeline context

This block typically follows scRNA-seq Preprocessing and takes the raw count matrix that block exports as input. Its outputs are often used by downstream compositional analysis or differential expression blocks.

 Blocks                                 Result pool
                                       ┌───────────
             ┌─────────────────────────┤
             │                         │
             v                         │
 ╔═══════════════════════════╗ exports │
 ║  scRNA-seq Preprocessing  ║─────>───┤ Count Matrices, Gene & Cell Properties
 ╚═══════════════════════════╝         │ --------------------------------------
                                       │
                                       ├ [sampleId][cellId][geneId] -> raw & normalized counts
             ┌─────────────────────────┤
             │                         │
             v                         │
 ╔═══════════════════════════╗ exports │
 ║   Cell Type Annotation    ║─────>───┤ Cell type labels & scores
 ╚═══════════════════════════╝         │ -------------------------
                                       │
                                       ├ [sampleId][cellId] -> cellType, confidenceScore
             ┌─────────────────────────┤
             │                         │
             v                         │
 ╔═══════════════════════╗             │
 ║  Downstream Analysis  ║             │
 ╚═══════════════════════╝             │

Core structure: axes and p-columns

The block consumes a raw count matrix and produces p-columns containing cell type annotations and confidence scores.

Primary axes

Axis Name	Type	Description
`pl7.app/sampleId`	`String`	Uniquely identifies the sample.
`pl7.app/cellId`	`String`	Uniquely identifies a single cell within a sample.

Input p-columns

The block requires a raw gene expression count matrix as input.

1. Raw Counts

P-column name: pl7.app/rna-seq/countMatrix
Description: The raw number of reads (or UMIs) for each gene in each cell.
Requirement: Required.
Specification: The input p-column must have the pl7.app/rna-seq/normalized domain key set to "false".

# --- Core Identity ---
name: pl7.app/rna-seq/countMatrix
valueType: Long

# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/sc/cellId
    type: String
  - name: pl7.app/geneId
    type: String

# --- Domain ---
domain:
  pl7.app/rna-seq/normalized: "false"

# --- Annotations ---
annotations:
  pl7.app/label: "Raw Count Matrix"
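
The input requirement above can be read as a simple predicate over a column specification. The sketch below checks a spec, represented as a plain Python dictionary, against the required name and domain value. This is only an illustration and not the platform's matching API: the helper `is_raw_count_matrix` is hypothetical, and the `"true"` domain value used for the counter-example is an assumption.

```python
from typing import Any, Dict


def is_raw_count_matrix(spec: Dict[str, Any]) -> bool:
    """Check a p-column spec against the input requirements listed above."""
    domain = spec.get("domain", {})
    return (
        spec.get("name") == "pl7.app/rna-seq/countMatrix"
        and domain.get("pl7.app/rna-seq/normalized") == "false"
    )


# Spec matching the requirements in this section.
raw_spec = {
    "name": "pl7.app/rna-seq/countMatrix",
    "valueType": "Long",
    "domain": {"pl7.app/rna-seq/normalized": "false"},
}

# Same column name, but flagged as normalized (assumed value "true").
normalized_spec = {**raw_spec, "domain": {"pl7.app/rna-seq/normalized": "true"}}

print(is_raw_count_matrix(raw_spec))         # True
print(is_raw_count_matrix(normalized_spec))  # False
```

Note that the domain value is compared as the string "false", matching how it is written in the specification, rather than as a boolean.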

Exported p-columns

The block exports a p-frame containing the cellType and cellTypeConfidenceScore p-columns that can be used by downstream blocks (e.g. Cluster Markers, Compositional Analysis).

1. Cell Type

P-column name: pl7.app/rna-seq/cellType
Description: The predicted cell type label for each cell.
Requirement: Required.
Specification:

# --- Core Identity ---
name: pl7.app/rna-seq/cellType
valueType: String

# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/cellId
    type: String

# --- Domain ---
domain:
  pl7.app/blockId: "..." # a unique identifier for the block run

# --- Annotations ---
annotations:
  pl7.app/label: "Cell type"

2. Confidence Score

P-column name: pl7.app/rna-seq/cellTypeConfidenceScore
Description: The confidence score associated with the cell type prediction.
Requirement: Required.
Specification:

# --- Core Identity ---
name: pl7.app/rna-seq/cellTypeConfidenceScore
valueType: Double

# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/cellId
    type: String

# --- Domain ---
domain:
  pl7.app/blockId: "..." # a unique identifier for the block run

# --- Annotations ---
annotations:
  pl7.app/label: "Cell type confidence score"

Summary of exported p-columns

P-column Name	Description	Axes	Requirement
`pl7.app/rna-seq/cellType`	The predicted cell type label for each cell.	`[sampleId][cellId]`	Required
`pl7.app/rna-seq/cellTypeConfidenceScore`	The confidence score for the prediction.	`[sampleId][cellId]`	Required
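
Because both exported p-columns share the `[sampleId][cellId]` axes, a downstream step can look up the label and the score of a cell with the same key. The sketch below counts cells per cell type within each sample, skipping low-confidence predictions. It is a hedged illustration only: the data, the `composition` helper, and the 0.5 threshold are hypothetical and not part of the block.

```python
from collections import Counter, defaultdict

# Hypothetical exported values keyed by (sampleId, cellId).
cell_type = {
    ("S1", "c1"): "T cell",
    ("S1", "c2"): "B cell",
    ("S1", "c3"): "T cell",
    ("S2", "c1"): "Monocyte",
}
confidence = {
    ("S1", "c1"): 0.95,
    ("S1", "c2"): 0.40,
    ("S1", "c3"): 0.80,
    ("S2", "c1"): 0.99,
}


def composition(cell_type, confidence, min_score=0.5):
    """Count confident cells per cell type within each sample."""
    counts = defaultdict(Counter)
    for (sample_id, cell_id), label in cell_type.items():
        # Both columns share the same axes, so one key indexes both.
        if confidence.get((sample_id, cell_id), 0.0) >= min_score:
            counts[sample_id][label] += 1
    return {sample: dict(c) for sample, c in counts.items()}


result = composition(cell_type, confidence)
print(result)
```

The threshold is arbitrary here; a real analysis would choose it based on the model and the data.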
