Cell Type Annotation

Overview

This block automatically annotates cell types in single-cell RNA sequencing (scRNA-seq) data using pre-trained CellTypist models. It takes a raw gene expression count matrix as input, assigns a cell type label to each cell based on learned gene expression patterns, and outputs the labels together with a confidence score for each cell.
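
As an illustration of how a label and a confidence score can be derived for each cell, the sketch below picks the highest-probability cell type from a per-cell probability table and reports that probability as the confidence. The probabilities, cell IDs, and the `annotate_cells` helper are all hypothetical. In the actual block this step is delegated to a pre-trained CellTypist model, and treating the winning probability as the confidence score is an assumption of this sketch.

```python
from typing import Dict, Tuple

# Hypothetical per-cell class probabilities, as a classifier might emit them.
# Keys are (sampleId, cellId); values map candidate cell types to probabilities.
CellKey = Tuple[str, str]
Probabilities = Dict[CellKey, Dict[str, float]]


def annotate_cells(probs: Probabilities) -> Dict[CellKey, Tuple[str, float]]:
    """Return (cellType, confidenceScore) per cell by taking the top-scoring label."""
    annotations: Dict[CellKey, Tuple[str, float]] = {}
    for cell_key, per_type in probs.items():
        # The label with the highest probability wins; its probability is
        # reused as the confidence score (an assumption of this sketch).
        best_type = max(per_type, key=per_type.get)
        annotations[cell_key] = (best_type, per_type[best_type])
    return annotations


# Made-up input for two cells of one sample.
example = {
    ("sample1", "AAAC"): {"T cell": 0.91, "B cell": 0.06, "Monocyte": 0.03},
    ("sample1", "AAAG"): {"T cell": 0.10, "B cell": 0.35, "Monocyte": 0.55},
}
result = annotate_cells(example)
```

The result is keyed by the same `(sampleId, cellId)` pair that indexes the exported p-columns, so each cell carries exactly one label and one score.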

Pipeline context

This block takes the raw count matrix exported by scRNA-seq Preprocessing as input and can run after Dimensionality Reduction. Its outputs are often used in downstream compositional analysis or differential expression blocks.

 Blocks                                    Result pool
                                         ┌───────────
              ┌──────────────────────────┤
              │                          │
              v                          │
╔═══════════════════════════╗  exports   │
║  scRNA-seq Preprocessing  ║────->──────┤ Count Matrices, Gene & Cell Properties
╚═══════════════════════════╝            │ --------------------------------------
                                         ├ [sampleId][cellId][geneId] -> raw & normalized counts
              ┌──────────────────────────┤
              │                          │
              v                          │
╔═══════════════════════════╗  exports   │
║   Cell Type Annotation    ║────->──────┤ Cell type labels & scores
╚═══════════════════════════╝            │ -------------------------
                                         ├ [sampleId][cellId] -> cellType, confidenceScore
              ┌──────────────────────────┤
              │                          │
              v                          │
  ╔═══════════════════════╗              │
  ║  Downstream Analysis  ║              │
  ╚═══════════════════════╝              │

Core structure: axes and p-columns

The block consumes a raw count matrix and produces p-columns containing cell type annotations and confidence scores.

Primary axes

Axis Name         Type    Description
----------------  ------  --------------------------------------------------
pl7.app/sampleId  String  Uniquely identifies the sample.
pl7.app/cellId    String  Uniquely identifies a single cell within a sample.

Input p-columns

The block requires a raw gene expression count matrix as input.

1. Raw Counts

  • P-column name: pl7.app/rna-seq/countMatrix
  • Description: The raw number of reads (or UMIs) for each gene in each cell.
  • Requirement: Required.
  • Specification: The input p-column must have the pl7.app/rna-seq/normalized domain key set to "false".

# --- Core Identity ---
name: pl7.app/rna-seq/countMatrix
valueType: Long

# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/sc/cellId
    type: String
  - name: pl7.app/geneId
    type: String

# --- Domain ---
domain:
  pl7.app/rna-seq/normalized: "false"

# --- Annotations ---
annotations:
  pl7.app/label: "Raw Count Matrix"
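
The input requirements above can be checked programmatically. The following is a minimal sketch that treats a p-column spec as a plain Python dictionary mirroring the YAML fields. The `is_raw_count_matrix` helper is hypothetical and not part of any Platforma SDK, and the sketch assumes that the axes must appear in the listed order.

```python
from typing import Any, Dict

# Axis names copied from the specification above; order is assumed to matter.
REQUIRED_AXES = ["pl7.app/sampleId", "pl7.app/sc/cellId", "pl7.app/geneId"]


def is_raw_count_matrix(spec: Dict[str, Any]) -> bool:
    """Check a p-column spec (as a plain dict) against the input requirements."""
    axis_names = [axis.get("name") for axis in spec.get("axesSpec", [])]
    return (
        spec.get("name") == "pl7.app/rna-seq/countMatrix"
        and spec.get("valueType") == "Long"
        and axis_names == REQUIRED_AXES
        # The domain value is the string "false", not a boolean.
        and spec.get("domain", {}).get("pl7.app/rna-seq/normalized") == "false"
    )


raw_spec = {
    "name": "pl7.app/rna-seq/countMatrix",
    "valueType": "Long",
    "axesSpec": [
        {"name": "pl7.app/sampleId", "type": "String"},
        {"name": "pl7.app/sc/cellId", "type": "String"},
        {"name": "pl7.app/geneId", "type": "String"},
    ],
    "domain": {"pl7.app/rna-seq/normalized": "false"},
    "annotations": {"pl7.app/label": "Raw Count Matrix"},
}

# A made-up normalized variant that the check should reject.
normalized_spec = {
    **raw_spec,
    "valueType": "Double",
    "domain": {"pl7.app/rna-seq/normalized": "true"},
}

accepts_raw = is_raw_count_matrix(raw_spec)
accepts_normalized = is_raw_count_matrix(normalized_spec)
```

Comparing the domain value against the string `"false"` rather than a boolean mirrors the quoting used in the specification.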

Exported p-columns

The block exports a p-frame containing the cellType and cellTypeConfidenceScore p-columns that can be used by downstream blocks (e.g. Cluster Markers, Compositional Analysis).

1. Cell Type

  • P-column name: pl7.app/rna-seq/cellType
  • Description: The predicted cell type label for each cell.
  • Requirement: Required.
  • Specification:

# --- Core Identity ---
name: pl7.app/rna-seq/cellType
valueType: String

# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/cellId
    type: String

# --- Domain ---
domain:
  pl7.app/blockId: "..." # a unique identifier for the block run

# --- Annotations ---
annotations:
  pl7.app/label: "Cell type"

2. Confidence Score

  • P-column name: pl7.app/rna-seq/cellTypeConfidenceScore
  • Description: The confidence score associated with the cell type prediction.
  • Requirement: Required.
  • Specification:

# --- Core Identity ---
name: pl7.app/rna-seq/cellTypeConfidenceScore
valueType: Double

# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/cellId
    type: String

# --- Domain ---
domain:
  pl7.app/blockId: "..." # a unique identifier for the block run

# --- Annotations ---
annotations:
  pl7.app/label: "Cell type confidence score"

Summary of exported p-columns

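P-Column Name                            Description                                   Axes                Requirement
---------------------------------------  --------------------------------------------  ------------------  -----------
pl7.app/rna-seq/cellType                 The predicted cell type label for each cell.  [sampleId][cellId]  Required
pl7.app/rna-seq/cellTypeConfidenceScore  The confidence score for the prediction.      [sampleId][cellId]  Required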
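
Because both exported p-columns share the [sampleId][cellId] axes, a downstream step can join them on that key. The sketch below counts cell types per sample while skipping low-confidence calls, roughly what a compositional analysis might start from. The data, the `cell_type_counts` helper, and the 0.5 cutoff are hypothetical; the block itself does not define a threshold.

```python
from collections import Counter, defaultdict
from typing import Dict, Tuple

CellKey = Tuple[str, str]  # (sampleId, cellId)


def cell_type_counts(
    cell_type: Dict[CellKey, str],
    confidence: Dict[CellKey, float],
    min_confidence: float = 0.5,  # hypothetical cutoff, not defined by the block
) -> Dict[str, Counter]:
    """Count confident cell type calls per sample."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for key, label in cell_type.items():
        # Both columns share the same axes, so the same key indexes both.
        if confidence.get(key, 0.0) >= min_confidence:
            sample_id, _cell_id = key
            counts[sample_id][label] += 1
    return dict(counts)


# Made-up values standing in for the two exported p-columns.
cell_type = {
    ("s1", "c1"): "T cell",
    ("s1", "c2"): "T cell",
    ("s1", "c3"): "B cell",
    ("s2", "c1"): "Monocyte",
}
confidence = {
    ("s1", "c1"): 0.95,
    ("s1", "c2"): 0.40,  # below the cutoff, so this call is skipped
    ("s1", "c3"): 0.80,
    ("s2", "c1"): 0.70,
}
counts = cell_type_counts(cell_type, confidence)
```

Cells missing from the confidence column are treated as having a score of 0.0 and are therefore dropped, which is a conservative choice for this sketch.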