Cell Type Annotation
Overview
Automatically annotates cell types in single-cell RNA sequencing (scRNA-seq) data using pre-trained CellTypist models. The block takes a raw gene expression count matrix as input and assigns a cell type label to each cell based on learned gene expression patterns. It outputs a cell type annotation and a confidence score for every cell.
Pipeline context
This block follows scRNA-seq Preprocessing, which exports the raw count matrix it takes as input; it can also be placed after Dimensionality Reduction. Its outputs are typically consumed by downstream compositional analysis or differential expression blocks.
            Blocks                        Result pool
                                          ┌───────────
                ┌─────────────────────────┤
                │                         │
                v                         │
    ╔═══════════════════════════╗ exports │
    ║  scRNA-seq Preprocessing  ║────->─-─┤ Count Matrices, Gene & Cell Properties
    ╚═══════════════════════════╝         │ --------------------------------------
                                          │
                                          ├ [sampleId][cellId][geneId] -> raw & normalized counts
                ┌─────────────────────────┤
                │                         │
                v                         │
    ╔═══════════════════════════╗ exports │
    ║   Cell Type Annotation    ║────->─-─┤ Cell type labels & scores
    ╚═══════════════════════════╝         │ -------------------------
                                          │
                                          ├ [sampleId][cellId] -> cellType, confidenceScore
                ┌─────────────────────────┤
                │                         │
                v                         │
    ╔═══════════════════════╗             │
    ║  Downstream Analysis  ║             │
    ╚═══════════════════════╝             │
Core structure: axes and p-columns
The block consumes a raw count matrix and produces p-columns containing cell type annotations and confidence scores.
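CellTypist models are documented as expecting expression values that are log1p-transformed after scaling each cell to 10,000 total counts. Because this block receives raw counts, some normalization presumably happens before prediction. The sketch below illustrates that transformation for a single cell in plain Python. The function name `log_normalize`, the `target_sum` parameter, and the example gene counts are illustrative assumptions; the source does not describe how the block itself normalizes.

```python
import math


def log_normalize(counts: dict[str, int], target_sum: float = 10_000.0) -> dict[str, float]:
    """Scale one cell's raw counts to `target_sum` total, then apply log1p.

    Hypothetical helper: it mirrors the commonly used
    "normalize to 10,000 counts per cell + log1p" recipe, not the block's code.
    """
    total = sum(counts.values())
    if total == 0:
        # An empty cell has no signal; return zeros instead of dividing by zero.
        return {gene: 0.0 for gene in counts}
    scale = target_sum / total
    return {gene: math.log1p(count * scale) for gene, count in counts.items()}


# Example raw counts for one cell (made-up values).
cell_counts = {"CD3E": 3, "MS4A1": 0, "LYZ": 1}
normalized = log_normalize(cell_counts)
```

Genes with zero counts stay at zero after the transform, so sparsity in the count matrix is preserved.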
Primary axes
| Axis Name | Type | Description |
|---|---|---|
| pl7.app/sampleId | String | Uniquely identifies the sample. |
| pl7.app/cellId | String | Uniquely identifies a single cell within a sample. |
Input p-columns
The block requires a raw gene expression count matrix as input.
1. Raw Counts
- P-column name: pl7.app/rna-seq/countMatrix
- Description: The raw number of reads (or UMIs) for each gene in each cell.
- Requirement: Required.
- Specification: The input p-column must have the pl7.app/rna-seq/normalized domain key set to "false".
# --- Core Identity ---
name: pl7.app/rna-seq/countMatrix
valueType: Long
# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/sc/cellId
    type: String
  - name: pl7.app/geneId
    type: String
# --- Domain ---
domain:
  pl7.app/rna-seq/normalized: "false"
# --- Annotations ---
annotations:
  pl7.app/label: "Raw Count Matrix"
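The input requirement above can be expressed as a small check. The following is a minimal sketch that assumes the p-column spec is available as a plain Python dictionary with the same keys as the YAML specification. The function `is_raw_count_matrix` and the dictionary representation are hypothetical and are not part of any platform API.

```python
from typing import Any

# Values taken from the specification above.
REQUIRED_NAME = "pl7.app/rna-seq/countMatrix"
NORMALIZED_KEY = "pl7.app/rna-seq/normalized"


def is_raw_count_matrix(spec: dict[str, Any]) -> bool:
    """Return True if `spec` describes a raw (non-normalized) count matrix.

    Hypothetical helper operating on a dict form of the p-column spec.
    """
    if spec.get("name") != REQUIRED_NAME:
        return False
    # The domain value is the string "false", not a boolean.
    if spec.get("domain", {}).get(NORMALIZED_KEY) != "false":
        return False
    # The matrix is indexed by three axes: sample, cell, and gene.
    return len(spec.get("axesSpec", [])) == 3


raw_spec = {
    "name": "pl7.app/rna-seq/countMatrix",
    "valueType": "Long",
    "axesSpec": [
        {"name": "pl7.app/sampleId", "type": "String"},
        {"name": "pl7.app/sc/cellId", "type": "String"},
        {"name": "pl7.app/geneId", "type": "String"},
    ],
    "domain": {"pl7.app/rna-seq/normalized": "false"},
}

# Same column, but flagged as normalized: it should be rejected.
normalized_spec = {**raw_spec, "domain": {"pl7.app/rna-seq/normalized": "true"}}

raw_ok = is_raw_count_matrix(raw_spec)
normalized_ok = is_raw_count_matrix(normalized_spec)
```

Comparing against the string `"false"` rather than a boolean matters here, because domain values in the specification are quoted strings.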
Exported p-columns
The block exports a p-frame containing the cellType and cellTypeConfidenceScore p-columns that can be used by downstream blocks (e.g. Cluster Markers, Compositional Analysis).
1. Cell Type
- P-column name: pl7.app/rna-seq/cellType
- Description: The predicted cell type label for each cell.
- Requirement: Required.
- Specification:
# --- Core Identity ---
name: pl7.app/rna-seq/cellType
valueType: String
# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/cellId
    type: String
# --- Domain ---
domain:
  pl7.app/blockId: "..." # a unique identifier for the block run
# --- Annotations ---
annotations:
  pl7.app/label: "Cell type"
2. Confidence Score
- P-column name: pl7.app/rna-seq/cellTypeConfidenceScore
- Description: The confidence score associated with the cell type prediction.
- Requirement: Required.
- Specification:
# --- Core Identity ---
name: pl7.app/rna-seq/cellTypeConfidenceScore
valueType: Double
# --- Axes ---
axesSpec:
  - name: pl7.app/sampleId
    type: String
  - name: pl7.app/cellId
    type: String
# --- Domain ---
domain:
  pl7.app/blockId: "..." # a unique identifier for the block run
# --- Annotations ---
annotations:
  pl7.app/label: "Cell type confidence score"
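Both exported columns share the [sampleId][cellId] axes, so each cell has exactly one label and one score. As a rough illustration of how such a pair could be derived, the sketch below assumes the classifier yields per-class probabilities for each cell and takes the most probable class as the label and its probability as the confidence score. This derivation, the helper `label_and_score`, and the example data are assumptions for illustration; the source does not state how the block computes the score.

```python
def label_and_score(probabilities: dict[str, float]) -> tuple[str, float]:
    """Pick the most probable class and return it with its probability.

    Hypothetical helper; assumes `probabilities` is non-empty.
    """
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Made-up per-cell class probabilities keyed by (sampleId, cellId).
predictions = {
    ("S1", "AAAC"): {"T cell": 0.91, "B cell": 0.06, "Monocyte": 0.03},
    ("S1", "AAAG"): {"T cell": 0.10, "B cell": 0.15, "Monocyte": 0.75},
}

# Two column-like mappings that share the same (sampleId, cellId) key.
cell_type: dict[tuple[str, str], str] = {}
confidence: dict[tuple[str, str], float] = {}
for key, probs in predictions.items():
    cell_type[key], confidence[key] = label_and_score(probs)
```

Keeping the two mappings keyed identically mirrors the shared axes of the exported p-columns, which is what lets downstream blocks join them.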
Summary of exported p-columns
| P-Column Name | Description | Axes | Requirement |
|---|---|---|---|
| pl7.app/rna-seq/cellType | The predicted cell type label for each cell. | [sampleId][cellId] | Required |
| pl7.app/rna-seq/cellTypeConfidenceScore | The confidence score for the prediction. | [sampleId][cellId] | Required |
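A downstream block such as compositional analysis would typically join the two exported columns on their shared axes. The sketch below shows one possible way to do that in plain Python: it counts cells per type within each sample and skips cells whose confidence score falls below a cutoff. The function `composition`, the `min_score` default of 0.5, and the example values are hypothetical; the source does not prescribe a threshold or a filtering step.

```python
from collections import Counter, defaultdict


def composition(
    cell_type: dict[tuple[str, str], str],
    confidence: dict[tuple[str, str], float],
    min_score: float = 0.5,
) -> dict[str, dict[str, int]]:
    """Count cells per cell type in each sample, keeping confident calls only.

    Both inputs are keyed by (sampleId, cellId). Cells without a score are
    treated as having score 0.0 and are therefore dropped.
    """
    counts: dict[str, Counter] = defaultdict(Counter)
    for (sample_id, cell_id), label in cell_type.items():
        if confidence.get((sample_id, cell_id), 0.0) >= min_score:
            counts[sample_id][label] += 1
    return {sample: dict(counter) for sample, counter in counts.items()}


# Made-up annotations for two samples.
labels = {
    ("S1", "c1"): "T cell",
    ("S1", "c2"): "T cell",
    ("S1", "c3"): "B cell",
    ("S2", "c1"): "Monocyte",
}
scores = {
    ("S1", "c1"): 0.9,
    ("S1", "c2"): 0.8,
    ("S1", "c3"): 0.3,  # below the cutoff, so this cell is excluded
    ("S2", "c1"): 0.7,
}

per_sample = composition(labels, scores)
```

Setting `min_score` to 0.0 keeps every annotated cell, which may be preferable when the downstream analysis applies its own filtering.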