Using p-columns in a Workflow

The workflow, written in a Tengo-based scripting language, is where the core data processing logic of a block is described. It defines input data requirements, executes bioinformatics tools, processes the results, and exports new p-columns and p-frames for use by the UI or other blocks.

The Platforma SDK provides a rich set of libraries for all of these tasks.

A Typical Bioinformatics Workflow

Most downstream analysis blocks follow a similar data processing pipeline, orchestrated by the workflow.

This guide walks through each stage of that pipeline in turn.

Workflow Execution Model

A workflow is a Tengo script that runs on the Platforma backend. For complex operations, the logic can be split across multiple script files (*.tpl.tengo). One script can execute another by calling render.create(...), passing the output of one stage as the input to the next. This creates a chain of execution.
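As a rough illustration of this chaining, the sketch below shows one script executing another. This is a minimal sketch, not a verified API reference: the template name ":my-second-stage", the variable firstStageResult, and the exact shapes of assets.importTemplate, render.create, and the .output(...) accessor are assumptions that may differ between SDK versions.

```tengo
render := import("@platforma-sdk/workflow-tengo:render")
assets := import("@platforma-sdk/workflow-tengo:assets")

// Load a second-stage template shipped with this block (name is hypothetical).
stageTpl := assets.importTemplate(":my-second-stage")

// Execute the template, passing the current stage's result as its input.
stageRun := render.create(stageTpl, {
  inputData: firstStageResult
})

// Read an output of the rendered template to feed into the next stage.
stageOutput := stageRun.output("result")
```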

A common first step in this chain is resolving all the necessary input data. To simplify this, the workflow library provides a convenience utility: wf.prepare.

Instead of creating a separate template just to resolve inputs, you can use wf.prepare to declare your data dependencies upfront. The Platforma engine ensures that all the p-columns you query for are available before executing the main wf.body function. The resolved data is then passed as an argument to wf.body.
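Putting wf.prepare and wf.body together, a workflow script has the following overall shape (the bundle query and return values here are placeholders, filled in by the steps below):

```tengo
wf := import("@platforma-sdk/workflow-tengo:workflow")

wf.prepare(func(args) {
  // Declare the p-columns this workflow needs; the engine
  // resolves them before wf.body is executed.
  return { bundle: bundleQuery } // placeholder query
})

wf.body(func(args) {
  // args.bundle now holds the resolved data.
  // ... run tools, build p-frames ...
  return { outputs: {}, exports: {} }
})
```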

Step-by-Step Workflow Implementation

Essential Libraries

For p-frame operations, these are the most common libraries to import:

// It's standard practice to import all necessary libraries at the top of the script.
wf := import("@platforma-sdk/workflow-tengo:workflow") // Core workflow functions (prepare, body).
exec := import("@platforma-sdk/workflow-tengo:exec") // For running external tools.
pframes := import("@platforma-sdk/workflow-tengo:pframes") // To build and manipulate p-frames.
pSpec := import("@platforma-sdk/workflow-tengo:pframes.spec") // To create/manipulate p-column specifications.
xsv := import("@platforma-sdk/workflow-tengo:pframes.xsv") // To convert between p-frames and XSV (CSV/TSV) files.
assets := import("@platforma-sdk/workflow-tengo:assets") // To import assets like templates or software.

1. Querying Required p-columns

We use the wf.prepare step and a p-column bundle builder to declare our data requirements. A bundle allows you to query for a set of related p-columns, often "anchored" to a primary p-column selected by the user.

// from workflow/src/main.tpl.tengo
wf.prepare(func(args) {
  // args contains the arguments selected by the user in the UI.
  // Let's assume args.datasetRef is a PlRef to an anchor p-column.

  bundleBuilder := wf.createPBundleBuilder()

  // 1. Set the anchor. This establishes the context for the query.
  bundleBuilder.addAnchor("main", args.datasetRef)

  // 2. Request other p-columns relative to the anchor.
  bundleBuilder.addSingle({
    axes: [ { anchor: "main", idx: 0 }, { anchor: "main", idx: 1 } ],
    annotations: {"pl7.app/isAbundance": "true"}
  }, "abundanceColumn") // The handle to access this column later.

  // The returned map tells the system which p-columns to resolve.
  // We name the resolved resource 'bundle'.
  return { bundle: bundleBuilder.build() }
})

2. Converting p-frames to XSV

Many bioinformatics tools require flat files (like CSV or TSV) as input. Instead of exporting an entire p-frame, you can use an XSV file builder such as pframes.tsvFileBuilder() to construct a custom XSV file. This gives you full control over the output columns, including the names of the axis columns.

wf.body(func(args) {
  // args.bundle is a p-frame containing the resolved p-columns.

  // 1. Create a file builder for a TSV file.
  tableBuilder := pframes.tsvFileBuilder()

  // 2. Set custom headers for the axes that will be included in the output.
  // We get the axis names from the spec of the anchor column ("main").
  anchorSpec := args.bundle.getSpec("main")
  tableBuilder.setAxisHeader(anchorSpec.axesSpec[0].name, "SampleID")
  tableBuilder.setAxisHeader(anchorSpec.axesSpec[1].name, "CloneID")

  // 3. Add the data column. Its axes will be added implicitly.
  // We also provide a new header for this specific data column.
  tableBuilder.add(args.bundle.getColumn("abundanceColumn"), { header: "Abundance" })

  // 4. Build the TSV. The output will have columns: SampleID, CloneID, Abundance.
  toolInputFile := tableBuilder.build()
  // ...
})

3. Running the Bioinformatics Tool

With the input file ready, you can use the exec library to run your tool.

wf.body(func(args) {
  // ... toolInputFile (a TSV) from the previous step ...

  toolRun := exec.builder().
    software(assets.importSoftware("my-org/my-tool:1.0.0")).
    addFile("input.tsv", toolInputFile).
    arg("--input").arg("input.tsv").
    arg("--output").arg("output.tsv").
    saveFile("output.tsv").
    run()

  outputFile := toolRun.getFile("output.tsv")
  // ...
})

4. Parsing Tool Output into a p-frame

After the tool runs, you'll have an output file that needs to be brought back into the Platforma ecosystem. The xsv.importFile utility converts a tabular file into a p-frame.

wf.body(func(args) {
  // ... outputFile from the previous step ...
  // Get the spec of an axis from our input so we can reuse it.
  inputAxisSpec := args.bundle.getSpec("main").axesSpec[0]

  // Import the TSV file.
  // We use `splitDataAndSpec: true` to get a nested map of p-columns
  // ({colName: {spec: ..., data: ...}}), which is easier to iterate over
  // in the next step.
  resultsPf := xsv.importFile(outputFile, "tsv", {
    splitDataAndSpec: true, // Returns a nested map instead of a flat one.
    axes: [
      // Map a column from the TSV to an axis, reusing a spec from our input.
      { column: "sample_id_col_in_tsv", spec: inputAxisSpec },
      // Map another column from the TSV to a new axis.
      { column: "cluster_col_in_tsv", spec: { name: "my.domain/clusterId", type: "String" } }
    ],
    columns: [
      // Map a column from the TSV to a new p-column.
      {
        column: "value_col_in_tsv",
        spec: { name: "my.domain/myValue", valueType: "Double" }
      }
    ]
  })
  // resultsPf is now a map of p-column objects.
  // ...
})

5. Building and Modifying p-frames

The pframes.pFrameBuilder() is a powerful tool for constructing p-frames programmatically or modifying existing ones. A common use case is to add provenance information (a "trace") to columns that have just been imported from a tool's output.

The pl7.app/trace annotation is important for tracking data lineage, which is crucial for debugging and ensuring reproducibility.

wf.body(func(args) {
  // ... resultsPf is the nested map of p-columns from the previous step ...
  anchorSpec := args.bundle.getSpec("main")

  // 1. Create a trace object that links our new data to an input p-column.
  trace := pSpec.makeTrace(anchorSpec, {
    type: "my-org.my-block",
    importance: 30,
    label: "My Block Processing"
  })

  // 2. Use a p-frame builder to create a new p-frame with the trace injected.
  tracedResultsPfBuilder := pframes.pFrameBuilder()
  for colName, col in resultsPf {
    // For each p-column in our map, add it to the builder
    // and inject the trace into its spec.
    tracedResultsPfBuilder.add(colName, trace.inject(col.spec), col.data)
  }
  finalPf := tracedResultsPfBuilder.build()

  // ...
})

The pSpec.makeTrace function takes two arguments: the spec of the parent p-column and a map of options describing the operation:

  • type: A string identifying the type of operation, usually the block's package name (e.g. my-org.my-block).
  • label: A human-readable string describing the processing step.
  • importance: A number that helps Platforma decide which trace step is most significant when automatically generating labels in the UI. Higher numbers give the step more priority to appear in labels.
  • id: An optional unique ID for the operation, often the block instance ID.
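For illustration, a trace step that uses all four options might look like the following sketch, where blockId is a hypothetical variable holding the block instance ID:

```tengo
trace := pSpec.makeTrace(anchorSpec, {
  type: "my-org.my-block",
  label: "My Block Processing",
  importance: 30,
  id: blockId // hypothetical: the block instance ID
})
```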

In the Platforma UI, the pl7.app/trace annotation is used to automatically generate informative column labels when multiple columns share the same pl7.app/label. Unique labels are constructed by finding the smallest set of trace step labels that makes all available column labels unique, prioritizing steps by their importance value. When creating trace steps, consider how you want your block to appear in these labels, and choose labels that help distinguish your columns from others with similar processing.

6. Exporting p-frames

Finally, the p-frames you've created must be returned from wf.body. There are two destinations: outputs and exports.

  • outputs: P-frames in this map are only for the block's own UI. The data must be made accessible to the client.
  • exports: P-frames in this map are made available to the Result Pool for downstream blocks to use.

wf.body(func(args) {
  // ... finalPf is the p-frame we want to export ...
  return {
    outputs: {
      // Use pframes.exportFrame to move the p-frame data to client-accessible storage.
      tableData: pframes.exportFrame(finalPf)
    },
    exports: {
      // No exportFrame needed here; the transfer happens later, on demand.
      processedPFrame: finalPf
    }
  }
})

The pframes.exportFrame method is crucial for outputs. It instructs the backend to move the underlying p-frame data from temporary working storage to the main storage, from which the Platforma Desktop client can access it. This is not needed for exports, as the data transfer can happen asynchronously when a downstream block requests it.