Run Software in the Workflow

In our playground blocks, we execute various commands from the NCBI BLAST software suite, such as makeblastdb, blastn, and blastp. Within the workflow, this is accomplished using the exec library, which translates execution commands from the workflow into instructions for the Runner Controller.

Below, we focus on the Make BLAST Database block workflow to explore the exec library in detail.

See complete workflow

workflow/src/main.tpl.tengo
// makeblastdb workflow

wf := import("@platforma-sdk/workflow-tengo:workflow")
exec := import("@platforma-sdk/workflow-tengo:exec")
file := import("@platforma-sdk/workflow-tengo:file")
assets := import("@platforma-sdk/workflow-tengo:assets")
times := import("times")

// import makeblastdb from the BLAST software package
blastSw := assets.importSoftware("@platforma-open/milaboratories.software-blast:makeblastdb")

wf.body(func(args) {

  // import fasta file
  fImport := file.importFile(args.fastaFile)
  
  // build exec command
  makedbCmd := exec.builder().
    software(blastSw).
    arg("-in").arg("input.fasta").      
    arg("-parse_seqids").
    arg("-out").arg("db").
    arg("-dbtype").arg(args.dataType).
    arg("-blastdb_version").arg("5").
    addFile("input.fasta", fImport.file).
    saveFileSet("db", "^db.*").
    printErrStreamToStdout().
    cache(48 * times.hour).
    run()

  dbFiles := makedbCmd.getFileSet("db")
        
  return {
    outputs: {
      fastaHandle: fImport.handle,
      log: makedbCmd.getStdoutStream()
    },
    exports: {
      db: {
        data: dbFiles,
        spec: {
          kind: "fileSet",
          annotations: {
            "pl7.app/type": "blastDB",
            "pl7.app/label": args.title,
            "pl7.app/blast/alphabetType": args.dataType,
            "pl7.app/blast/dbTitle": args.title
          }
        }
      }
    }
  }
})

Exec builder

The exec library provides a single method, builder(), which returns a builder object that allows you to construct a command for execution. The builder's run() method returns the resulting instance of the exec resource.

The first step is to specify the software you intend to run. The builder offers two methods for this:

software(softwareAsset): Use the software available in the package registry configured for the backend.
cmd(localCommand): Directly use a command available in the PATH of the host running the backend.

Platforma's software packages provide a reproducible and reliable way of distributing and using software. Local commands with cmd are typically used only for development purposes.

Using software packages

A software package is a set of executable resources (binaries or container images) for specific software stored in Platforma's package registries. The software executable asset is imported into the workflow using the assets library:

// import assets library
assets := import("@platforma-sdk/workflow-tengo:assets")

// import makeblastdb from the BLAST software package
blastSw := assets.importSoftware("@platforma-open/milaboratories.software-blast:makeblastdb")

The BLAST software package has multiple entry points. To select a specific one (in this case, makeblastdb), append a colon and the entry point name to the package name.

Additionally, specify the dependency on the BLAST software package in the pnpm-workspace.yaml and the workflow's package.json:

pnpm-workspace.yaml
catalog:
  "@platforma-open/milaboratories.software-blast": ^1.0.0

workflow/package.json
  "devDependencies": {
    "@platforma-open/milaboratories.software-blast": "catalog:"
  }

Under the hood, Platforma downloads the software binaries from the registry and places them into the working directory of the exec when rendering the workflow.

note

Platforma caches downloaded software, but the initial download from the package registry may take time, especially for large software packages.

The NCBI BLAST software used in this block is part of the platforma-open package registry, but you can create your own software packages and publish them in private package registries as well. Learn more about this.

Using local commands

Local commands can be used for development purposes along with the development setup of Platforma's backend. In this case, the backend will run the provided command directly on the host. For example, assuming that the makeblastdb command is available in the PATH of the host system, you can use the following exec:

// build exec command
makedbCmd := exec.builder().
  cmd("makeblastdb"). // specify your local command
  ...
  run()

Cmd on Windows

Keep in mind that cmd will run the provided command on the host running the backend. If the backend is running on Windows, even basic shell commands like grep may not work.

When using cmd, remember that the command string is not interpreted by a shell, so standard bash piping and redirects cannot be used directly inside cmd(). For example, the following command will not work:

exec.builder().
    cmd("cat in.file | grep '>' | wc -l").
    ...
    run()

To run such a pipeline, invoke bash through env and pass the whole pipeline as a single argument after -c:

exec.builder().
  cmd("/usr/bin/env").
  arg("bash").arg("-c").
  arg("cat in.file | grep '>' | wc -l").
  ...
  run()

In general, we recommend using cmd only for development purposes and always wrapping your software into Platforma packages before distributing it to users.

Command arguments

Arguments can be passed to the command using the arg(argument) method. You can chain calls to pass multiple arguments.

note

Keep in mind that each command argument should be passed with a separate arg(...) call. For example, using arg("arg1 arg2") will add one argument to the command containing a space.
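To make the note concrete, here is a minimal sketch that passes a flag and its value as two separate arguments and shows the incorrect single-argument form as a comment. It uses only the builder calls from this page. The literal value "nucl" (one of the values makeblastdb accepts for -dbtype) is illustrative, and the empty exports map is an assumption made to keep the example short.

```tengo
wf := import("@platforma-sdk/workflow-tengo:workflow")
exec := import("@platforma-sdk/workflow-tengo:exec")
file := import("@platforma-sdk/workflow-tengo:file")
assets := import("@platforma-sdk/workflow-tengo:assets")

blastSw := assets.importSoftware("@platforma-open/milaboratories.software-blast:makeblastdb")

wf.body(func(args) {
  fImport := file.importFile(args.fastaFile)

  makedbCmd := exec.builder().
    software(blastSw).
    // Correct: the flag and its value are two separate arguments.
    arg("-in").arg("input.fasta").
    arg("-dbtype").arg("nucl").
    // Incorrect (do not use): arg("-dbtype nucl") would pass ONE argument,
    // "-dbtype nucl", space included, instead of a flag followed by a value.
    addFile("input.fasta", fImport.file).
    printErrStreamToStdout().
    run()

  return {
    outputs: {
      log: makedbCmd.getStdoutStream()
    },
    // assumption: nothing is exported in this shortened example
    exports: {}
  }
})
```

This sketch does not save the generated database files; it only illustrates how arguments reach the command.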

Adding files to the workdir

Platforma runs the software in a freshly allocated working directory. To bring the files required by the software into the workdir, the exec builder provides the following methods:

addFile(name, file): Adds a given file under a specified name into the working directory.
addFiles(fileMap): Adds a given map of files into the working directory.
writeFile(name, fileContent): Creates a file with a specified name and content (serialized in bytes).
writeFiles(fileContentMap): Creates files with specified names (map keys) and content (map values).

In our block, we add a FASTA file with the reference sequences as input for makeblastdb. To do this, we first have to import the file into the backend storage:

// import fasta file
fImport := file.importFile(args.fastaFile)

makedbCmd := exec.builder().
  ...
  // add imported file into the workdir
  addFile("input.fasta", fImport.file).
  ...
  run()

note

Keep in mind that we can't directly pass args.fastaFile as an argument to addFile, because args.fastaFile is just a pointer to the file in external space (e.g., the user's computer), and we need to import it into Platforma's storage before it can be used in the exec.

Under the hood, Platforma ensures that the file appears in the workdir, initiating a data transfer from the primary storage into the workdir if necessary.

Of course, there is no need to import files that already exist in Platforma's main or workdir storage, such as files generated by other commands:

cmd1 := exec.builder().
  software(someSoftware).
  // keep file.txt after cmd1 finishes
  saveFile("file.txt").
  run()

cmd2 := exec.builder().
  software(otherSoftware).
  // add file.txt from cmd1 to the workdir of cmd2
  addFile("file.txt", cmd1.getFile("file.txt")).
  run()

Here, Platforma will reuse the same file already existing in the workdir space, without initiating any unnecessary transfers.

Saving files from the workdir

By default, all files generated by the software in the workdir are considered temporary, and Platforma will garbage collect them. If you want to use files in the workflow results, instruct Platforma to save them. This can be done in several ways.

`saveFile(fileName)`, `getFile(fileName)`

Exec's saveFile(fileName) saves a specified file from the working directory. Internally, Platforma will keep it persisted and move it to the primary storage if necessary. To retrieve the saved file later in the workflow, use the getFile(fileName) method of the resulting exec instance.

`saveFileContent(fileName)`, `getFileContent(fileName)`

With saveFileContent(fileName), Platforma saves the content of the specified file as a plain bytes resource directly in the core database. This is useful for small files, such as JSON, that can be directly used in the outputs.

Size limit

The maximum file size that can be saved with saveFileContent(fileName) is 4 megabytes.
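As an illustration of saveFileContent and getFileContent, the sketch below counts the header lines of the imported FASTA file and returns the small result file as an output. It combines the development-only cmd pattern with the env and bash approach described earlier. The file name count.txt, the output name sequenceCount, and the empty exports map are hypothetical choices for this example, not part of the block.

```tengo
wf := import("@platforma-sdk/workflow-tengo:workflow")
exec := import("@platforma-sdk/workflow-tengo:exec")
file := import("@platforma-sdk/workflow-tengo:file")

wf.body(func(args) {
  fImport := file.importFile(args.fastaFile)

  // Development-only local command: count FASTA header lines
  // (lines starting with '>') and write the number to count.txt.
  countCmd := exec.builder().
    cmd("/usr/bin/env").
    arg("bash").arg("-c").
    arg("grep -c '^>' input.fasta > count.txt").
    addFile("input.fasta", fImport.file).
    // count.txt holds a single number, far below the 4 MB limit
    saveFileContent("count.txt").
    run()

  return {
    outputs: {
      // hypothetical output name; the value is a small bytes resource
      sequenceCount: countCmd.getFileContent("count.txt")
    },
    // assumption: nothing is exported in this shortened example
    exports: {}
  }
})
```

For larger results, saveFile(fileName) remains the appropriate choice, since the content is then kept in file storage instead of the core database.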

`saveFileSet(name, regex)`, `getFileSet(name)`

saveFileSet(name, regex) saves the set of files in the working directory whose names match the specified regex pattern. This is what we do in our workflow:

makedbCmd := exec.builder().
  ...
  saveFileSet("db", "^db.*").
  ...
  run()

Here, we save all files that start with the "db" prefix. We label the resulting set as db and can retrieve it from the exec object using the makedbCmd.getFileSet("db") method. It returns a resource object with fields corresponding to the names of saved files.
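To see which names a pattern like "^db.*" selects, it can be tried outside Platforma with the Tengo standard library. The sketch below is a standalone script, not part of the workflow. The file list is a hypothetical workdir listing, and it assumes Platforma applies the pattern to file names with the usual regular expression semantics.

```tengo
fmt := import("fmt")
text := import("text")

pattern := "^db.*"

// Hypothetical workdir listing after makeblastdb has run; the exact set of
// db.* files depends on the database type and the BLAST version.
names := ["db.ndb", "db.nhr", "db.nin", "db.nsq", "dbnotes.txt", "input.fasta", "my.db"]

for name in names {
  // re_match reports whether the pattern matches the string;
  // the leading ^ anchors the match to the start of the file name.
  fmt.println(name, "->", text.re_match(pattern, name))
}
```

The pattern matches every name that begins with db, including dbnotes.txt, and rejects input.fasta and my.db. If only files of the form db.<extension> should be saved, a stricter pattern such as "^db\\..*" narrows the set.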

`processWorkdir(name, template, args)`

This method allows running a specified template, passing the content of the working directory as one of the template arguments. For details, see the exec library docs.

Getting log streams

There are two methods to obtain log stream resources: getStdoutStream() and getStderrStream(). In our example, we use the printErrStreamToStdout() instruction, which redirects the error stream to stdout, so we only use the stdout stream in the results:

workflow/src/main.tpl.tengo
wf.body(func(args) {
  ...
  return {
    outputs: {
      ...
      log: makedbCmd.getStdoutStream()
    },
    ...
  }
})

In our example, we use this result in the model to get the last 100 lines of the log:

model/src/index.ts
export const model = BlockModel.create<BlockArgs>()
  ...
  .output('log', (ctx) => ctx.outputs?.resolve('log')?.getLastLogs(100))
  ...
  .done();

Then, in the UI app, we render logs using the PlLogView component:

ui/src/pages/MainPage.vue
<template>
  ...
  <PlLogView :value="app.model.outputs.log" />
  ...
</template>

tip

Instead of showing the last 100 lines of the makeblastdb log, you can show the full log using the log resource handle:

model/src/index.ts
export const model = BlockModel.create<BlockArgs>()
  ...
  .output('log', (ctx) => ctx.outputs?.resolve('log')?.getLogHandle())
  ...
  .done();

The same PlLogView component then uses the log API to pull the logs directly from the backend.
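The workflow in this guide merges both streams into stdout. A sketch of the alternative, returning the two streams as separate outputs, could look like the following. It assumes that stderr is not merged into stdout when printErrStreamToStdout() is omitted. The output name errorLog is hypothetical and would need a matching output in the model, and the empty exports map only keeps the example short.

```tengo
wf := import("@platforma-sdk/workflow-tengo:workflow")
exec := import("@platforma-sdk/workflow-tengo:exec")
file := import("@platforma-sdk/workflow-tengo:file")
assets := import("@platforma-sdk/workflow-tengo:assets")

blastSw := assets.importSoftware("@platforma-open/milaboratories.software-blast:makeblastdb")

wf.body(func(args) {
  fImport := file.importFile(args.fastaFile)

  makedbCmd := exec.builder().
    software(blastSw).
    arg("-in").arg("input.fasta").
    arg("-out").arg("db").
    arg("-dbtype").arg(args.dataType).
    addFile("input.fasta", fImport.file).
    saveFileSet("db", "^db.*").
    // printErrStreamToStdout() is intentionally omitted here
    run()

  return {
    outputs: {
      log: makedbCmd.getStdoutStream(),
      // hypothetical second output carrying the error stream
      errorLog: makedbCmd.getStderrStream()
    },
    // assumption: nothing is exported in this shortened example
    exports: {}
  }
})
```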

Cache

The exec builder's cache(period) method instructs Platforma to cache the results of the exec for the specified period; in our workflow, this is 48 hours (cache(48 * times.hour)). Cached results are retained for that period even if the user deletes the block or the entire project.
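The period is an ordinary duration built from the constants of Tengo's times module. The standalone sketch below prints a few such values. The variable names are illustrative, and it assumes cache(period) accepts any duration expressed this way, as it does for 48 * times.hour in the workflow.

```tengo
fmt := import("fmt")
times := import("times")

// Durations are plain numbers, so they can be scaled with multiplication.
twoDays := 48 * times.hour      // the period used in this workflow
oneWeek := 7 * 24 * times.hour  // a longer retention period
halfHour := 30 * times.minute   // a short period, e.g. while developing

// duration_string renders a duration in human-readable form
fmt.println(times.duration_string(twoDays))
fmt.println(times.duration_string(oneWeek))
fmt.println(times.duration_string(halfHour))
```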
