Run Software in the Workflow
In our playground blocks, we execute various commands from the NCBI BLAST software suite, such as makeblastdb, blastn, and blastp. Within the workflow, this is accomplished using the exec library, which translates execution commands from the workflow into instructions for the Runner Controller.
Below, we focus on the Make BLAST Database block workflow to explore the exec library in detail.
See complete workflow
// makeblastdb workflow
wf := import("@platforma-sdk/workflow-tengo:workflow")
exec := import("@platforma-sdk/workflow-tengo:exec")
file := import("@platforma-sdk/workflow-tengo:file")
assets := import("@platforma-sdk/workflow-tengo:assets")
times := import("times")
// import makeblastdb from the BLAST software package
blastSw := assets.importSoftware("@platforma-open/milaboratories.software-blast:makeblastdb")
wf.body(func(args) {
    // import fasta file
    fImport := file.importFile(args.fastaFile)

    // build exec command
    makedbCmd := exec.builder().
        software(blastSw).
        arg("-in").arg("input.fasta").
        arg("-parse_seqids").
        arg("-out").arg("db").
        arg("-dbtype").arg(args.dataType).
        arg("-blastdb_version").arg("5").
        addFile("input.fasta", fImport.file).
        saveFileSet("db", "^db.*").
        printErrStreamToStdout().
        cache(48 * times.hour).
        run()

    dbFiles := makedbCmd.getFileSet("db")

    return {
        outputs: {
            fastaHandle: fImport.handle,
            log: makedbCmd.getStdoutStream()
        },
        exports: {
            db: {
                data: dbFiles,
                spec: {
                    kind: "fileSet",
                    annotations: {
                        "pl7.app/type": "blastDB",
                        "pl7.app/label": args.title,
                        "pl7.app/blast/alphabetType": args.dataType,
                        "pl7.app/blast/dbTitle": args.title
                    }
                }
            }
        }
    }
})
Exec builder
The exec library provides a single method, builder(), which returns a builder object that allows you to construct a command for execution. The builder's run() method returns the resulting instance of the exec resource.
The first step is to specify the software you intend to run. The builder offers two methods for this:
- software(softwareAsset): Use the software available in the package registry configured for the backend.
- cmd(localCommand): Directly use a command available in the PATH of the host running the backend.
Platforma's software packages provide a reproducible and reliable way of distributing and using software. Local commands with cmd are typically used only for development purposes.
Using software packages
A software package is a set of executable resources (binaries or container images) for specific software stored in Platforma's package registries. The software executable asset is imported into the workflow using the assets library:
// import assets library
assets := import("@platforma-sdk/workflow-tengo:assets")
// import makeblastdb from the BLAST software package
blastSw := assets.importSoftware("@platforma-open/milaboratories.software-blast:makeblastdb")
The BLAST software package has multiple entry points, and to select a specific entry point (in this case, makeblastdb), we use a colon followed by the entry point name.
Additionally, specify the dependency on the BLAST software package in the pnpm-workspace.yaml and the workflow's package.json:
In pnpm-workspace.yaml:
catalog:
  "@platforma-open/milaboratories.software-blast": ^1.0.0
In package.json:
"devDependencies": {
  "@platforma-open/milaboratories.software-blast": "catalog:"
}
Under the hood, Platforma downloads the software binaries from the registry and places them into the working directory of the exec when rendering the workflow.
Platforma caches downloaded software, but the initial download from the package registry may take time, especially for large software packages.
The NCBI BLAST software used in this block is part of the platforma-open package registry, but you can create your own software packages and publish them in private package registries as well. Learn more about this.
Using local commands
Local commands can be used for development purposes along with the development setup of Platforma's backend. In this case, the backend will run the provided command directly on the host. For example, assuming that the makeblastdb command is available in the PATH of the host system, you can use the following exec:
// build exec command
makedbCmd := exec.builder().
    cmd("makeblastdb"). // specify your local command
    ...
    run()
Keep in mind that cmd will run the provided command on the host running the backend. If the backend is running on Windows, even basic shell commands like grep may not work.
When using cmd, remember that you can't use standard bash piping or redirects directly within cmd(). For example, the following command will not work:
exec.builder().
    cmd("cat in.file | grep '>' | wc -l").
    ...
    run()
To run such a pipeline, start bash through env and pass the whole pipeline as a single argument to -c:
exec.builder().
    cmd("/usr/bin/env").
    arg("bash").arg("-c").
    arg("cat in.file | grep '>' | wc -l").
    ...
    run()
In general, we recommend using cmd only for development purposes and always wrapping your software into Platforma packages before distributing it to users.
Command arguments
Arguments can be passed to the command using the arg(argument) method. You can chain calls to pass multiple arguments.
Keep in mind that each command argument should be passed with a separate arg(...) call. For example, arg("arg1 arg2") adds a single argument that contains a space, not two separate arguments.
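To make the one-argument-per-call rule concrete, here is a minimal TypeScript sketch. It is not Platforma's exec builder: the ArgvBuilder class and its build() method are hypothetical stand-ins that only mimic how chained arg(...) calls accumulate into an argument vector.

```typescript
// Hypothetical stand-in for illustration only; not part of the Platforma SDK.
class ArgvBuilder {
  private readonly argv: string[] = [];

  // Each call appends exactly one entry, whatever characters it contains.
  arg(value: string): this {
    this.argv.push(value);
    return this;
  }

  // Returns a copy of the accumulated argument vector.
  build(): string[] {
    return [...this.argv];
  }
}

// Two calls: the program receives "-in" and "input.fasta" as separate arguments.
const separate = new ArgvBuilder().arg("-in").arg("input.fasta").build();

// One call: the program receives the single argument "-in input.fasta",
// which a typical option parser would not recognise as the "-in" option.
const merged = new ArgvBuilder().arg("-in input.fasta").build();
```

In this model, separate holds two entries while merged holds one, which is why an option and its value should each get their own arg(...) call.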
Adding files to the workdir
Platforma runs the software in a freshly allocated working directory. To bring the files required to run the software into the workdir, the exec builder provides the following methods:
- addFile(name, file): Adds a given file under a specified name into the working directory.
- addFiles(fileMap): Adds a given map of files into the working directory.
- writeFile(name, fileContent): Creates a file with a specified name and content (serialized in bytes).
- writeFiles(fileContentMap): Creates files with specified names (map keys) and content (map values).
In our block, we add a FASTA file with the reference sequences as input for makeblastdb. To do this, we first have to import the file into the backend storage:
// import fasta file
fImport := file.importFile(args.fastaFile)
makedbCmd := exec.builder().
    ...
    // add imported file into the workdir
    addFile("input.fasta", fImport.file).
    ...
    run()
Keep in mind that we can't directly pass args.fastaFile as an argument to addFile, because args.fastaFile is just a pointer to the file in external space (e.g., the user's computer), and we need to import it into Platforma's storage before it can be used in the exec.
Under the hood, Platforma ensures that the file appears in the workdir, initiating a data transfer from the primary storage into the workdir if necessary.
Of course, there is no need to import files that already exist in Platforma's main or workdir storage, such as files generated by other commands:
cmd1 := exec.builder().
    software(someSoftware).
    saveFile("file.txt").
    run()
cmd2 := exec.builder().
    software(otherSoftware).
    // add file.txt from cmd1 in the workdir of cmd2
    addFile("file.txt", cmd1.getFile("file.txt")).
    run()
Here, Platforma will reuse the same file already existing in the workdir space, without initiating any unnecessary transfers.
Saving files from the workdir
By default, all files generated by the software in the workdir are considered temporary, and Platforma will garbage collect them. If you want to use files in the workflow results, instruct Platforma to save them. This can be done in several ways.
saveFile(fileName), getFile(fileName)
Exec's saveFile(fileName) saves a specified file from the working directory. Internally, Platforma will keep it persisted and move it to the primary storage if necessary. To retrieve the saved file later in the workflow, use the getFile(fileName) method of the resulting exec instance.
saveFileContent(fileName), getFileContent(fileName)
With saveFileContent(fileName), Platforma saves the content of the specified file as a plain bytes resource directly in the core database. This is useful for small files, such as JSON, that can be directly used in the outputs.
The maximum file size that can be saved with saveFileContent(fileName) is 4 megabytes.
saveFileSet(name, regex), getFileSet(name)
saveFileSet saves, under a given name, the set of files in the working directory whose names match a specified regex pattern. This is what we do in our workflow:
makedbCmd := exec.builder().
    ...
    saveFileSet("db", "^db.*").
    ...
    run()
Here, we save all files that start with the "db" prefix. We label the resulting set as db and can retrieve it from the exec object using the makedbCmd.getFileSet("db") method. It returns a resource object with fields corresponding to the names of saved files.
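To see which file names a pattern like ^db.* selects, the sketch below applies it to a hypothetical workdir listing in TypeScript. The file names are illustrative rather than actual makeblastdb output, selectFileSet is a made-up helper, and the sketch assumes that Platforma's matching agrees with JavaScript's RegExp for a pattern this simple.

```typescript
// Illustrative file names only; a real makeblastdb run may produce different files.
const workdirFiles: string[] = [
  "input.fasta",
  "db.nhr",
  "db.nin",
  "db.nsq",
  "dbnotes.txt",
  "old_db.nhr",
];

// Hypothetical helper: keep the names that match the given pattern.
function selectFileSet(files: string[], pattern: string): string[] {
  const re = new RegExp(pattern);
  return files.filter((name) => re.test(name));
}

// "^db.*" anchors at the start of the name, so "old_db.nhr" is excluded,
// but "." matches any character, so "dbnotes.txt" is included as well.
const loose = selectFileSet(workdirFiles, "^db.*");

// Escaping the dot restricts the set to names that begin with "db.".
const strict = selectFileSet(workdirFiles, "^db\\..*");
```

If other files in the workdir could share the db prefix, a stricter pattern along the lines of the second one may be the safer choice.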
processWorkdir(name, template, args)
This method allows running a specified template, passing the content of the working directory as one of the template arguments. For details, see the exec library docs.
Getting log streams
There are two methods to obtain log stream resources: getStdoutStream() and getStderrStream(). In our example, we use the printErrStreamToStdout() instruction, which redirects the error stream to stdout, so we only use the stdout stream in the results:
wf.body(func(args) {
    ...
    return {
        outputs: {
            ...
            log: makedbCmd.getStdoutStream()
        },
        ...
    }
})
In our example, we use this result in the model to get the last 100 lines of the log:
export const model = BlockModel.create<BlockArgs>()
  ...
  .output('log', (ctx) => ctx.outputs?.resolve('log')?.getLastLogs(100))
  ...
  .done();
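Conceptually, keeping the last N lines of a log amounts to a tail operation. The TypeScript sketch below shows that idea on a plain string; lastLines is a hypothetical helper, and how getLastLogs is actually implemented in the SDK is not covered here.

```typescript
// Hypothetical helper for illustration; not the SDK's getLastLogs implementation.
function lastLines(log: string, n: number): string {
  if (n <= 0) return "";
  const lines = log.split("\n");
  // A trailing newline yields an empty final element; drop it so it is not counted as a line.
  if (lines.length > 0 && lines[lines.length - 1] === "") {
    lines.pop();
  }
  // slice(-n) returns the whole array when it has fewer than n elements.
  return lines.slice(-n).join("\n");
}

const sampleLog = "line 1\nline 2\nline 3\nline 4\n";
const tail = lastLines(sampleLog, 2); // "line 3\nline 4"
```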
Then, in the UI app, we render logs using the PlLogView component:
<template>
  ...
  <PlLogView :value="app.model.outputs.log" />
  ...
</template>
Instead of showing the last 100 lines of the makeblastdb log, you can show the full log using the log resource handle:
export const model = BlockModel.create<BlockArgs>()
  ...
  .output('log', (ctx) => ctx.outputs?.resolve('log')?.getLogHandle())
  ...
  .done();
The same PlLogView component then uses the log API to pull the logs directly from the backend.
Cache
A very important feature of the exec builder is the cache(period) method, which instructs Platforma to cache the results of the exec for a specified period. In our workflow, cache(48 * times.hour) keeps the makeblastdb results for 48 hours. Even if the user deletes the block or the project entirely, all exec results are retained for the given period.
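The general idea of time-limited result caching can be sketched in TypeScript as follows. This is a simplified model, not Platforma's implementation: the TtlCache class, its string keys, and the injected clock are all hypothetical, and the sketch says nothing about how Platforma decides that two execs are identical.

```typescript
// Simplified model for illustration only; not how Platforma implements cache().
const HOUR_MS = 60 * 60 * 1000;

class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly now: () => number;

  // The clock is injectable so that expiry can be demonstrated without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Returns the cached value while it is fresh; otherwise computes and stores it.
  getOrCompute(key: string, ttlMs: number, compute: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = compute();
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
    return value;
  }
}

// Usage: a fake clock lets us step past a 48-hour retention period.
let fakeNow = 0;
let runs = 0;
const cache = new TtlCache<string>(() => fakeNow);
const build = (): string => {
  runs += 1;
  return `db build #${runs}`;
};

const first = cache.getOrCompute("makeblastdb:input.fasta", 48 * HOUR_MS, build);
fakeNow = 47 * HOUR_MS;
// Still within the period: served from the cache, build is not called again.
const second = cache.getOrCompute("makeblastdb:input.fasta", 48 * HOUR_MS, build);
fakeNow = 49 * HOUR_MS;
// Past the period: the entry has expired, so the result is recomputed.
const third = cache.getOrCompute("makeblastdb:input.fasta", 48 * HOUR_MS, build);
```

In this model the expensive step runs twice across the three requests: once initially and once after the retention period has elapsed.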