Step Type: Run Script
The Run Script step runs a user-authored Python script inside an isolated subprocess during procedure execution. It is the escape hatch for work that doesn't fit into the other step types: bespoke data transformation, custom export formats, or writing artifact files for downstream tools to pick up.
When to Use
- To export data in a custom format (for example, a JSON payload for a platform upload) using an earlier step's output
- To perform a one-off data transformation that would be awkward as a formula
- To produce a file artifact alongside the execution (reports, archives, manifests)
- To call into a Python library that isn't exposed as a formula function
If you can achieve the same result with a formula, prefer the formula. Run Script is for the cases a formula can't reach.
Configuration
Script
The Python source to execute, edited in the in-product code editor. The script runs in a fresh Python subprocess with the same packages Precision Bridge itself uses (pandas, requests, the standard library, etc.), so you can import anything that's available.
A small helper module named pb is pre-imported into the script's globals. See The pb module below for the full API.
Every procedure variable available at this step is automatically exposed to the script as pb.inputs[name], keyed by its variable name. There is no aliasing or binding configuration to manage — if the variable exists at this step, the script can read it.
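Because inputs are keyed by variable name, a script can check up front that the variables it expects are present and fail with a readable message if they are not. The following is a minimal sketch: `require_inputs` is a hypothetical helper, not part of the pb module, and the variable names `records` and `batch_label` are placeholders.

```python
from typing import Any, Mapping, Sequence


def require_inputs(inputs: Mapping[str, Any], names: Sequence[str]) -> dict:
    """Return the named inputs, or raise KeyError listing every missing name."""
    missing = [name for name in names if name not in inputs]
    if missing:
        raise KeyError(f"Missing procedure variable(s): {missing}")
    return {name: inputs[name] for name in names}


# Inside a Run Script step (pb is pre-imported there), usage might look like:
#     values = require_inputs(pb.inputs, ["records", "batch_label"])
#     rows = values["records"]
```

Listing all missing names at once avoids fixing them one failed run at a time.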
Outputs
Declare the name and data type of each variable this script produces. Values are set at runtime by the script via pb.set_output(name, value) — there is no default value or formula to enter, because the whole point of the step is to produce the values programmatically. The script must call pb.set_output for every declared output before it finishes, and calling it with a name that isn't declared raises an error.
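One way to make sure no declared output is forgotten, including on early-exit paths, is to collect values in a dict and set them all in one place. This is a sketch only: `emit_outputs` is a hypothetical helper, and the `defaults` dict is assumed to be maintained by hand with one key per output declared on the step.

```python
from typing import Any, Callable, Dict


def emit_outputs(
    set_output: Callable[[str, Any], None],
    defaults: Dict[str, Any],
    computed: Dict[str, Any],
) -> Dict[str, Any]:
    """Set every declared output exactly once, preferring computed values."""
    unknown = sorted(set(computed) - set(defaults))
    if unknown:
        # Catch typos locally before pb.set_output rejects the name.
        raise ValueError(f"Not in the declared outputs: {unknown}")
    final = {**defaults, **computed}
    for name, value in final.items():
        set_output(name, value)
    return final


# Inside a Run Script step, usage might look like:
#     defaults = {"export_path": "", "count": 0}  # one key per declared output
#     emit_outputs(pb.set_output, defaults, {"count": len(rows)})
```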
Settings
| Setting | Description |
|---|---|
| Timeout (seconds) | Hard wall-clock limit on the subprocess. When reached, the script is killed and the step fails. Capped at one hour. |
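Since the timeout kills the subprocess outright, a long-running script may prefer to stop itself slightly earlier and still set its outputs. The sketch below assumes the script author passes in a budget by hand (this page does not describe a way to read the configured timeout from pb); `process_within_budget` is a hypothetical helper.

```python
import time
from typing import Any, Callable, Iterable, Tuple


def process_within_budget(
    items: Iterable[Any],
    handle: Callable[[Any], None],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[int, bool]:
    """Handle items until done or the budget is spent.

    Returns (number of items handled, whether every item was handled).
    """
    deadline = clock() + budget_seconds
    processed = 0
    for item in items:
        if clock() >= deadline:
            return processed, False  # stop early, leaving time to set outputs
        handle(item)
        processed += 1
    return processed, True


# Inside a Run Script step with a 300-second timeout, usage might look like:
#     done, finished = process_within_budget(rows, export_row, budget_seconds=270)
#     pb.set_output("count", done)
```

The check runs between items, so a single slow item can still overrun the budget; choose a margin accordingly.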
The pb module
The pb module is the only contract between your script and Precision Bridge. Everything else about the Python environment is ordinary.
Data attributes
| Attribute | Type | Meaning |
|---|---|---|
| pb.inputs | dict[str, Any] | Every procedure variable available at this step, keyed by its variable name. |
| pb.batch | dict \| None | Per-chunk context when the script is invoked by the per-chunk hook on a Migrate Records step. Always None for a stand-alone Run Script step. See Per-Chunk Scripts for the batch shape and pb.set_batch semantics. |
| pb.artifact_dir | str | Absolute path to this step execution's artifact folder. Already exists and is writable. |
| pb.shared_artifact_dir | str | Absolute path to the procedure-level shared artifact folder for this execution. Already exists and is writable. Files here roll up under Procedure Artifacts in the execution viewer and are visible to every step in the procedure. |
| pb.step_id | str | Internal ID of this step. Useful for logging. |
| pb.step_identifier | str | Human-readable step name. |
Values in pb.inputs are copies. Mutating them inside the script doesn't leak back into the procedure — use pb.set_output to return data.
Functions
pb.set_output(name, value)
Records an output. name must match one of the variables on the step's Outputs tab. value must be JSON-serialisable.
pb.set_output("row_count", len(rows))
pb.set_output("summary", {"ok": 42, "bad": 1})
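Values such as dates, decimals, and sets are not JSON-serialisable by Python's standard json module, so it can help to normalise a value before handing it to pb.set_output. A minimal sketch follows; `to_jsonable` is a hypothetical helper, and the chosen representations (ISO strings for dates, strings for decimals, sorted lists for sets) are assumptions rather than anything Precision Bridge prescribes.

```python
import json
from datetime import date, datetime
from decimal import Decimal
from pathlib import Path


def to_jsonable(value):
    """Recursively convert common non-JSON types into JSON-friendly ones."""
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    if isinstance(value, (set, frozenset)):
        # Sets have no order; sort for a stable result.
        return sorted((to_jsonable(v) for v in value), key=repr)
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, (Decimal, Path)):
        return str(value)
    return value  # str, int, float, bool, None pass through unchanged


# Inside a Run Script step, usage might look like:
#     pb.set_output("summary", to_jsonable(summary))
```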
pb.artifact_path(filename)
Returns an absolute path inside this step's artifact folder and creates any missing parent directories. Use this to write files the next step (or the user) will pick up. Rejects absolute paths, drive letters, and .. traversal at both validation time and runtime.
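import json
# records is assumed to be a list of dicts prepared earlier in the script
path = pb.artifact_path("export/records.jsonl")
with open(path, "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")  # one JSON object per line
pb.set_output("export_file", path)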
Each resolved path is tracked and shown in the execution viewer under Step Artifacts.
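When filenames are built from data, it can be useful to pre-check them against the same kinds of rules before calling pb.artifact_path. The sketch below is an illustration of those rules (no absolute paths, drive letters, or .. traversal), not Precision Bridge's actual validation code; `is_safe_relative` is a hypothetical helper.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_safe_relative(filename: str) -> bool:
    """Return True if filename is a plain relative path on both POSIX and Windows."""
    if not filename:
        return False
    posix = PurePosixPath(filename)
    windows = PureWindowsPath(filename)
    # Reject POSIX absolute paths, and any Windows drive letter, UNC share, or root.
    if posix.is_absolute() or windows.anchor:
        return False
    # Reject parent-directory traversal under either separator convention.
    if ".." in posix.parts or ".." in windows.parts:
        return False
    return True


# Inside a Run Script step, usage might look like:
#     if is_safe_relative(name):
#         path = pb.artifact_path(name)
```

Checking with both path flavours means the result does not depend on the operating system the script happens to run on.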
pb.shared_artifact_path(filename)
Returns an absolute path inside the procedure-level shared artifact folder and creates any missing parent directories. Same path-validation rules as pb.artifact_path (no absolute paths, drive letters, or .. traversal).
Use this when an artifact belongs to the procedure as a whole rather than to a single step — for example, when several steps append rows to the same summary file, or when a downstream step needs to read a file produced by an earlier step without hard-coding the per-step folder layout.
path = pb.shared_artifact_path("summary.csv")
with open(path, "a") as f:
    f.write(f"{pb.step_identifier},{len(rows)}\n")
Files written here surface in the execution viewer under Procedure Artifacts. The viewer deduplicates by filename across the whole execution, so successive writes to the same name appear as one entry showing the latest size.
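When several steps append to one shared CSV, the first writer usually needs to add the header row. A minimal sketch of that pattern follows, assuming steps write one at a time rather than concurrently; `append_summary_row` and the column names are hypothetical.

```python
import csv
import os
from typing import Sequence


def append_summary_row(path: str, header: Sequence, row: Sequence) -> None:
    """Append row to the CSV at path, writing header first if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if needs_header:
            writer.writerow(header)
        writer.writerow(row)


# Inside a Run Script step, usage might look like:
#     path = pb.shared_artifact_path("summary.csv")
#     append_summary_row(path, ["step", "rows"], [pb.step_identifier, len(rows)])
```

Using the csv module rather than string formatting also handles values that contain commas or quotes.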
pb.log(message)
Appends a line to the step-scoped log shown under Script log in the execution viewer. This is separate from print(): printed output goes to stdout, which is captured, but it doesn't get its own collapsible panel.
pb.log(f"Processed {len(rows)} rows")
How It Works
- The step writes a control file to the execution's artifact folder containing the script, resolved input values, declared output names, and the artifact directory path.
- Precision Bridge spawns a new Python subprocess (with the same interpreter PB itself is running on) and passes the control file as an argument.
- The subprocess loads the pb module into globals, runs the script, and writes a result file back.
- PB reads the result, applies the outputs to this step's output variables, and captures stdout, stderr, and the pb.log output into the execution record.
- On success, the control file is removed. On failure, it stays on disk next to the artifacts for debugging.
The subprocess has the same filesystem and network access as the PB process. On customer machines that's whatever the PB install can see; customer-authored scripts are trusted the same way a custom formula function is trusted.
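The control-file handoff described in this section can be illustrated with a toy version. This is a sketch of the general pattern only, not Precision Bridge's implementation: the file names, the JSON layout, the `PB` stand-in class, and `run_script` are all assumptions made for the example.

```python
import json
import os
import subprocess
import sys
import tempfile

# Hypothetical worker: reads the control file, runs the script, writes a result file.
WORKER_SOURCE = r'''
import json, sys

control_path, result_path = sys.argv[1], sys.argv[2]
with open(control_path, encoding="utf-8") as f:
    control = json.load(f)

outputs = {}

class PB:  # toy stand-in for the real pb module
    inputs = control["inputs"]

    @staticmethod
    def set_output(name, value):
        if name not in control["outputs"]:
            raise KeyError(f"{name!r} is not a declared output")
        outputs[name] = value

exec(control["script"], {"pb": PB})

missing = [n for n in control["outputs"] if n not in outputs]
if missing:
    raise RuntimeError(f"Script did not set declared output(s): {missing}")

with open(result_path, "w", encoding="utf-8") as f:
    json.dump({"outputs": outputs}, f)
'''


def run_script(script, inputs, declared_outputs, timeout_seconds):
    """Run script in a fresh interpreter and return its outputs as a dict."""
    with tempfile.TemporaryDirectory() as workdir:
        control_path = os.path.join(workdir, "control.json")
        result_path = os.path.join(workdir, "result.json")
        worker_path = os.path.join(workdir, "worker.py")
        with open(control_path, "w", encoding="utf-8") as f:
            json.dump({"script": script, "inputs": inputs, "outputs": declared_outputs}, f)
        with open(worker_path, "w", encoding="utf-8") as f:
            f.write(WORKER_SOURCE)
        # Same interpreter as the parent; raises TimeoutExpired if the limit is hit.
        proc = subprocess.run(
            [sys.executable, worker_path, control_path, result_path],
            capture_output=True, text=True, timeout=timeout_seconds,
        )
        if proc.returncode != 0 or not os.path.exists(result_path):
            raise RuntimeError(f"Script failed: {proc.stderr.strip()}")
        with open(result_path, encoding="utf-8") as f:
            return json.load(f)["outputs"]
```

For example, `run_script('pb.set_output("count", len(pb.inputs["records"]))', {"records": [1, 2, 3]}, ["count"], 30)` returns a dict with the single key `count`. Passing data through files rather than arguments keeps large inputs off the command line and leaves something to inspect when a run fails.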
Failure Modes
| Symptom | Meaning |
|---|---|
| Script did not set declared output(s): ['x'] | You forgot a pb.set_output call for an output declared on the step. |
| 'x' is not a declared output | pb.set_output("x", ...) was called but no output variable with that name exists on the step. |
| Output values are not JSON-serialisable: ... | One of your output values is an instance of a custom class with no JSON representation. Convert to a dict, list, or string first. |
| artifact_path() filename must be relative, not absolute | Pass a filename like "out.json" or "sub/out.json", not /abs/path. |
| Script timed out after N seconds. | Raise the timeout on the Settings tab, or speed the script up. |
| Script subprocess did not produce a result file. | The worker crashed hard (OS kill, out-of-memory). This is rare; check stderr in the execution viewer. |
Example: per-execution JSON export
A procedure extracts records and then exports them as a single JSON file for a downstream platform upload.
Step configuration
- An earlier Extract Records step produces a records list variable.
- Outputs tab: export_path (String), count (Integer)
- Settings: timeout 300
Script
import json
rows = pb.inputs["records"]
pb.log(f"Exporting {len(rows)} records")
path = pb.artifact_path("export.json")
with open(path, "w") as f:
    json.dump(rows, f, indent=2, default=str)
pb.set_output("export_path", path)
pb.set_output("count", len(rows))
After the step runs, export_path and count are available to downstream steps as normal procedure variables, and the JSON file appears under the execution's Artifacts list for manual upload or inspection.