Advanced Options

The Migrate Records step includes several advanced features for handling complex migration scenarios. This article covers each feature in detail.

Mapping Variables

Mapping variables are intermediate calculated fields that are processed during migration but not written to the target table. They allow you to break complex transformations into smaller, reusable steps.

How They Work

Create a mapping variable with the Add Mapping Variable button in the toolbar above the field mapping table (on narrow panes this action is in the vertical ellipsis menu)
Define a mapping for the variable (e.g., a formula that extracts a value)
Reference the mapping variable by name in other field mappings

Use Cases

Reusable calculations: Compute a value once and reference it in multiple target field mappings, avoiding duplicated logic
Multi-step transformations: Break a complex transformation into stages — e.g., first extract a JSON value into a mapping variable, then format it in the target field mapping
Conditional routing: Calculate a flag in a mapping variable, then use it in multiple field mappings to determine output values

Example

Suppose you need to extract a user's department from a JSON payload and use it in two different target fields:

Create mapping variable dept with formula: json_path('$.department.name', [user_data])
Target field department_name: Mapping Variable → dept
Target field dept_code: Formula → dept[:3].upper()

Validation Rules

Validation rules check each record before it is inserted or updated in the target system. They provide a safety net to catch data quality issues during migration.

Configuring Rules

Open the Migrate Records step and navigate to the validation rules section
Add a new rule with a condition (formula expression that evaluates to True or False)
Set the failure action

Failure Actions

Action	Behaviour
Skip Record	The record is skipped (not inserted/updated). A log entry is created. Migration continues.
Log Warning	The record is processed normally, but a warning is logged. Useful for flagging potential issues without blocking migration.
Log Error	An error is logged against the record. Depending on your error handling configuration, this may stop the step or be recorded against the record and execution continues. Use for critical data integrity violations that must be investigated.

Example Rules

Required field: [email] is not None and [email] != '' — Skip Record when no email is set
Value range: int([priority]) >= 1 and int([priority]) <= 5 — Log Error on invalid priority values
Format check: '@' in str([email]) — Log Warning on emails without an @ symbol

Formula Validation

Rule conditions are checked for broken variable references in the editor. If a rule references a procedure variable that has since been renamed or removed, the Migrate Records step reports a validation error identifying the affected rule, so the problem is caught before the migration runs rather than at execution time. Update the formula to reference a valid variable to clear the error.

Attachment Migration

When migrating records that have associated file attachments (common in ServiceNow, Jira, and other systems), you can enable automatic attachment migration.

How It Works

Enable the attachment migration toggle on the Migrate Records step
When a record is migrated, PB checks for attachments on the source record
Attachments are downloaded from the source system and uploaded to the corresponding target record

Considerations

Performance: Attachment migration is significantly slower than record migration due to file download/upload operations. Large attachments (10+ MB) may take several seconds each.
Storage: Ensure the target system has sufficient storage for migrated attachments.
Supported systems: Attachment migration is available for ServiceNow and other systems with attachment APIs.

Batch Sizing

Batch sizing controls how many records are sent to the target system in each API call or database operation.

Trade-offs

Setting	Effect
Smaller batches (e.g., 10-50)	More API calls, less memory usage, easier to pinpoint errors, slower overall
Larger batches (e.g., 200-500)	Fewer API calls, more memory usage, faster overall, harder to isolate errors

Guidelines

Default: The default batch size is appropriate for most migrations
API rate limits: If you're hitting rate limits (429 errors), reduce batch size
Memory constraints: If you're running out of memory on large records (e.g., records with big text fields), reduce batch size
Database migrations: Database-to-database migrations can typically handle larger batch sizes (500+)

Keyset Pagination

Keyset pagination is an alternative to traditional offset-based pagination for extracting records from the source system.

Purpose

Traditional offset pagination (OFFSET 1000 LIMIT 100) can produce inconsistent results if records are added or removed from the source table during extraction. Keyset pagination avoids this by using the source table's primary key to track position.

How It Works

Instead of "skip N records, take M", keyset pagination uses "get records where key > last_seen_key, take M". This is reliable even when the source data changes during extraction.

When to Use

Large datasets: 100,000+ records where extraction takes minutes or hours
Active source systems: When records may be created/modified during extraction
Reliable ordering needed: When you need deterministic, repeatable extraction

Configuration

Keyset pagination is a single toggle on the Migrate Records step — there is no separate "key field" selector. PB uses the source table's pre-configured primary key (e.g. sys_id for ServiceNow, the primary key column for databases) as the keyset cursor.

The toggle is only available when the source table's adaptor declares keyset pagination support in its metadata. If the toggle is disabled or absent, the adaptor or the specific table does not support keyset pagination.

Sampling Validation

Sampling validation checks the accuracy of a migration by re-querying a random sample of the already-migrated records from the target system and comparing each retrieved value against what PB expected to write. It is a post-migration safety check; for a pre-migration dry-run of the same step against a sample of source records (no writes), see Previewing a Migration.

To limit how many records get processed, use the Limit option on the Filter tab, or apply a tightly-scoped filter (see Common Filter Patterns) to extract a small subset for testing.

How to Use

Open the Migrate Records step and go to the Validation tab
Enable Sampling Target Check
Set the Sampling Percentage (1–100). 5–10% is a sensible starting point; higher percentages produce more confidence but cost more time and API calls
Run the step. After the records are migrated, PB queries the target for a random sample and reports any mismatches in the execution report

You can also exclude individual field mappings from sampling validation (e.g. fields the target system normalises or transforms server-side) by toggling Exclude from sampling validation on the field mapping.

Benefits

End-to-end confirmation: Detects drift between what PB sent and what the target system actually stored — including silent server-side transforms, truncation, or type coercion.
Per-field reporting: Mismatches are reported field-by-field so you can pinpoint mapping errors.
Configurable scope: Tune the percentage to balance confidence against migration time.

Per-Chunk Scripts

Per-chunk scripts run a user-authored Python script in an isolated subprocess once per chunk of records during a Migrate Records step. The script receives the chunk's source and target records, can mutate the target records, and can write artifact files to the execution.

This is the same script runner as the Run Script step, with two differences:

The script is invoked once per chunk instead of once per execution.
pb.batch is populated with the chunk being processed, and pb.set_batch writes (optionally mutated) target records back into the step.

A Migrate Records step can carry at most one script per timing slot (zero, one, or two scripts in total).

Timing Slots

Timing	When it runs	Mutations
Pre-Insert (`per_chunk_pre_insert`)	After target records have been assembled by the field mappings, before the chunk is upserted to the target.	Mutations written via `pb.set_batch` flow through into what is sent to the target.
Post-Insert (`per_chunk_post_insert`)	After the chunk has been written to the target. Target primary keys assigned by the target system are populated on the records.	Mutations are accepted but only affect in-memory state used by downstream validation, attachments, and output collection. They do not write back to the target.

Use pre-insert to apply last-mile transformations a formula can't express (for example, a transformation that needs a Python library, or a cross-record calculation against the full chunk). Use post-insert to observe the post-upsert state of records, write per-chunk artifact files, or accumulate summary data across chunks.

Records Visible to the Script

Records that errored or were skipped earlier in the step (by validation rules, key mapping rules, or field mapping failures) are excluded from pb.batch. The script cannot see them and cannot resurrect them. Scripts that return a batch must return exactly the records they were given — inserting or removing records from the chunk is not supported.

The Batch Payload

The chunk is exposed to the script as pb.batch:

pb.batch = {
    "chunk_index": 0,            # int, zero-based
    "total_chunks": 5,           # int | None, when knowable
    "source_records": [
        {"data": {...}, "primary_key_value": ...},
        ...
    ],
    "target_records": [
        {"data": {...}, "primary_key_value": ...},
        ...
    ],
}

To mutate target records, modify target_records and call pb.set_batch:

for record in pb.batch["target_records"]:
    record["data"]["full_name"] = (
        f"{record['data']['first_name']} {record['data']['last_name']}"
    )
pb.set_batch({"target_records": pb.batch["target_records"]})

If the script does not call pb.set_batch, the in-memory records are left untouched and the chunk proceeds with whatever the field mappings produced.

Configuration

Open the Migrate Records step and go to the Scripts tab on the Advanced area.
Add a script and choose a timing slot (Pre-Insert or Post-Insert).
Edit the script body in the in-product code editor. Each timing slot opens with a starter template documenting the available pb.* API and the batch shape.
Set the Timeout (seconds) for each invocation (default 60, max one hour). The script is killed if any single chunk invocation exceeds the timeout, which fails the step.

Failure Handling

A failed per-chunk script aborts the step. The chunk does not proceed to upsert (for pre-insert) or to downstream handling (for post-insert), and the execution status is set to failed. Stdout, stderr, and any pb.log output from the script are captured in the execution log so you can diagnose the failure.

Artifacts

Per-chunk scripts have access to both artifact surfaces exposed by the script runner:

pb.artifact_path(filename) writes into this step's artifact folder, surfaced under Step Artifacts for this step in the execution viewer.
pb.shared_artifact_path(filename) writes into the procedure-level shared folder, surfaced under Procedure Artifacts at the top of the execution's Artifacts page. Use this when several steps in the same procedure contribute to one output (for example, a running summary appended by both pre-insert and post-insert scripts), or when a downstream step needs to read a file produced by an earlier one without hard-coding the per-step layout.

The viewer deduplicates by filename and reflects the latest on-disk size, so an append-only summary written by post-insert grows safely across chunks. See Step Type: Run Script for the full pb API.

Migrate Records Overview — introduction to the Migrate Records step
Configuring Field Mappings — setting up field mappings
Formula Language — formula syntax for validation rules and mapping variables
Step Type: Run Script — stand-alone version of the same script runner, including the full pb module API
Performance Optimization — tuning batch size and pagination

Advanced Options

Mapping Variables

How They Work

Use Cases

Example

Validation Rules

Configuring Rules

Failure Actions

Example Rules

Formula Validation

Attachment Migration

How It Works

Considerations

Batch Sizing

Trade-offs

Guidelines

Keyset Pagination

Purpose

How It Works

When to Use

Configuration

Sampling Validation

How to Use

Benefits

Per-Chunk Scripts

Timing Slots

Records Visible to the Script

The Batch Payload

Configuration

Failure Handling

Artifacts

Related Documentation