Teradata stored procedures, macros & UDFs to BigQuery
Re-home procedural logic-BTEQ-era routines, Teradata stored procedures, macros, and UDFs-into BigQuery routines with an explicit behavior contract and validation harness so results don’t drift after cutover.
- Input
- Teradata Stored procedure / UDF migration logic
- Output
- BigQuery equivalent (validated)
- Common pitfalls
- Macro misuse: macros used as “functions” with hidden dependencies; migration must make dependencies explicit.
- Row-by-row loops/cursors: common in procedural code; often needs set-based refactors for BigQuery cost/runtime.
- Session coupling: reliance on session variables, volatile tables, or execution context assumptions.
Why this breaks
Teradata procedural assets are rarely “just code.” They carry operational behavior from decades of platform constraints: macros used as reusable query units, stored procedures that manage control tables and restartability, and UDFs that encode business logic and typing assumptions. When migrated naïvely, routines may compile in BigQuery but drift in results or operational behavior.
Common symptoms after migration:
- Procedures compile but results drift due to NULL/type coercion differences inside branches
- Macro expansion behavior changes; parameter and identifier resolution becomes inconsistent
- Dynamic SQL behaves differently (quoting, binding, or execution context)
- Error handling and return codes no longer match, breaking orchestration and restart flows
- Side effects (control tables, audit logs) aren’t modeled, so reruns double-apply or silently skip
A successful migration turns implicit behavior into a testable contract and validates it under real rerun/backfill scenarios.
How conversion works
- Inventory & classify routines: stored procedures, macros, SQL UDFs, and their call graph (ETL jobs, BI tools, schedulers).
- Extract the behavior contract: inputs/outputs, side effects, state assumptions, error semantics, and performance constraints.
- Choose target form per asset:
- BigQuery SQL UDF for pure expressions
- BigQuery JavaScript UDF for complex transformation logic
- BigQuery stored procedure (SQL scripting) for multi-statement control flow
- Refactor to set-based SQL where procedural loops/cursors exist
- Translate macro patterns into views/UDFs/procedures with explicit parameter contracts.
- Validate with a harness: golden inputs/outputs, branch coverage, failure-mode tests, and side-effect assertions (control/audit tables).
Supported constructs
Representative Teradata procedural constructs we commonly migrate to BigQuery routines (exact coverage depends on your estate).
| Source | Target | Notes |
|---|---|---|
| Teradata SQL UDFs | BigQuery SQL UDFs | Pure expression logic mapped with explicit casts and NULL behavior. |
| Teradata stored procedures | BigQuery stored procedures (SQL scripting) | Control flow and side effects modeled explicitly. |
| Teradata macros | Views/UDFs/procedures with explicit parameter contracts | Macro expansion replaced by reusable, testable assets. |
| Control tables for restartability | Applied-window tracking + idempotency markers | Reruns are safe and auditable under failures/backfills. |
| Dynamic SQL patterns | EXECUTE IMMEDIATE with parameter binding | Normalize identifier resolution; reduce injection and drift. |
| Volatile tables inside routines | Temp tables / staged materializations | Concurrency-safe without hidden session coupling. |
How workload changes
| Topic | Teradata | BigQuery |
|---|---|---|
| Reusable logic | Macros and stored procedures often act as reusable building blocks | Routines must have explicit inputs/outputs and dependency contracts |
| State and restartability | Control tables + operational conventions encode rerun behavior | Idempotency markers + deterministic ordering enforced |
| Performance model | Procedural loops sometimes tolerated under platform constraints | Set-based refactors often required for cost/latency |
| Dynamic SQL | String-built SQL patterns common | Prefer parameter binding and explicit identifier rules |
Examples
Representative patterns for translating Teradata UDF/macro logic into BigQuery routines. Adjust datasets, types, and identifiers to match your environment.
-- Teradata UDF (illustrative)
-- (Exact syntax varies by TD version; treat as representative)
CREATE FUNCTION util.safe_div(n DECIMAL(18,6), d DECIMAL(18,6))
RETURNS DECIMAL(18,6)
RETURN CASE WHEN d IS NULL OR d = 0 THEN NULL ELSE n / d END;Common pitfalls
- Macro misuse: macros used as “functions” with hidden dependencies; migration must make dependencies explicit.
- Row-by-row loops/cursors: common in procedural code; often needs set-based refactors for BigQuery cost/runtime.
- Session coupling: reliance on session variables, volatile tables, or execution context assumptions.
- Exception semantics mismatch: differences in how failures propagate and what counts as success.
- Type drift in branches: CASE/IF returns mixed types; BigQuery may require explicit casts.
- Dynamic SQL injection risk: string concatenation without parameter binding and escaping.
- Side effects not modeled: control-table updates and audit logging missing, breaking restartability.
Validation approach
- Compile + interface checks: every routine compiles in BigQuery; argument/return types match the contract.
- Golden tests: curated input sets validate outputs, including NULL-heavy and boundary cases.
- Branch + failure-mode coverage: exercise key branches and expected failures (missing rows, invalid inputs).
- Side-effect verification: assert expected writes to control/log/audit tables and idempotency under retries.
- Integration replay: run routines within representative ETL slices and compare downstream KPI aggregates.
- Performance gate: confirm no hidden row-by-row scans; set-based refactors validated with cost/latency baselines.
Migration steps
- 01
Inventory routines and build the call graph
Collect Teradata stored procedures, macros, and UDFs, plus their callers and dependencies. Identify side effects: control tables, audit logs, temp/volatile usage, and dynamic SQL paths.
- 02
Define the behavior contract
For each asset, specify inputs/outputs, error behavior, side effects, restartability semantics, and performance expectations. Choose the BigQuery target form (UDF/view/procedure/refactor).
- 03
Convert logic with safety patterns
Rewrite control flow, casts, and dynamic SQL using parameter binding and explicit identifier rules. Replace cursor/loop logic with set-based SQL where feasible to control cost and runtime.
- 04
Build a validation harness
Create golden inputs, boundary cases, and failure-mode tests. Validate outputs and side effects deterministically so behavior isn’t debated during cutover.
- 05
Integrate, replay, and cut over
Run routines within representative ETL slices, compare downstream KPIs, and verify rerun/restart behavior. Cut over behind gates with rollback-ready criteria.
We inventory your Teradata procedures/macros/UDFs, choose the right BigQuery target form per asset, migrate a representative subset, and deliver a harness that proves parity-including side effects and restart behavior.
Get an actionable conversion plan, review markers for ambiguous intent, and validation artifacts so Teradata routine cutover is gated by evidence and rollback criteria.