Workload

Snowflake stored procedures & UDFs to BigQuery

Re-home procedural logic (control flow, dynamic SQL, exception handling, and reusable UDFs) from Snowflake Scripting into BigQuery routines with an explicit behavior contract and validation harness.

At a glance
Input
Snowflake stored procedure / UDF logic
Output
BigQuery equivalent (validated)
Common pitfalls
  • Row-by-row loops: cursor/loop patterns that should become set-based SQL (performance + cost).
  • Session state coupling: reliance on session variables, temporary objects, or execution context.
  • Dynamic SQL injection risk: string concatenation without parameter binding and escaping.
Context

Why this breaks

Stored procedures and UDFs are where Snowflake estates hide their most fragile behavior: dynamic SQL, exception handling, session state assumptions, and side effects like logging, temp tables, or control tables. BigQuery has strong support for routines, but the execution model differs, so direct translation is rarely enough.

Common symptoms after migration:

  • Procedures compile but produce different results because NULL/cast semantics changed inside branches
  • Dynamic SQL behavior drifts due to differences in quoting, parameter binding, or identifier resolution
  • Error handling no longer matches (silent continue vs fail-fast), breaking downstream orchestration
  • Temp table/session-variable assumptions don’t hold, causing intermittent failures under concurrency
  • UDF outputs drift because implicit typing/collation or timezone assumptions were never explicit
Approach

How conversion works

  1. Inventory & classify routines (Snowflake procedures, SQL UDFs, JavaScript UDFs), call graph, and dependency surfaces (tables, views, tasks, control tables).
  2. Extract the behavior contract: inputs/outputs, side effects, error handling rules, state assumptions, and performance constraints.
  3. Choose target form per routine:
    • BigQuery SQL UDF for pure expressions
    • BigQuery JavaScript UDF for complex logic with arrays/objects
    • BigQuery Stored procedure (SQL scripting) for control flow and multi-statement logic
    • Refactor to set-based SQL where procedural loops can be eliminated
  4. Rewrite dynamic SQL safely using parameter binding patterns and explicit identifier rules.
  5. Validate with a harness: golden inputs/outputs, branch coverage, failure-mode tests, and replayable datasets for deterministic sign-off.
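
The dynamic-SQL rewrite in step 4 can be sketched as follows. This is a minimal illustration, not the full conversion pattern; the `sales.orders` table, `region` column, and variable names are hypothetical.

-- Unsafe pattern carried over from Snowflake: string concatenation
-- EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM sales.orders WHERE region = ''' || region_arg || '''';

-- Safer BigQuery pattern: bind values as query parameters
DECLARE region_arg STRING DEFAULT 'EMEA';
DECLARE row_count INT64;

EXECUTE IMMEDIATE
  'SELECT COUNT(*) FROM sales.orders WHERE region = @region'
  INTO row_count
  USING region_arg AS region;

Binding values keeps user input out of the SQL text entirely; only identifiers (tables, columns) should ever be interpolated, and those against an explicit allowlist.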

Supported constructs

Representative Snowflake procedure/UDF constructs we commonly migrate to BigQuery routines (exact coverage depends on your estate).

Source | Target | Notes
Snowflake SQL UDFs | BigQuery SQL UDFs | Pure expressions mapped with explicit casts and NULL behavior.
Snowflake JavaScript UDFs | BigQuery JavaScript UDFs | Used when logic is non-trivial or object/array shaping is required.
Snowflake stored procedures (Scripting) | BigQuery stored procedures (SQL scripting) | Control flow rewritten; state and side effects modeled explicitly.
Dynamic SQL (EXECUTE IMMEDIATE) | EXECUTE IMMEDIATE with parameter binding patterns | Avoid string injection; normalize identifier resolution rules.
Exception handling / return codes | BigQuery error handling + explicit success/fail contract | Fail-fast vs continue-on-error is made explicit per routine.
Temp tables / staging in procedures | Temporary tables + scoped datasets / refactored staging | Concurrency-safe and replayable without hidden session coupling.
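
For the exception-handling row above, BigQuery's scripting exception block makes the fail-fast contract explicit. A minimal sketch, assuming a hypothetical `audit.runs` side-effect table:

-- BigQuery SQL scripting: explicit fail-fast with a logged error
BEGIN
  INSERT INTO audit.runs (run_id, status) VALUES (GENERATE_UUID(), 'start');
EXCEPTION WHEN ERROR THEN
  -- surface the message, then re-raise so orchestration sees the failure
  SELECT @@error.message;
  RAISE;
END;

A continue-on-error contract would instead record the failure and omit RAISE; the point is that the choice is written down per routine, not inherited implicitly.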

How workload changes

Topic | Snowflake | BigQuery
Execution model | Procedures often rely on session context and implicit state | Routines should be explicit about state, inputs, and side effects
Dynamic SQL | Concatenated SQL strings common | Prefer parameter binding and normalized identifier rules
Performance | Row-by-row loops sometimes tolerated | Set-based refactors usually required for cost/latency
Error handling | TRY/CATCH patterns and return codes vary by team | Explicit failure semantics + harnessed tests
Execution model: We convert hidden assumptions into testable contracts.
Dynamic SQL: Reduces drift and injection risk.
Performance: Procedural loops are often the largest hidden cost after migration.
Error handling: Downstream orchestration depends on predictable failures.
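
For the performance topic above, the typical refactor replaces a per-row loop with one set-based statement. A sketch with hypothetical `core.orders` and `staging.orders` tables:

-- Instead of a Snowflake Scripting cursor loop...
-- FOR rec IN (SELECT id, amount FROM staging.orders) DO
--   UPDATE core.orders SET amount = rec.amount WHERE id = rec.id;
-- END FOR;

-- ...apply the whole set at once in BigQuery:
MERGE core.orders AS t
USING staging.orders AS s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.amount = s.amount
WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amount);

One MERGE scans each table once; the loop issues one mutation per row, which BigQuery prices and schedules far less favorably.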

Examples

Representative patterns for translating Snowflake scripting and UDF logic into BigQuery routines. Adjust datasets, types, and identifiers to match your environment.

-- Snowflake SQL UDF
CREATE OR REPLACE FUNCTION util.safe_div(n NUMBER, d NUMBER)
RETURNS NUMBER
AS
$$
  IFF(d = 0 OR d IS NULL, NULL, n / d)
$$;
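
A BigQuery equivalent might look like the following. The `util` dataset mirrors the Snowflake schema name and is an assumption; SAFE_DIVIDE already returns NULL on division by zero and propagates NULL inputs, matching the IFF logic above.

-- BigQuery SQL UDF (equivalent behavior)
CREATE OR REPLACE FUNCTION util.safe_div(n NUMERIC, d NUMERIC)
RETURNS NUMERIC
AS (
  SAFE_DIVIDE(n, d)
);

Note the explicit NUMERIC types: Snowflake's NUMBER defaults to NUMBER(38,0), so the target precision/scale should be confirmed against the behavior contract rather than assumed.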
Avoid

Common pitfalls

  • Row-by-row loops: cursor/loop patterns that should become set-based SQL (performance + cost).
  • Session state coupling: reliance on session variables, temporary objects, or execution context.
  • Dynamic SQL injection risk: string concatenation without parameter binding and escaping.
  • Exception semantics mismatch: differences in TRY/CATCH, RAISE, return codes, and what counts as “success.”
  • Type drift in CASE/IF branches: branches return mixed types; BigQuery may require explicit casts.
  • Timezone assumptions: using session timezone or NTZ/LTZ behavior implicitly inside routines.
  • Side effects not modeled: logging/control tables updated differently, breaking idempotency or audit trails.
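
The type-drift pitfall can be made concrete: when branches return mixed types, make the common type explicit rather than relying on coercion. Column and table names here are illustrative.

-- Explicit casts keep branch types aligned across engines
SELECT
  CASE
    WHEN status = 'open' THEN CAST(amount AS NUMERIC)
    ELSE CAST(0 AS NUMERIC)
  END AS open_amount
FROM sales.orders;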
Proof

Validation approach

  • Compile + unit checks: every routine compiles in BigQuery; interfaces (args/return types) match the contract.
  • Golden tests: curated input sets validate outputs for each routine (including boundary values and NULL-heavy cases).
  • Branch + failure-mode coverage: exercise key branches and expected failures (invalid inputs, missing rows, constraint violations).
  • Side-effect verification: assert expected writes to audit/control/log tables and idempotency behavior under retries.
  • Integration replay: run routines as part of a representative pipeline slice and compare downstream KPIs/aggregates.
  • Performance gate: confirm no hidden row-by-row scans; set-based refactors validated with cost/latency baselines.
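
A minimal golden test can be written as a symmetric-difference query that must return zero rows. The `tests.golden_inputs` and `tests.golden_expected` tables and the `util.safe_div` routine are assumptions for illustration:

-- Golden test: any returned row is a parity failure
(SELECT n, d, util.safe_div(n, d) AS result FROM tests.golden_inputs
 EXCEPT DISTINCT
 SELECT n, d, expected FROM tests.golden_expected)
UNION ALL
(SELECT n, d, expected FROM tests.golden_expected
 EXCEPT DISTINCT
 SELECT n, d, util.safe_div(n, d) FROM tests.golden_inputs);

Checking both directions catches missing rows as well as wrong values, which a one-sided EXCEPT would miss.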
Execution

Migration steps

A sequence that keeps behavior explicit, testable, and safe to cut over.
  1. Inventory routines and build the call graph

    Collect procedures/UDFs, their callers, dependent objects, and execution contexts (tasks, dbt, apps). Identify side effects: writes to control/log tables, temp objects, and dynamic SQL paths.

  2. Define the behavior contract

    For each routine, specify inputs/outputs, expected error behavior, state assumptions, and performance expectations. Decide the target form (SQL UDF, JS UDF, stored procedure, or refactor to set-based SQL).

  3. Convert logic with safety patterns

    Rewrite control flow, casts, and dynamic SQL with parameter binding and explicit identifier rules. Replace cursor/loop logic with set-based SQL where feasible to control cost and runtime.

  4. Build a validation harness

    Create golden test inputs, boundary cases, and failure-mode tests. Validate outputs and side effects deterministically so behavior is not debated during cutover.

  5. Integrate, replay, and cut over

    Run routines within representative pipelines, compare downstream KPIs, and verify operational behavior (retries, alerts, idempotency). Cut over behind gates with rollback-ready criteria.

Workload Assessment
Migrate procedural logic with a test harness

We inventory your Snowflake procedures and UDFs, choose the right BigQuery target form per routine, migrate a representative subset, and deliver a harness that proves parity, including side effects and failure behavior.

Migration Acceleration
Cut over routines with proof-backed sign-off

Get an actionable conversion plan, review markers for ambiguous intent, and validation artifacts so stored procedure and UDF cutover is gated by evidence and rollback criteria.

FAQ

Frequently asked questions

Do we have to rewrite everything as BigQuery stored procedures?
Not necessarily. Many Snowflake routines can be simplified into set-based SQL, SQL UDFs, or small BigQuery procedures. We choose the target form per routine to minimize operational risk and cost.
What about dynamic SQL-heavy procedures?
We migrate dynamic SQL using safe patterns: normalized identifier rules, parameter binding, and explicit escaping. Where dynamic SQL is overused, we often refactor into deterministic templates.
How do you prove behavior parity?
We build a replayable harness: golden inputs/outputs, branch coverage, expected failure cases, and side-effect checks. Integration replay validates downstream metrics before cutover.
Will performance change after migration?
It can. Row-by-row loops are the most common risk. We refactor to set-based SQL where possible and gate performance with cost/latency baselines so routines don’t become hidden BigQuery spend.