Validation & reconciliation for Snowflake -> BigQuery
Turn "looks close" into a measurable parity contract. We prove correctness for batch and incremental systems with golden queries, KPI diffs, and integrity simulations, then gate cutover with rollback-ready criteria.
- Input: Snowflake validation & reconciliation logic
- Output: BigQuery equivalent (validated)
Why this breaks
Most migration failures aren't syntax problems; they're undetected semantic drift. Teams cut over because queries compile, dashboards "look right," and spot checks pass. Weeks later, KPIs diverge because the correctness rules were implicit and never stress-tested.
Common drift drivers in Snowflake -> BigQuery:
- Implicit time contracts: Snowflake session timezone and TIMESTAMP_NTZ behavior vs BigQuery's explicit timezone conversions
- Type coercion differences: CASE/COALESCE branches and join keys silently cast differently
- Windowing ambiguity: top-1/QUALIFY logic without deterministic tie-breakers
- Semi-structured extraction: VARIANT access relies on implicit casts; BigQuery requires explicit JSON typing
- Incremental behavior: retries, reprocessing windows, and late-arrival corrections weren’t validated
Validation must treat the system as incremental + operational, not a one-time batch.
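The first drift driver is easy to reproduce outside either warehouse. A minimal Python sketch (the UTC-8 offset and timestamp are hypothetical) of how the same event lands in different daily partitions when Snowflake's session timezone is implicit but BigQuery's DATE(timestamp) defaults to UTC:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical late-evening event: 23:30 local time in a UTC-8 session timezone.
pst = timezone(timedelta(hours=-8))
event = datetime(2024, 3, 1, 23, 30, tzinfo=pst)

# Snowflake-style: DATE(ts) evaluated under the session timezone.
session_date = event.date()

# BigQuery-style: DATE(timestamp) defaults to UTC unless a zone is passed explicitly.
utc_date = event.astimezone(timezone.utc).date()

print(session_date, utc_date)  # the same row falls into two different partitions
```

This is exactly the class of drift that partition-level row counts (Gate 2) catch cheaply: boundary-day rows shift partitions, so daily counts disagree even though totals match.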
How conversion works
- Define the parity contract: which outputs must match (tables, KPIs, dashboards) and what tolerances apply (exact vs threshold).
- Build the validation dataset: golden inputs, edge cohorts (ties, null-heavy segments), and representative time windows (including boundary days).
- Run execution gates: compile + run checks, schema/type alignment, dependency readiness.
- Run parity gates: KPI aggregates, checksum aggregates, dimensional rollups, and targeted row-level sampling diffs.
- Validate incremental integrity (where applicable): idempotency reruns, late-arrival injections, and backfill simulations.
- Gate cutover: define pass/fail thresholds, canary strategy, rollback triggers, and monitoring for post-cutover regressions.
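The tolerance part of the parity contract can be codified as a small runnable check rather than left to debate. A minimal Python sketch, with hypothetical KPI names, values, and thresholds:

```python
def kpi_within_tolerance(source: float, target: float,
                         abs_tol: float = 0.0, rel_tol: float = 0.0) -> bool:
    """Pass if the diff is inside either the absolute or the relative threshold."""
    diff = abs(source - target)
    if diff <= abs_tol:
        return True
    denom = max(abs(source), abs(target))
    return denom > 0 and diff / denom <= rel_tol

# Hypothetical parity contract: revenue must match within 0.1%, counts exactly.
contract = {
    "revenue":     {"abs_tol": 0.0, "rel_tol": 0.001},
    "order_count": {"abs_tol": 0.0, "rel_tol": 0.0},
}
snowflake = {"revenue": 1_000_000.00, "order_count": 52_341}
bigquery  = {"revenue": 1_002_000.00, "order_count": 52_341}

results = {
    kpi: kpi_within_tolerance(snowflake[kpi], bigquery[kpi], **tol)
    for kpi, tol in contract.items()
}
print(results)  # revenue drift (~0.2%) exceeds the 0.1% threshold and fails
```

Writing thresholds down this way turns "is 0.2% small?" from an argument into a pass/fail result with a named owner.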
Supported constructs
Representative validation and reconciliation mechanisms we apply in Snowflake -> BigQuery migrations.
| Source | Target | Notes |
|---|---|---|
| Golden queries / reports | Golden query harness + repeatable parameter sets | Codifies business sign-off into runnable tests. |
| Row counts and profiles | Partition-level counts + null/min/max/distinct profiles | Cheap, high-signal early drift detection. |
| KPI validation | Aggregate diffs by key dimensions + tolerance thresholds | Aligns validation with business outcomes. |
| Row-level diffs | Targeted sampling diffs + edge cohort tests | Use deep diffs only where aggregates signal drift. |
| Incremental pipelines | Idempotency reruns + late-arrival/backfill simulations | Validates behavior under retries and late data. |
| Operational sign-off | Canary gates + rollback criteria + monitors | Prevents "successful cutover" from becoming a long tail of KPI debates. |
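The row-count and profile mechanisms in the table reduce to very little code. A minimal Python sketch of a Gate 2-style column profile diff (the column name and sample rows are hypothetical):

```python
def profile(rows, column):
    """Cheap structural-parity profile for one column: count, nulls, min/max, distinct."""
    values = [r[column] for r in rows]
    non_null = [v for v in values if v is not None]
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "distinct": len(set(non_null)),
    }

source = [{"amount": 10}, {"amount": None}, {"amount": 25}]
target = [{"amount": 10}, {"amount": None}, {"amount": 30}]  # drifted value

src_profile, tgt_profile = profile(source, "amount"), profile(target, "amount")
drift = {k for k in src_profile if src_profile[k] != tgt_profile[k]}
print(drift)  # only the max differs, so deep row-level diffs can target that cohort
```

The same shape works when the profiles come from SQL aggregates on each side; only the drifted keys warrant expensive row-level diffs.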
How workload changes
| Topic | Snowflake | BigQuery |
|---|---|---|
| Drift drivers | Session-level time & implicit casting often tolerated | Explicit timezone conversions and casting required |
| Cost of validation | Warehouse credits; some diff styles cheaper | Bytes scanned + slot time; design layered gates |
| Incremental semantics | Streams/Tasks often encode retry/late behavior | Behavior must be simulated explicitly |
Examples
Illustrative patterns for parity checks in BigQuery. Replace datasets, keys, and KPI definitions to match your migration.
-- Partition/window row counts (BigQuery)
SELECT
  DATE(event_ts) AS d,
  COUNT(*) AS row_count  -- ROWS is a reserved keyword in GoogleSQL
FROM `proj.mart.fact_orders`
WHERE DATE(event_ts) BETWEEN @start_date AND @end_date
GROUP BY 1
ORDER BY 1;

Common pitfalls
- Over-reliance on spot checks: a few sampled rows miss drift in edge cohorts and ties.
- No tolerance model: teams argue over “small” diffs because thresholds were never defined.
- Ignoring incremental behavior: parity looks fine on a static snapshot but fails under retries/late data.
- Unstable ordering: window functions without complete ORDER BY; results drift under parallelism.
- Wrong level of comparison: comparing raw rows when the business cares about KPI rollups (or vice versa).
- Cost-blind validation: exhaustive row-level diffs can become expensive; use layered gates (cheap->deep).
Validation approach
Gate set (layered)
Gate 0 - Readiness
- BigQuery datasets, permissions, and target schemas exist
- Dependencies (UDFs/routines, reference tables) are deployed
Gate 1 - Execution
- Converted jobs compile and run reliably
- Deterministic ordering + explicit casts enforced in high-risk areas
Gate 2 - Structural parity
- Row counts by partition/window
- Null/min/max/distinct profiles for key columns
Gate 3 - KPI parity
- KPI aggregates by key dimensions
- Top-N rankings and funnel metrics validated on edge windows
Gate 4 - Integrity (incremental systems)
- Idempotency: rerun same batch -> no net change
- Late-arrival: inject late updates -> only expected rows change
- Backfill: historical windows replay without drift
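The Gate 4 checks can be exercised against a toy keyed upsert before wiring them to real pipelines. A minimal Python sketch (MERGE-style upsert semantics assumed; IDs and row shapes are hypothetical):

```python
def apply_batch(state: dict, batch: list) -> dict:
    """Keyed upsert (MERGE-style): later versions of a key replace earlier ones."""
    new_state = dict(state)
    for row in batch:
        new_state[row["id"]] = row
    return new_state

state = {1: {"id": 1, "amount": 10}}
batch = [{"id": 1, "amount": 12}, {"id": 2, "amount": 7}]

once = apply_batch(state, batch)
twice = apply_batch(once, batch)   # idempotency rerun: apply the same batch again
assert once == twice               # Gate 4 pass: no net change on rerun

# Late-arrival injection: only the corrected key should change.
late = apply_batch(twice, [{"id": 1, "amount": 11}])
changed = {k for k in late if late[k] != twice.get(k)}
assert changed == {1}              # Gate 4 pass: blast radius limited to the late row
```

Append-only or non-keyed loads fail the first assertion immediately, which is precisely the behavior that surfaces as duplicate rows after a production retry.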
Gate 5 - Cutover & monitoring
- Canary rollout criteria + rollback triggers
- Post-cutover monitors: latency, scan bytes/slot time, diff sentinels on critical KPIs
Migration steps
- 01
Define the parity contract
Decide what must match (tables, dashboards, KPIs), at what granularity, and with what tolerance thresholds. Identify golden queries and business sign-off owners.
- 02
Create validation datasets and edge cohorts
Select representative time windows, boundary days, and cohorts that trigger edge behavior (ties, null-heavy segments, timezone boundaries, schema drift).
- 03
Implement layered gates
Start with cheap checks (counts/profiles), then KPI diffs, then deep diffs only where needed. Codify gates into runnable jobs so validation is repeatable.
- 04
Validate incremental integrity
Run idempotency reruns, late-arrival injections, and backfill simulations. These are the scenarios that typically break after cutover if not tested.
- 05
Gate cutover and monitor
Establish canary/rollback criteria and post-cutover monitors for critical KPIs and pipeline health (latency, failures, scan bytes/slot time).
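The layered, cheap-to-deep ordering that runs through these steps can be sketched as a short-circuiting gate runner. A minimal Python illustration with stubbed gate checks (names and results are hypothetical):

```python
def run_gates(gates):
    """Run cheap gates first; stop at the first failure so deep, costly
    diffs only execute when the earlier signals are clean."""
    for name, check in gates:
        if not check():
            return f"FAIL at {name}"
    return "PASS"

gates = [
    ("gate2_row_counts", lambda: True),
    ("gate3_kpi_parity", lambda: False),  # simulated KPI drift
    ("gate4_integrity",  lambda: True),   # never reached: spend stops early
]
print(run_gates(gates))  # FAIL at gate3_kpi_parity
```

In practice each stub becomes a parameterized BigQuery job, and the runner's output is the sign-off artifact attached to the cutover decision.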
We define your parity contract, build the golden-query set, and implement layered reconciliation gates, including reruns and late-data simulations, so KPI drift is caught before production cutover.
Get a validation plan, runnable gates, and sign-off artifacts (diff reports, thresholds, monitors) so the Snowflake -> BigQuery cutover is controlled and dispute-proof.