Back to Projects
70% query performance improvement (p95: 42s → 11s), 40% cost reduction, 100TB migrated with zero downtime

100TB Warehouse Migration — Redshift & Oracle → Snowflake + BigQuery

Led migration of 100+ TB from on-premise Oracle and legacy AWS Redshift to Snowflake and BigQuery using dual-write validation strategy. Re-modeled physical layer with micro-partition clustering and incremental ELT using dbt. Achieved 70% query performance improvement (p95: 42s → 11s) and 40% cost reduction with zero-downtime cutover.

View on GitHub

Problem

Legacy Oracle and Redshift infrastructure struggling with growing data volumes. T+1 batch windows causing business delays. P95 query times at 42 seconds. No proper incremental ETL patterns. Rising infrastructure costs.

Solution

Implemented dual-write validation strategy allowing zero-downtime cutover. Re-architected physical data model with micro-partition clustering for Snowflake and column ordering for BigQuery. Converted all pipelines to incremental ELT using dbt. Established automated data validation framework.

Architecture

Source Systems (Oracle + Redshift) → Dual-write validation → Snowflake (micro-partitions) + BigQuery (columnar) → dbt incremental ELT → BI layer

Key Challenges

  • Ensuring data consistency during dual-write phase across 100+ tables
  • Optimizing Snowflake clustering keys and BigQuery partitioning for query patterns
  • Converting complex stored procedures to dbt models while maintaining logic
  • Achieving zero-downtime cutover with rollback plan for critical business operations

Tech Stack

SnowflakeBigQuerydbtApache AirflowAirbytePythonSQL