Open to Senior DE Roles
Senior · Staff · Principal Data Engineer

Pipelines that
scale. Systems
that don't fail.

6+ years owning production data infrastructure that processes 1B+ events/day — Kafka streaming, Databricks lakehouses, and 100TB+ warehouse migrations on AWS · GCP · Databricks. Correctness and fault tolerance are non-negotiable.

Kafka → Spark Streaming: 1B+ events/day, exactly-once, sub-5s latency on AWS EMR
Databricks medallion lakehouse: 60% faster pipelines, 50+ sources unified, Unity Catalog
100TB warehouse migration: Redshift → Snowflake, p95 query time 42s → 11s, zero downtime

Vasudev Rao

Data Engineer · 6+ Years

1B+

Events/Day

< 5s

Stream Latency

99.9%

Pipeline SLA

Book Strategy Audit
Databricks
Apache Spark
PySpark
Snowflake
BigQuery
Delta Lake
Apache Kafka
Apache Airflow
dbt
AWS Glue
GCP Dataflow
Apache Iceberg
PostgreSQL
Python
Terraform
Docker
Kubernetes
Great Expectations

What I Deliver

Stack, Skills & Experience

Processing 50TB+ of batch and streaming data daily across production pipelines on AWS and GCP.

1B+

Events / Day

Kafka → Spark Structured Streaming → Delta Lake

Exactly-once delivery · late-event watermarks · sub-5s latency

Apache Kafka · Spark SS · Delta Lake · AWS EMR

40%

Platform Cost Saved

Spark optimization + storage tiering + auto-remediation

Isolation Forest anomaly detection · 20+ Databricks workspaces · 90 days

Databricks · Terraform · AWS Cost Explorer · GCP

99.9%

Pipeline SLA

Great Expectations · self-healing · auto-restart

Structured alerting · PagerDuty · data quality gates at every layer

Great Expectations · Airflow · PagerDuty

Core Specialisms

Streaming Systems

End-to-end Kafka → Spark Structured Streaming pipelines with exactly-once delivery, watermark-based late-event handling, and idempotent MERGE writes to Delta Lake.

Apache Kafka · Spark SS · Delta Lake · AWS EMR
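The idempotent-upsert idea can be shown in a minimal pure-Python sketch. In production this is a Delta Lake MERGE inside Spark Structured Streaming; the event fields and tolerance value here are illustrative stand-ins.

```python
from datetime import datetime, timedelta

# Simulated idempotent MERGE: upsert events keyed by event_id so replayed or
# duplicated events never produce duplicate rows. A watermark drops events
# arriving later than the tolerance window.

WATERMARK = timedelta(minutes=30)  # late-event tolerance, as in the pipeline

def merge_batch(table: dict, batch: list, max_seen: datetime) -> datetime:
    """Upsert a micro-batch into `table` (event_id -> row), applying the watermark."""
    for event in batch:
        max_seen = max(max_seen, event["ts"])
        if event["ts"] < max_seen - WATERMARK:
            continue  # outside the watermark: routed to a DLQ in production
        table[event["event_id"]] = event  # idempotent: re-merging a duplicate is a no-op
    return max_seen

table: dict = {}
t0 = datetime(2024, 1, 1, 12, 0)
batch = [
    {"event_id": "a", "ts": t0, "amount": 10},
    {"event_id": "a", "ts": t0, "amount": 10},                      # duplicate: merged away
    {"event_id": "b", "ts": t0 - timedelta(hours=2), "amount": 5},  # too late: dropped
]
max_seen = merge_batch(table, batch, max_seen=t0)
print(len(table))  # 1: only the deduplicated, in-watermark event survives
```

Because the upsert is keyed, at-least-once delivery from Kafka plus this merge yields effectively-once results in the sink.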

Lakehouse Architecture

Bronze → Silver → Gold medallion architecture on Databricks. Unity Catalog governance, dbt schema contracts, Great Expectations quality gates, and Photon-powered Gold layer.

Databricks · Delta Lake · dbt · Unity Catalog
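The quality-gate pattern between Bronze and Silver can be sketched in plain Python. Production uses Great Expectations suites wired into dbt models; the contract fields below are hypothetical examples.

```python
# Simplified Silver-layer quality gate: rows that violate the schema contract
# are quarantined instead of flowing downstream.

CONTRACT = {
    "user_id": lambda v: isinstance(v, str) and v != "",
    "amount":  lambda v: isinstance(v, (int, float)) and v >= 0,
}

def quality_gate(rows):
    silver, quarantine = [], []
    for row in rows:
        ok = all(name in row and check(row[name]) for name, check in CONTRACT.items())
        (silver if ok else quarantine).append(row)
    return silver, quarantine

bronze = [
    {"user_id": "u1", "amount": 42.0},
    {"user_id": "", "amount": 10},    # violates contract: empty key
    {"user_id": "u2", "amount": -5},  # violates contract: negative amount
]
silver, quarantine = quality_gate(bronze)
print(len(silver), len(quarantine))  # 1 2
```

Quarantining rather than failing the whole load is what lets a gate sit at every layer without turning one bad row into an outage.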

Warehouse Modernisation

Redshift / Oracle → Snowflake + BigQuery migrations using dual-write validation strategy. Zero-downtime cutover, incremental ELT redesign, and physical data model optimisation.

Snowflake · BigQuery · dbt · Airbyte
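The core of dual-write validation is reconciling each partition between the legacy and new warehouse before cutover. A sketch of that check, assuming row count plus an order-independent checksum as the comparison (actual tooling compared Redshift/Oracle against Snowflake/BigQuery):

```python
import hashlib

def partition_fingerprint(rows):
    """Row count + order-independent checksum for one partition."""
    digest = 0
    for row in rows:
        h = hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()
        digest ^= int(h, 16)  # XOR makes the checksum insensitive to row order
    return len(rows), digest

def reconcile(legacy_rows, new_rows):
    """A partition is safe to cut over only when both sides match exactly."""
    return partition_fingerprint(legacy_rows) == partition_fingerprint(new_rows)

legacy = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
migrated = [{"id": 2, "v": "b"}, {"id": 1, "v": "a"}]  # same data, different order
print(reconcile(legacy, migrated))  # True
```

Running this per partition, per load, for the whole dual-write window is what makes a zero-downtime cutover defensible rather than hopeful.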

Cost Engineering

Automated frameworks for Spark cluster rightsizing, S3 → Glacier storage tiering, and cross-workspace cost anomaly detection. 40% platform spend reduction in 90 days.

Terraform · Python · Databricks · AWS + GCP
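A minimal sketch of the cost-anomaly check, simplified to a trailing z-score rule in place of the Isolation Forest model used in production; the spend numbers are made up.

```python
import statistics

# Flag a workspace's daily spend when it deviates from its trailing baseline
# by more than `threshold` standard deviations.

def spend_anomalies(daily_spend, window=7, threshold=3.0):
    flags = []
    for i in range(window, len(daily_spend)):
        baseline = daily_spend[i - window:i]
        mean, stdev = statistics.mean(baseline), statistics.pstdev(baseline)
        if stdev > 0 and abs(daily_spend[i] - mean) / stdev > threshold:
            flags.append(i)  # day index worth an auto-remediation ticket
    return flags

spend = [100, 102, 98, 101, 99, 103, 100, 480, 101]  # day 7 is a cost spike
print(spend_anomalies(spend))  # [7]
```

An Isolation Forest earns its keep once you score many workspaces on multiple features at once (cluster hours, storage, egress); for a single spend series the baseline rule above is often enough.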

Production Data Stack

Languages & Processing

Python · PySpark · Apache Spark · SQL

Streaming & Messaging

Apache Kafka · Spark Structured Streaming · GCP Pub/Sub · AWS Kinesis

Lakehouse & Storage

Databricks · Delta Lake · Apache Iceberg · AWS S3 · GCS

Warehousing

Snowflake · BigQuery · Redshift · PostgreSQL

Orchestration & Quality

Apache Airflow · dbt · Great Expectations · MLflow

Cloud & Infrastructure

AWS · GCP · Terraform · Docker · Kubernetes · AWS Glue · GCP Dataflow

Experience

Senior Data Engineer · Fintech Platform (100K+ apps/day) · 2023–Present

Kafka → PySpark real-time credit decisioning. 48h batch → <2min streaming. 200+ risk features computed in-flight. 95%+ model accuracy at 100K+ apps/day.

Kafka · PySpark · Spark SS · PostgreSQL · AWS
Data Platform Engineer · Enterprise Data Platform (50+ sources) · 2022–2023

Unified 50+ AWS Glue jobs into Databricks medallion lakehouse. 60% pipeline runtime reduction. Unity Catalog governance across 8 teams. Zero schema conflicts.

Databricks · Delta Lake · dbt · Unity Catalog · AWS Glue
Data Engineering Lead · Cloud Migration Programme (100TB) · 2020–2022

Redshift + Oracle → Snowflake + BigQuery. Dual-write validation, zero downtime. p95 query time 42s → 11s. 40% cost reduction.

Snowflake · BigQuery · dbt · Airflow · Airbyte

Featured Work

Selected Projects

< 5s Latency · 0 Data Loss

10M events/day Kafka → Spark Streaming Pipeline

Exactly-once · Late-event watermarks · Sub-5s end-to-end latency · AWS EMR

Production Kafka → Spark Structured Streaming pipeline at 10M+ events/day with exactly-once delivery to Delta Lake. Implemented watermark-based late-event handling (30-min tolerance), idempotent MERGE upserts, and a dead-letter queue with automatic replay. Reduced end-to-end data latency from 8 minutes to under 5 seconds.

Apache Kafka · Spark Structured Streaming · Delta Lake · AWS EMR · PySpark
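The dead-letter-queue-with-replay pattern mentioned above can be sketched in plain Python. Failed events are parked with an attempt counter and retried; because the sink upsert is idempotent, replays cannot create duplicates. The failure mode and field names are illustrative, not the production schema.

```python
def process(event, sink):
    if event.get("amount") is None:
        raise ValueError("missing amount")  # illustrative failure mode
    sink[event["event_id"]] = event         # idempotent upsert into the sink

def run_with_dlq(events, sink, dlq, max_attempts=3):
    """Process events; park failures in the DLQ until max_attempts is reached."""
    for event in events:
        try:
            process(event, sink)
        except ValueError:
            event["attempts"] = event.get("attempts", 0) + 1
            if event["attempts"] < max_attempts:
                dlq.append(event)  # parked for automatic replay

sink, dlq = {}, []
run_with_dlq([{"event_id": "x", "amount": None}], sink, dlq)
dlq[0]["amount"] = 7       # an upstream fix lands before the replay cycle
replay, dlq = dlq, []
run_with_dlq(replay, sink, dlq)
print(len(sink), len(dlq))  # 1 0: the event was recovered, the DLQ drained
```

Capping attempts keeps a poison message from cycling forever; in the real pipeline the DLQ is a Kafka topic and the replay is a scheduled job.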
[Pipeline diagram — exactly-once streaming: Kafka (10M+ events/day) → Spark SS (late-event handling · dedup) → Delta Lake (exactly-once writes) → Looker (< 5s freshness). Key guarantees: exactly-once, late events tolerated, auto-dedup. Minutes → < 5s latency · 10M+ events/day · 0 data loss.]
-60% Pipeline Runtime

Enterprise Lakehouse — Databricks Medallion Architecture

50+ siloed AWS Glue jobs → Bronze/Silver/Gold · Unity Catalog · 60% faster pipelines

Replaced a fragmented multi-warehouse topology (50+ isolated AWS Glue jobs, 8 engineering teams, no shared catalog) with a unified Delta Lake medallion architecture on Databricks. Unity Catalog for governance, automated schema contracts via dbt, Photon-powered Gold layer for BI and ML. Pipeline runtime dropped 60%, schema conflicts eliminated.

Databricks · Delta Lake · PySpark · Unity Catalog · dbt · Apache Spark
[Architecture diagram — Delta Lake medallion: Kafka events, S3 raw, DB CDC, and 50+ APIs feed Bronze (raw · CDC · ingest) → Silver (clean · typed · tested) → Gold (KPIs · features · marts), governed by Unity Catalog and dbt models, served via the Photon engine and Databricks SQL. −60% runtime · 50+ sources unified · 0 schema conflicts.]
[Migration diagram — before: Redshift + Oracle, T+1 loads, full scans, 100TB. After (dual-write migration, zero downtime): Snowflake (micro-partition clustering · auto-suspend · zero-copy clone) and BigQuery (columnar + partitioned, 42s → 11s p95 queries), with dbt + Airflow incremental ELT and dual-write validation. 70% query perf · 40% cost saved · 100TB migrated.]
70% Faster Queries

100TB Warehouse Migration — Redshift & Oracle → Snowflake + BigQuery

Dual-write validation · Zero downtime · p95 query time 42s → 11s

Led migration of 100+ TB from on-premise Oracle and legacy AWS Redshift to Snowflake and BigQuery using a dual-write validation strategy. Re-modelled physical layer (micro-partition clustering, column ordering, incremental ELT with dbt). Achieved 70% query performance improvement and 40% cost reduction with zero-downtime cutover.

Snowflake · BigQuery · dbt · Airflow
[Architecture diagram — ML feature store, dual-mode serving: a Databricks feature store (1,000+ features, point-in-time correct) backs an offline store (Delta Lake · PySpark) feeding training jobs with MLflow tracking, and an online store (Redis · p99 < 8ms) feeding the inference API. 1,000+ features · 4 ML teams · p99 < 8ms online serving.]
p99 < 8ms · 4 Teams Served

ML Feature Store — 1,000+ Features, p99 < 8ms Online Serving

Dual-mode offline/online · Point-in-time correct · Zero training-serving skew

Centralised dual-mode feature platform on Databricks: Delta Lake offline store (point-in-time correct for training) and Redis online store (p99 < 8ms for inference) backed by identical feature definitions. Eliminated training-serving skew across 4 ML teams, cut feature engineering time from days to hours.

Databricks Feature Store · MLflow · PySpark · Redis
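"Point-in-time correct" reduces to one rule: for a training label at time t, serve the latest feature value computed at or before t, never a future one, which would leak information into training. A small sketch of that lookup (the feature history and dates are made up; production stores this in Delta Lake keyed by entity and timestamp):

```python
import bisect
from datetime import datetime

def as_of(history, t):
    """history: time-sorted list of (timestamp, value). Return the value as of t."""
    times = [ts for ts, _ in history]
    i = bisect.bisect_right(times, t)          # first entry strictly after t
    return history[i - 1][1] if i else None    # latest entry at or before t

history = [
    (datetime(2024, 1, 1), 0.10),
    (datetime(2024, 2, 1), 0.25),
    (datetime(2024, 3, 1), 0.40),
]
print(as_of(history, datetime(2024, 2, 15)))  # 0.25, the February value, not March's
```

Using the same `as_of` semantics for offline training joins and online serving lookups is what eliminates training-serving skew: both paths resolve a feature the same way.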
[Pipeline diagram — before: overnight batch scoring at 2am, 48h decision latency. After (real-time decisioning, < 2 minutes): applicants (100K+/day) → Kafka events → PySpark feature engineering → ML score (95% accuracy). 48h → < 2 min · 100K+ apps/day · 95%+ model accuracy maintained.]
48h → < 2min Decisioning

Real-Time Credit Decisioning — 48h Batch → < 2min Streaming

Kafka + PySpark micro-batch feature engineering · 100K+ apps/day · 95%+ accuracy

Replaced overnight batch credit scoring with a Kafka-driven real-time pipeline. PySpark micro-batch feature engineering computes 200+ credit risk signals in real time, integrated with a REST model serving layer. Reduced decisioning latency from 48 hours to under 2 minutes while maintaining 95%+ model accuracy at 100K+ applications/day.

PySpark · Apache Kafka · PostgreSQL · AWS

System Design

Architecture Patterns

Patterns I design, operate, and own in production — built for correctness, scale, and operational simplicity.

Streaming

Lambda + Kappa Architecture

Unified batch + streaming pipelines. Kafka for real-time ingestion, Spark Structured Streaming for micro-batch processing, Delta Lake for atomic writes. Exactly-once via idempotent MERGE operations.

1. Kafka (RF=3)
2. Spark SS + Watermarks
3. Delta Lake MERGE
4. Looker / BI
Lakehouse

Medallion Lakehouse

Bronze → Silver → Gold architecture on Databricks. Raw CDC in Bronze, type-safe contracts in Silver via dbt + Great Expectations, business-ready aggregations in Gold served via Photon.

1. Bronze (raw · CDC)
2. Silver (dbt · tested)
3. Gold (Photon · KPIs)
4. Unity Catalog lineage
ML Infra

Dual-Mode Feature Store

Point-in-time correct offline training store (Delta Lake) and low-latency online serving store (Redis, p99 < 8ms) backed by identical feature definitions. Eliminates training-serving skew.

1. PySpark batch → Delta
2. Redis online serving
3. MLflow lineage
4. p99 < 8ms inference

Open to Senior Data Engineering Roles

Let's build data systems
that don't break at 3am.

Targeting senior and staff-level data engineering roles — fintech, ML infrastructure, and platform teams. Also available for consulting on streaming architecture, Spark performance, and lakehouse design.

✓ Apache Spark / PySpark ✓ Kafka + Flink ✓ Databricks / Delta Lake ✓ Snowflake / BigQuery ✓ AWS · GCP