40% platform spend reduction in 90 days, 90% reduction in idle resource costs, full cost attribution per team

Cost Engineering Framework — 40% Platform Spend Reduction

An automated framework for Spark cluster rightsizing, S3-to-Glacier storage tiering, and cross-workspace cost anomaly detection with an Isolation Forest model. It includes centralized cost analytics that aggregate usage data from AWS Cost Explorer, Databricks, and Snowflake. The framework achieved a 40% reduction in platform spend within 90 days.


Problem

Cloud data costs were growing 40% year over year with no visibility into spend drivers. Idle Databricks clusters ran 24/7, there was no automated cost anomaly detection, and teams had no cost accountability or chargeback.

Solution

Developed a cost engineering framework with: (1) automated Spark cluster rightsizing recommendations, (2) S3 lifecycle policies with intelligent tiering, (3) an Isolation Forest model for cost anomaly detection, and (4) team-level chargeback reports with Slack alerts.
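
As a rough illustration of the anomaly detection component, the sketch below fits scikit-learn's IsolationForest to a series of daily costs and returns the days it flags. The function name, the single-feature input, the contamination value, and the synthetic data are all assumptions made for this example, not the project's actual model or data.

```python
from sklearn.ensemble import IsolationForest


def flag_cost_anomalies(daily_costs, contamination=0.05, seed=0):
    """Return indices of days whose cost the model marks as anomalous."""
    # One feature per day (the cost itself). A real pipeline would likely add
    # features such as day of week or per-team spend; that is an assumption.
    features = [[cost] for cost in daily_costs]
    model = IsolationForest(contamination=contamination, random_state=seed)
    labels = model.fit_predict(features)  # -1 = anomaly, 1 = normal
    return [i for i, label in enumerate(labels) if label == -1]


# Synthetic data (illustrative only): 60 days near $1,000 with one spike.
daily_costs = [1000 + 20 * ((day * 7) % 11 - 5) for day in range(60)]
daily_costs[45] = 5000

anomalous_days = flag_cost_anomalies(daily_costs)
```

The contamination parameter sets the fraction of points the model treats as anomalous, so lowering it is one possible lever for reducing false positives at the cost of missing smaller spikes.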

Architecture

Cost APIs (AWS/Databricks/Snowflake/GCP) → ETL Pipeline (Python) → Cost Database (PostgreSQL) → ML Anomaly Detection → Alerts + Dashboards
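
The ETL stage in this pipeline has to map differently shaped billing data onto one schema before it reaches the cost database. The sketch below shows one way that could look for two sources. The CostRecord fields, the Databricks row shape, the team tag, and the usd_per_dbu rate are hypothetical; the AWS input follows the ResultsByTime structure returned by Cost Explorer when results are grouped by a tag.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CostRecord:
    """Common row shape for the cost database (hypothetical schema)."""
    source: str
    team: str
    usage_date: date
    cost_usd: float


def normalize_aws(result_by_time):
    """Flatten one Cost Explorer ResultsByTime entry grouped by a team tag."""
    day = date.fromisoformat(result_by_time["TimePeriod"]["Start"])
    records = []
    for group in result_by_time["Groups"]:
        # Tag group keys arrive as "tagkey$value"; an empty value means untagged.
        team = group["Keys"][0].split("$", 1)[-1] or "untagged"
        amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
        records.append(CostRecord("aws", team, day, amount))
    return records


def normalize_databricks(row, usd_per_dbu):
    """Convert a DBU usage row (hypothetical shape) into a USD cost record."""
    return CostRecord(
        source="databricks",
        team=row.get("team") or "untagged",
        usage_date=date.fromisoformat(row["usage_date"]),
        cost_usd=round(row["dbus"] * usd_per_dbu, 2),
    )


# Illustrative inputs only; values are made up for the example.
aws_sample = {
    "TimePeriod": {"Start": "2024-01-01", "End": "2024-01-02"},
    "Groups": [
        {"Keys": ["team$data-eng"],
         "Metrics": {"UnblendedCost": {"Amount": "120.50", "Unit": "USD"}}},
        {"Keys": ["team$"],
         "Metrics": {"UnblendedCost": {"Amount": "30.00", "Unit": "USD"}}},
    ],
}
dbx_sample = {"team": "ml", "usage_date": "2024-01-01", "dbus": 200.0}

records = normalize_aws(aws_sample) + [
    normalize_databricks(dbx_sample, usd_per_dbu=0.55)
]
```

Converting every source to a dated, team-tagged USD amount at ingestion keeps the downstream anomaly detection and dashboards independent of each platform's billing model.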

Key Challenges

  • Normalizing cost data across multiple cloud platforms and billing models
  • Building ML models to detect cost anomalies without excessive false positives
  • Implementing fair cost allocation for shared resources across teams
  • Creating actionable recommendations that don't disrupt data SLAs
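
One common approach to the shared-resource allocation challenge above is to split a shared cost in proportion to each team's measured usage. The sketch below assumes that approach; the source does not state which allocation rule the framework actually uses, and the function name, usage metric, and figures are illustrative.

```python
def allocate_shared_cost(shared_cost, usage_by_team):
    """Split a shared cost across teams in proportion to their usage."""
    if not usage_by_team:
        return {}
    total_usage = sum(usage_by_team.values())
    if total_usage <= 0:
        # No measurable usage: fall back to an even split.
        even_share = shared_cost / len(usage_by_team)
        return {team: even_share for team in usage_by_team}
    return {
        team: shared_cost * usage / total_usage
        for team, usage in usage_by_team.items()
    }


# Illustrative numbers: a $900 shared cluster, usage measured in compute-hours.
shares = allocate_shared_cost(
    900.0, {"data-eng": 50, "ml": 30, "analytics": 10}
)
```

Because the shares always sum to the shared cost, the chargeback reports built on top of them reconcile with the total bill.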

Tech Stack

Python, Terraform, Databricks, AWS Cost Explorer, GCP, Snowflake, scikit-learn