Senior Data Engineer

Building Scalable Data Pipelines

I turn raw data into actionable insights, specializing in distributed computing with Apache Spark, robust SQL architectures, and intuitive BI dashboards.

Technical Arsenal

Tools and technologies I use to solve complex data problems.

Core Engineering

  • Python (Pandas, PySpark)
  • Advanced SQL (Window Functions, CTEs)
  • Apache Spark (Optimization)
  • Airflow / Dagster

BI & Visualization

  • Microsoft Power BI (DAX)
  • Tableau (Storytelling)
  • Metabase (Self-service)
  • Data Modeling (Star Schema)

Cloud & Infra

  • AWS (S3, EMR, Redshift)
  • Docker & Kubernetes
  • Linux / Bash Scripting
  • Git / CI/CD Pipelines

Featured Projects

End-to-end data solutions from ingestion to visualization.

Real-Time ETL Pipeline with Spark

Designed a scalable pipeline ingesting 50k+ events/sec from Kafka. Processed the data with PySpark on AWS EMR, wrote it out in Delta Lake format, and loaded it into Snowflake for analytics.

PySpark, Kafka, AWS EMR, Delta Lake

Pipeline Architecture Diagram
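
As a rough illustration of the transform stage described above, here is a minimal sketch of a per-batch cleaning step. It is written in plain Python rather than PySpark so it runs without a cluster, and it is not the production code: the event schema (`event_id`, `event_type`, `ts`) and the function names `parse_events`, `deduplicate`, and `transform_batch` are assumptions made for the example.

```python
import json
from typing import Iterable, Iterator, List

# Hypothetical schema: the fields every valid event is assumed to carry.
REQUIRED_FIELDS = ("event_id", "event_type", "ts")


def parse_events(raw_messages: Iterable[bytes]) -> Iterator[dict]:
    """Decode Kafka-style message payloads, skipping malformed records."""
    for raw in raw_messages:
        try:
            event = json.loads(raw)
        except (json.JSONDecodeError, UnicodeDecodeError):
            # A real pipeline would likely route these to a dead-letter sink.
            continue
        if isinstance(event, dict) and all(f in event for f in REQUIRED_FIELDS):
            yield event


def deduplicate(events: Iterable[dict]) -> List[dict]:
    """Keep the first occurrence of each event_id within one batch."""
    seen = set()
    unique = []
    for event in events:
        if event["event_id"] not in seen:
            seen.add(event["event_id"])
            unique.append(event)
    return unique


def transform_batch(raw_messages: Iterable[bytes]) -> List[dict]:
    """Parse, validate, and deduplicate one batch of raw messages."""
    return deduplicate(parse_events(raw_messages))
```

In a Spark job the same logic would typically be expressed with DataFrame operations instead of Python loops; the sketch only shows the shape of the cleaning rules.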

Enterprise Sales Dashboard

Built a comprehensive BI suite connecting to a SQL Server warehouse. Created complex DAX measures for YoY growth and cohort analysis. Deployed interactive dashboards in Power BI and Metabase for stakeholders.

Power BI, SQL Server, Metabase, DAX

Dashboard Preview
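
The YoY growth measure mentioned above comes down to simple arithmetic: (current year - prior year) / prior year. The sketch below shows that calculation in Python rather than DAX, purely as an illustration. The function name `yoy_growth` and the dictionary-of-yearly-totals input are assumptions for the example, not the dashboard's actual measures.

```python
from typing import Dict, Optional


def yoy_growth(sales_by_year: Dict[int, float], year: int) -> Optional[float]:
    """Return year-over-year growth for `year` as a fraction.

    Returns None when either year is missing or the prior-year total is
    zero, since the ratio is undefined in those cases.
    """
    current = sales_by_year.get(year)
    prior = sales_by_year.get(year - 1)
    if current is None or prior is None or prior == 0:
        return None
    return (current - prior) / prior


# Illustrative figures only, not real sales data.
example_sales = {2022: 100.0, 2023: 125.0}
growth_2023 = yoy_growth(example_sales, 2023)  # 0.25, i.e. 25% growth
```

Returning None instead of raising keeps the first year of data (which has no prior year) from breaking a report, which is the same reason BI measures usually render a blank there.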

Let's Connect

I'm currently open to opportunities where I can apply my experience in big data and analytics to drive business value.