Role: Data Engineer
Focus: Cloud Architecture
Tools: Spark, Airflow, SQL
Passion: Big Data & Pipelines
AI-powered career intelligence for developers—maps your stack into a live knowledge graph, matches you against startups, surfaces hackathons, runs skill gap analysis, and grounds chat in a personal wiki. HydraDB persists everything across sessions. Built for WikiThon 2026. Live at devradar-seven.vercel.app.
Visual workflow builder for desktop AI agents in a containerized virtual desktop. Agents remember across sessions, recover from failures, and adapt strategy via Remember, Recall, Recover, and Plan nodes—powered by HydraDB, Groq, and Playwright browser automation. Built for the Agents Under Pressure hackathon.
Customer data pipeline implementing Bronze, Silver, and Gold medallion architecture in Snowflake with DML operations for incremental loads and historical tracking.
End-to-end pipeline for ingesting, transforming, and analyzing news data in Snowflake with reporting and analytics capabilities.
Batch ETL pipeline for car rental domain data into Snowflake—ingestion, staging, and warehouse layers for analytics and reporting.
Delta Live Tables (DLT) pipeline implementing medallion architecture for healthcare data—Bronze to Gold with quality checks and lineage.
Data warehouse design for travel booking with Type 2 Slowly Changing Dimensions (SCD2) for historical tracking and point-in-time reporting.
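A minimal sketch of the SCD2 idea in plain Python, assuming half-open validity intervals and an open-ended sentinel date. The row layout and the names `apply_change`, `as_of`, and `OPEN_END` are hypothetical and are not taken from the project itself.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # assumed sentinel meaning "still current"


def apply_change(history, key, attrs, effective):
    """Close the current row for `key` if its attributes changed, then append a new version."""
    current = next(
        (r for r in history if r["key"] == key and r["valid_to"] == OPEN_END),
        None,
    )
    if current is not None:
        if current["attrs"] == attrs:
            return history  # nothing changed, keep the current row open
        current["valid_to"] = effective  # half-open interval [valid_from, valid_to)
    history.append(
        {"key": key, "attrs": attrs, "valid_from": effective, "valid_to": OPEN_END}
    )
    return history


def as_of(history, key, when):
    """Point-in-time lookup: attributes valid for `key` on date `when`, or None."""
    for r in history:
        if r["key"] == key and r["valid_from"] <= when < r["valid_to"]:
            return r["attrs"]
    return None
```

Closing the old row and opening the new one on the same date keeps every date covered by exactly one version, which is what makes point-in-time reporting unambiguous.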
Event-driven data pipeline on Databricks for e-commerce events—streaming ingestion, processing, and analytics with scalable architecture.
FastAPI microservice that accepts financial transactions, computes cash-flow summaries, evaluates risk flags (negative net flow, large outflows, NSF risk), and returns a readiness classification—strong, structured, or requires clarification. Built for the Daxita Backend Engineering Challenge, containerized with Docker.
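The classification step might look roughly like the sketch below. It is framework-free so it runs on its own. The thresholds, the flag names, and the mapping from flag count to readiness label are all assumptions, since the service's actual rules are not stated here.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the real service's values are not given.
LARGE_OUTFLOW_RATIO = 0.5  # a single outflow above 50% of total inflow
NSF_BALANCE_FLOOR = 0.0    # running balance below this suggests NSF risk


@dataclass
class Summary:
    inflow: float
    outflow: float
    net: float
    flags: list
    readiness: str


def summarize(amounts, opening_balance=0.0):
    """Summarize signed amounts (positive = inflow, negative = outflow)."""
    inflow = sum(a for a in amounts if a > 0)
    outflow = -sum(a for a in amounts if a < 0)
    net = inflow - outflow

    flags = []
    if net < 0:
        flags.append("negative_net_flow")
    if inflow > 0 and any(-a > LARGE_OUTFLOW_RATIO * inflow for a in amounts if a < 0):
        flags.append("large_outflow")

    # Walk the running balance to see whether it ever drops below the floor.
    balance = opening_balance
    for a in amounts:
        balance += a
        if balance < NSF_BALANCE_FLOOR:
            flags.append("nsf_risk")
            break

    # Assumed mapping: no flags, one flag, or several flags.
    if not flags:
        readiness = "strong"
    elif len(flags) == 1:
        readiness = "structured"
    else:
        readiness = "requires clarification"
    return Summary(inflow, outflow, net, flags, readiness)
```

For example, `summarize([1000, -600])` raises only the large-outflow flag under these assumed thresholds and is classified as `structured`.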
Cross-platform desktop app for real-time system resource monitoring (CPU, RAM, Storage) built with Electron, React, and TypeScript—featuring interactive Recharts, system tray, and builds for macOS, Windows, and Linux.
How partition skew turns a 45-minute job into a 4-hour one, and how to find and fix it. Plus when to cache (and when not to), and how to read the Spark UI like a pro.
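One common fix for skew is key salting. The sketch below illustrates it in plain Python rather than Spark, so it runs anywhere. The function names are hypothetical, round-robin salting stands in for the random salt a Spark job would typically use, and original keys are assumed not to contain `#`.

```python
from collections import Counter


def largest_group_share(keys):
    """Fraction of all rows held by the single largest key group."""
    counts = Counter(keys)
    return max(counts.values()) / len(keys)


def salt_keys(keys, hot_keys, buckets):
    """Append a salt suffix to hot keys so their rows spread over `buckets` groups."""
    return [
        f"{k}#{i % buckets}" if k in hot_keys else k
        for i, k in enumerate(keys)
    ]


def two_stage_count(salted_keys):
    """Aggregate by salted key first, then strip the salt and merge the partials."""
    partial = Counter(salted_keys)          # stage 1: many small groups
    final = Counter()
    for k, n in partial.items():
        final[k.split("#", 1)[0]] += n      # stage 2: combine per original key
    return final


keys = ["hot"] * 900 + [f"k{i}" for i in range(100)]
salted = salt_keys(keys, {"hot"}, buckets=10)
print(largest_group_share(keys), largest_group_share(salted))
```

The trade-off is the second aggregation stage: salting only pays off when one key group is large enough that a single straggler task dominates the job's runtime.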
ACID on object storage, time travel for 2 a.m. debugging, and schema evolution without breaking pipelines. When to choose Delta over raw Parquet—and when not to.
How partition keys drive both ordering and scale, exactly-once semantics without the headache, and why consumer lag is your best early-warning signal. Tune first, then scale.
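Consumer lag is just the gap between the log end offset and the group's committed offset, per partition. The sketch below assumes both have already been fetched into dictionaries keyed by partition. The function names are hypothetical, and treating a missing committed offset as 0 is a simplifying assumption.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag: messages written but not yet committed by the group."""
    return {
        p: max(end - committed_offsets.get(p, 0), 0)  # missing commit treated as 0 (assumption)
        for p, end in end_offsets.items()
    }


def lag_is_growing(samples, min_increase=1):
    """True if total lag rose between every consecutive pair of samples."""
    totals = [sum(s.values()) for s in samples]
    return len(totals) >= 2 and all(
        b - a >= min_increase for a, b in zip(totals, totals[1:])
    )
```

A steadily growing total is usually the more useful alert than a fixed threshold, because it indicates consumers falling behind the producers rather than a momentary burst.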
Turn data quality into executable checks that run in your pipeline. Custom expectations that matter, integration with DAGs, and how to avoid alert fatigue so people actually act on failures.
Lift-and-shift vs. redesign, where cost surprises really come from, and a phased cutover plan that includes validation and rollback. Use the move as a chance to fix technical debt.
Repeatable, versioned infra for clusters and buckets; secrets in the vault, not in code; and how to catch drift before it becomes a fire. Same code for dev, staging, and prod.