Data and Platform Engineer
Building cloud systems that are clear, trusted, and usable.
Hi, my name is
I am a data and platform engineer specializing in production lakehouse architectures, cloud infrastructure, streaming and CDC pipelines, and analytics systems that are meant to be used by real people.
Enterprise workflow
What I build: data moving safely from source systems into trusted analytics products.
multi-environment platform foundation
CI/CD, validation, and controlled deployments
orchestration, scheduling, operational flow
governance, external access, fine-grained control
application records and transactional data
event capture, streaming, and reliable delivery
raw zones and historical storage for traceability and replay
validated, deduplicated, analytics-ready records
business-ready marts for reporting and reuse
trusted models exposed for analytics; Athena serves SQL over the lake
dashboards, APIs, and natural-language analytics
01. About
I build reliable data platforms with a strong focus on cloud architecture, medallion data layers, CDC pipelines, analytics serving, and developer workflows. My best work sits at the intersection of engineering depth and clarity: I care about systems that not only run, but can also be understood, maintained, and improved by others.
Recently, that has meant building a full AWS enterprise data platform repo by repo, then recreating the same ideas in GCP to deepen my platform understanding in a new cloud environment. I like work that goes beyond isolated ETL jobs into full platform design: infrastructure, identity, transformation, orchestration, serving, and docs.
Beyond platform engineering, I have moved into LLM application development, shipping a natural-language analytics agent on top of a curated Gold layer so users can ask questions in plain English and get answers, SQL, charts, and reports back.
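As an illustration only, and not the code from the agent itself, here is a minimal sketch of the kind of SQL guardrail that can sit between an LLM and a curated Gold layer. The table names in `ALLOWED_TABLES`, the function name `check_sql`, and the specific rules are all assumptions made for the example. It uses regular expressions rather than a real SQL parser, so it is deliberately conservative: for instance, it would reject a query that reads from a CTE name.

```python
import re

# Hypothetical allowlist of Gold-layer tables; the names are illustrative only.
ALLOWED_TABLES = {"gold.daily_sales", "gold.customer_summary"}

# Keywords that indicate a write or DDL statement.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|merge|grant|revoke)\b",
    re.IGNORECASE,
)

# Identifier that directly follows FROM or JOIN (subqueries in parentheses are skipped).
TABLE_REF = re.compile(r"\b(?:from|join)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def check_sql(sql: str) -> list[str]:
    """Return a list of guardrail violations; an empty list means the query passes."""
    problems = []
    # Ignore surrounding whitespace and trailing semicolons.
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        problems.append("multiple statements are not allowed")
    if not re.match(r"(select|with)\b", statement, flags=re.IGNORECASE):
        problems.append("only SELECT queries are allowed")
    if FORBIDDEN.search(statement):
        problems.append("write or DDL keyword found")
    for table in TABLE_REF.findall(statement):
        if table.lower() not in ALLOWED_TABLES:
            problems.append(f"table not allowlisted: {table}")
    return problems


if __name__ == "__main__":
    print(check_sql("SELECT region, SUM(amount) FROM gold.daily_sales GROUP BY region;"))
    print(check_sql("SELECT * FROM raw.users"))
```

Returning a list of violations instead of raising on the first one makes it possible to feed all of the problems back to the model in a single retry prompt.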
02. Experience
Architecting a production AWS medallion lakehouse platform: PostgreSQL via DMS CDC into Bronze, Silver, and Gold S3 layers, Glue PySpark transformations, dbt on Athena, and Redshift Serverless with Spectrum; provisioned through modular Terraform across Dev, Staging, and Prod with GitHub Actions CI/CD.
Built a real-time IoT data platform using Databricks Asset Bundles and Amazon Kinesis, processing over one million elevator telemetry records daily with sub-minute latency. Established GitHub Actions CI/CD for Databricks and dbt, enforced data quality, and built Power BI dashboards for operational decisions.
Built ELT workflows using dbt and AWS to centralize and standardize real estate data, replacing manual processes with automated pipelines. Deployed dashboards that contributed to stronger lead generation and better decision support.
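The CDC-to-Silver step mentioned in the lakehouse work above comes down to collapsing a stream of change events into one current-state row per key. The sketch below shows that idea in plain Python rather than Glue PySpark. The field names (`id`, `op`, `commit_ts`), the `I`/`U`/`D` operation codes, and the function name `reconcile` are assumptions chosen for illustration, not the platform's actual schema or code.

```python
from typing import Iterable


def reconcile(
    events: Iterable[dict],
    key: str = "id",
    op_field: str = "op",
    order_field: str = "commit_ts",
) -> list[dict]:
    """Collapse change events into one current-state row per key.

    Assumed event shape: each event is a dict with a primary key, an
    operation code ("I" insert, "U" update, "D" delete), and an ordering
    column such as a commit timestamp.
    """
    latest: dict = {}
    for event in events:
        k = event[key]
        current = latest.get(k)
        # Keep the newest event per key; on ties, the later-arriving event wins.
        if current is None or event[order_field] >= current[order_field]:
            latest[k] = event

    rows = []
    for k in sorted(latest):
        event = latest[k]
        # A key whose newest event is a delete has no current-state row.
        if event[op_field] == "D":
            continue
        # Strip the CDC metadata columns from the output row.
        rows.append(
            {c: v for c, v in event.items() if c not in (op_field, order_field)}
        )
    return rows


if __name__ == "__main__":
    sample = [
        {"id": 1, "op": "I", "commit_ts": 1, "status": "new"},
        {"id": 1, "op": "U", "commit_ts": 3, "status": "paid"},
        {"id": 2, "op": "I", "commit_ts": 2, "status": "new"},
        {"id": 2, "op": "D", "commit_ts": 4, "status": "new"},
        {"id": 1, "op": "U", "commit_ts": 2, "status": "pending"},  # late arrival
    ]
    print(reconcile(sample))
```

In Spark the same logic is commonly written as a window function that ranks events per key by the ordering column, keeps rank one, and filters out deletes.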
03. Work
A full multi-repo platform with PostgreSQL via DMS CDC into Bronze, Glue PySpark into Silver, dbt on Athena into Gold, and an analytics agent on top.
Open platform docs
Networking, data lake buckets, IAM, serving infrastructure, monitoring, and the overall AWS platform foundation.
View repo
Natural-language analytics with SQL guardrails, cost tracking, charts, interactive sessions, and PDF reporting.
View repo
Shared CDC reconciliation logic and six Glue jobs that turn raw change events into clean current-state Silver tables.
View repo
GitHub Actions workflows that bring the platform up for a working session and tear it down cleanly to control cost.
View repo
Private ongoing work translating the same platform ideas into GCP with foundation, bootstrap, BigLake governance, and strong multi-environment discipline.
More about the GCP lab
04. Contact
If you are interested in data platforms, cloud data engineering, analytics systems, or practical enterprise architecture, the links below are the best places to reach me or browse my work.
For collaboration, platform discussions, or project questions.
nweke.edeh@gmail.com
Send email
Background, experience timeline, and career highlights.
linkedin.com/in/edeh
Open profile
Repositories, implementation details, and platform experiments.
github.com/ChuquEmeka
Open profile