You will be embedded directly into a FinTech client’s Data Pipeline team (Staff Augmentation model), working toward a critical July 2026 DaaS product launch. The client is actively migrating their data platform from PostgreSQL to Databricks. Your job is to prove the new architecture performs better, under real load, before launch.
There is little to no existing performance-testing framework. Assume you are building one from scratch, establishing the baselines, and delivering the quantified evidence that defines launch readiness.
Pipeline Performance
Validating the speed and stability of Spring Java ETL jobs and Spark workloads running on Databricks under realistic production load. Identifying bottlenecks, profiling job execution, and defining acceptable throughput thresholds.
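A throughput-threshold check like the one described above can be sketched as a small timing harness. This is an illustrative stand-in, not the client's tooling: the job callable, row count, and the 5,000 rows/sec floor are all hypothetical placeholders, and a real run would wrap the actual Spring ETL job or Spark submission.

```python
import time
from dataclasses import dataclass

# Minimal sketch of timing a job run and checking it against a throughput
# floor. All numbers here are invented for illustration.

@dataclass
class JobResult:
    rows_processed: int
    seconds: float

    @property
    def rows_per_sec(self) -> float:
        return self.rows_processed / self.seconds

def run_with_timing(job, rows_expected: int) -> JobResult:
    """Run a job callable and record its wall-clock duration."""
    start = time.perf_counter()
    job()
    return JobResult(rows_expected, time.perf_counter() - start)

def meets_threshold(result: JobResult, min_rows_per_sec: float) -> bool:
    """True if the run cleared the agreed throughput floor."""
    return result.rows_per_sec >= min_rows_per_sec

# Example: a stand-in "job" that sleeps briefly in place of real work.
result = run_with_timing(lambda: time.sleep(0.01), rows_expected=100_000)
print(meets_threshold(result, min_rows_per_sec=5_000))
```

Repeated over many runs, the same harness yields the distribution from which acceptable thresholds can be defined rather than guessed.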
Data Loads & Transformations
Benchmarking both incremental and full data-reload cycles. Stress-testing specific transformation logic, including business-key fetch operations, and establishing SLA baselines where none currently exist.
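Establishing SLA baselines where none exist typically means distilling repeated benchmark runs into candidate numbers (p50 / p95 / max). The sketch below does exactly that with Python's standard library; the run durations are fabricated sample values, and in practice they would come from instrumented incremental and full-reload cycles.

```python
import statistics

def baseline(durations_sec: list[float]) -> dict[str, float]:
    """Summarize run durations into candidate SLA figures (p50 / p95 / max)."""
    cuts = statistics.quantiles(durations_sec, n=100, method="inclusive")
    return {
        "p50": statistics.median(durations_sec),
        "p95": cuts[94],          # 95th percentile cut point
        "max": max(durations_sec),
    }

# Fabricated sample durations (seconds) for two hypothetical reload cycles.
incremental_runs = [12.1, 11.8, 12.4, 13.0, 12.2, 12.6, 11.9, 12.3]
full_reload_runs = [540.0, 552.5, 531.2, 560.1, 548.7, 544.3, 538.9, 555.4]

for name, runs in [("incremental", incremental_runs), ("full reload", full_reload_runs)]:
    b = baseline(runs)
    print(f"{name}: p50={b['p50']:.1f}s p95={b['p95']:.1f}s max={b['max']:.1f}s")
```

Publishing the p95 rather than the mean as the SLA candidate keeps the baseline honest about tail behavior, which is usually what stakeholders feel in production.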
Databricks Write Speed & Migration Validation
Measuring write throughput to S3 Delta tables and benchmarking it directly against the existing PostgreSQL setup. This is not abstract performance testing: the client needs quantified proof that the new architecture delivers the expected performance lift before the July launch.
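The "quantified proof" ultimately reduces to a throughput comparison and a lift ratio between the two backends. A minimal sketch, assuming identical write workloads on each side; the 10M-row workload and timings below are invented placeholders, and real figures would come from instrumented writes against PostgreSQL and the S3 Delta tables.

```python
def throughput_rows_per_sec(rows_written: int, seconds: float) -> float:
    """Rows per second for a completed write workload."""
    return rows_written / seconds

def performance_lift(new_tput: float, old_tput: float) -> float:
    """Lift ratio: 2.0 means the new path writes twice as fast."""
    return new_tput / old_tput

# Hypothetical measurements from identical 10M-row write workloads.
pg_tput = throughput_rows_per_sec(10_000_000, 820.0)     # legacy PostgreSQL
delta_tput = throughput_rows_per_sec(10_000_000, 260.0)  # Databricks Delta on S3

lift = performance_lift(delta_tput, pg_tput)
print(f"lift: {lift:.2f}x")  # the headline number for launch-readiness evidence
```

The essential design point is that both sides run the same workload under the same load profile; otherwise the lift ratio proves nothing.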