What are the responsibilities and job description for the Senior Data Engineer position at JustinBradley?
JustinBradley’s client, a leading source of mortgage financing, is seeking a Senior Data Engineer to join its team. This role is critical for setting up and managing Change Data Capture (CDC) for multiple types of databases to hydrate a data lake. You will work closely with other teams to orchestrate the flow of raw CDC data and perform ETL transformations that turn it into a usable, queryable form for analytics. The ideal candidate will have hands-on experience with Apache Spark for both batch and streaming data processing and will be well-versed in performance tuning and Big Data concepts.
Responsibilities:
- Set up and manage Change Data Capture (CDC) for various databases to ensure data flows seamlessly into a data lake.
- Implement ETL transformations using Apache Spark, handling both streaming and batch processing of data.
- Work with Apache Spark DataFrames, Spark SQL, and Spark Streaming to design and develop robust data pipelines.
- Orchestrate the transformation of raw CDC data into structured, analytics-ready datasets.
- Collaborate with cross-functional teams to understand data requirements and ensure data is correctly transformed and made available for downstream analysis.
- Optimize performance of data pipelines, ensuring efficient data processing and storage.
- Work with AWS services, including EMR, Glue Data Catalog, Lambda, and S3 to integrate, store, and manage data.
- Utilize Apache Airflow to orchestrate and automate workflows for data processing.
- Keep up to date with the latest trends and technologies in Big Data and cloud computing to improve system performance and scalability.
Requirements:
JustinBradley is an EO employer - Veterans / Disabled and other protected employees.