What are the responsibilities and job description for the ETL lead with PySpark expertise position at OKAYA INFOCOM?
ETL lead with PySpark expertise
Locations: Minneapolis, MN; Atlanta, GA; Dallas, TX; or Charlotte, NC
Full-time role
We are seeking an ETL Developer to create data pipeline ETL jobs using PySpark within the financial services industry.
Responsibilities:
Work with one or more scrum teams to deliver product stories according to priorities set by the business and the Product Owners.
Interact with stakeholders.
Provide knowledge transfer to other team members.
Create and test pipeline jobs locally using AWS Glue interactive sessions.
Performance-tune PySpark jobs.
Use AWS Athena to perform data analysis on lake data populated into the AWS Glue Data Catalog through AWS Glue crawlers.
Must Haves:
Design, develop, and maintain ETL processes to support data integration and business intelligence initiatives.
Work closely with stakeholders to understand data requirements and ensure efficient data flow and transformation using ETL tools and PySpark.
Develop and implement ETL processes using an ETL tool and PySpark to extract, transform, and load data.
4 years of experience in ETL development with knowledge of PySpark
5 years as an ETL Developer
SQL expert
AWS Glue with Python (PySpark)
PySpark Dataframe API
Spark SQL
Knowledge of AWS services (e.g., DMS, S3, RDS, Redshift, Step Functions).
Nice to Haves:
ETL development experience with tools such as SAP BODS or Informatica.
Good understanding of version control tools such as Git, GitHub, or TortoiseHg.
Financial services experience
Agile experience