What are the responsibilities and job description for the Data Engineer || Hybrid in Dallas TX, Charlotte NC position at 1 Point System?
Job Details
REQUIRED: SQL, Spark, Python, legacy ETL tooling (SSIS preferred; Ab Initio, Informatica, etc.), cloud experience (AWS/Google Cloud Platform/Azure, etc.)

Project Overview: Enables data for reporting and downstream consumption (both operational and analytics workstreams). Currently on a legacy platform and working to set up in Google Cloud.

Tech Requirements:
- SQL
- Python
- Spark
- Legacy ETL tooling (SSIS preferred; Ab Initio, Informatica)
- Cloud experience (cloud-native warehousing; Google Cloud Platform preferred, Azure/AWS acceptable; does not need to be a Cloud Data Engineer)

Work Overview:
- Over 150 data sources, mostly on-prem relational databases (SQL Server, some Oracle), Teradata, and some files.
- Existing data pipelines are batch-driven, using SSIS, Ab Initio, and Informatica; the existing team has experience in these tools.
- Refactoring existing data movements and ETL jobs into Python/Spark pipelines.
- The team does not own the Python/Spark framework and will not be modifying it, only adopting it for their data pipelines.
- The majority of initial work will be migrating SSIS packages to Spark; strong SQL skills are needed.
- In tandem, the data architecture team will be setting up the Google Cloud Platform environment; pipelines will eventually be "rerouted" to BigQuery/Bigtable, and Dataproc will be introduced.
- Will be using Dremio or Starburst for data virtualization (not finalized yet).
- Following traditional medallion architecture: ingesting into the bronze/raw layer, using the silver layer for more operational workflows, and the gold layer for reporting/analytics.
- Mostly batch processing now, but will get into event-driven architecture down the road.