What are the responsibilities and job description for the Lead Data Engineer (Databricks, Azure Data Lake, Python) position at MSys Technologies - USA?
Job Details
Title: Technical Lead / Data Engineering Specialist (Databricks, Azure Data Lake, Python)
Location: Greenfield, IN 46140 (Onsite)
Rate: DOE
We are hiring a highly skilled Technical Lead / Data Engineering Specialist with extensive experience in Cloud technologies, DevOps practices, and Data Engineering to support and enhance RDAP initiatives.
Key Responsibilities
- Design, develop, and maintain Databricks Lakehouse solutions sourcing from Cloud platforms such as Azure Synapse and Google Cloud Platform
- Implement and manage DevOps and CI/CD workflows using tools like GitHub
- Apply best practices in test-driven development, code review, branching strategies, and deployment processes
- Build, manage, and optimize Python packages using tools like setuptools, Poetry, wheels, and artifact registries
- Develop and optimize data pipelines and workflows in Databricks, utilizing PySpark and Databricks Asset Bundles (see the pipeline sketch after this list)
- Manage and query SQL databases (Unity Catalog, SQL Server, Hive, Postgres)
- Implement orchestration solutions using Databricks Workflows, Airflow, and Dagster (see the DAG sketch after this list)
- Work with event-driven architectures using Kafka, Azure Event Hub, and Google Cloud Pub/Sub
- Develop and maintain Change Data Capture (CDC) solutions using tools like Debezium
- Design and implement data migration projects, particularly those involving Azure Synapse and Databricks Lakehouse
- Manage Cloud storage solutions, including Azure Data Lake Storage (ADLS) and Google Cloud Storage (GCS)
- Configure and manage identity and access solutions using Azure Active Directory, including AD Groups, Service Principals, and Managed Identities
- Interact effectively with the customer to understand requirements, participate in design discussions, and translate requirements into deliverables by working with the offshore development team
- Collaborate effectively with cross-functional teams across development, operations, and business units; strong interpersonal skills to build and maintain productive relationships with team members
- Troubleshoot and resolve issues efficiently; apply an analytical mindset to optimizing workflows and improving system performance
- Convey complex technical concepts clearly and concisely to both technical and non-technical stakeholders; strong documentation skills for creating process guidelines, technical workflows, and reports
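For illustration only, here is a minimal PySpark sketch of the kind of bronze-to-silver Delta pipeline this role involves; the catalog, table, and column names (main.bronze.raw_orders, order_id, order_ts) are hypothetical placeholders, not part of the posting:

```python
from pyspark.sql import SparkSession, functions as F

# In a Databricks notebook `spark` is provided; getOrCreate() also works locally.
spark = SparkSession.builder.getOrCreate()

# Read a raw (bronze) Delta table registered in Unity Catalog.
bronze = spark.read.table("main.bronze.raw_orders")  # hypothetical table

# Basic cleansing: drop duplicates, cast the event timestamp, filter nulls.
silver = (
    bronze.dropDuplicates(["order_id"])
          .withColumn("order_ts", F.to_timestamp("order_ts"))
          .filter(F.col("order_id").isNotNull())
)

# Write the curated (silver) table back as Delta.
silver.write.format("delta").mode("overwrite").saveAsTable("main.silver.orders")
```

In a Databricks Asset Bundle, a step like this would typically be packaged as a notebook or Python wheel task and deployed with `databricks bundle deploy`.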
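Similarly, a minimal Airflow DAG sketch showing one way to orchestrate a Databricks job; it assumes the apache-airflow-providers-databricks package, and the DAG id, connection id, and job id are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="rdap_daily_refresh",   # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",             # Airflow 2.4+ keyword (schedule_interval on older versions)
    catchup=False,
) as dag:
    # Trigger an existing Databricks job by id via an Airflow connection.
    run_lakehouse_job = DatabricksRunNowOperator(
        task_id="run_lakehouse_job",
        databricks_conn_id="databricks_default",  # connection configured in Airflow
        job_id=12345,                             # hypothetical Databricks job id
    )
```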
Technologies, Skills & Experience
- Databricks (PySpark, Databricks Asset Bundles)
- Python package builds (setuptools, Poetry, wheels, artifact registries)
- Open File Formats (Delta, Parquet, Iceberg, etc.)
- SQL Databases (Unity Catalog, SQL Server, Hive, Postgres)
- Orchestration Tools (Databricks Workflows, Airflow, Dagster)
- Azure Data Lake Storage, Azure Active Directory (AD Groups, Service Principals, Managed Identities); see the access sketch after this list
- Secondary skills (good to have): Kafka, Azure Event Hub, Cloud Pub/Sub; Change Data Capture (Debezium); Google Cloud Storage
- Bachelor's degree in Computer Science, Information Technology, or a related field, with 12 years of experience
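As a small illustration of the identity and storage skills above, here is a sketch of reading Azure Data Lake Storage under a Managed Identity or Service Principal via DefaultAzureCredential; it assumes the azure-identity and azure-storage-file-datalake packages, and the account, container, and path names are hypothetical:

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Resolves to a Managed Identity, a Service Principal (from environment
# variables), or a developer login, depending on where the code runs.
credential = DefaultAzureCredential()

service = DataLakeServiceClient(
    account_url="https://examplelake.dfs.core.windows.net",  # hypothetical account
    credential=credential,
)

# List files under a prefix in a container (file system).
fs = service.get_file_system_client("landing")  # hypothetical container
for path in fs.get_paths(path="orders/2024"):
    print(path.name)
```

Access is then governed by the AD Groups and role assignments (e.g., Storage Blob Data Reader) granted to that identity.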