What are the responsibilities and job description for the Azure Data Engineer position at Pronix Inc?
Job Details
Role: Data Engineer
Location: New York, NY 10007
Onsite / Remote: Hybrid (2-3 days in office)
Duration: 12 months, with possible extension
Years of experience: 12 years minimum, 15 years maximum
Number of internal interviews: 2 (virtual face-to-face interviews)
Tentative start date: ASAP
Job Description:
- We are seeking a highly skilled Data Engineer / Data Architect with expertise in designing and implementing scalable data solutions on Azure Cloud.
- The ideal candidate will have a strong background in Azure Data Factory, Azure Databricks, Delta Lake, and Azure Blob Storage for large-scale data processing.
- This role requires proficiency in ETL/ELT pipeline development, data lake architecture, and machine learning model deployment using MLflow and Kubernetes.
- The candidate should also have experience with Power BI for data visualization and business insights, along with a proven track record of delivering high-impact data solutions for banking, financial services, and supply chain management.
Key Responsibilities:
Data Architecture & Engineering:
- Design and implement scalable, high-performance data architectures on Azure Cloud.
- Develop and maintain ETL/ELT pipelines using Azure Data Factory and Azure Databricks.
- Architect and manage data lakes and Delta Lake storage for efficient data processing and analytics.
- Optimize data storage solutions using Azure Blob Storage, Azure SQL, and Synapse Analytics.
Data Processing & Machine Learning Deployment:
- Develop real-time and batch data processing solutions using Apache Spark on Azure Databricks.
- Deploy and manage machine learning models using MLflow, Kubernetes, and Azure Machine Learning.
- Implement data governance, security, and compliance best practices for data solutions.
Data Visualization & Business Insights:
- Create interactive Power BI dashboards to provide actionable business insights.
- Integrate data models, reports, and analytics for decision-making in banking, financial services, and supply chain management.
- Collaborate with business stakeholders to define KPIs and data-driven strategies.
Performance Optimization & Automation:
- Optimize query performance, data indexing, and partitioning strategies for large-scale data sets.
- Automate data workflows and monitoring using Azure DevOps and CI/CD pipelines.
- Implement best practices for data modeling, storage, and retrieval to improve system efficiency.
Required Skills & Qualifications:
Cloud & Data Technologies:
- Expertise in Azure Data Factory (ADF), Azure Databricks, Data Lake, and Azure Blob Storage.
- Strong knowledge of Azure Synapse Analytics, Azure SQL Database, and Cosmos DB.
- Hands-on experience with Apache Spark, Python, Scala, and SQL for data processing.
ETL/ELT & Data Pipelines:
- Proven experience in building end-to-end ETL/ELT data pipelines.
- Strong understanding of data lake architecture and of structured and unstructured data management.
Machine Learning & AI Integration (good to have):
- Experience with MLflow and Kubernetes for deploying machine learning models.
- Understanding of MLOps best practices in an enterprise environment.
Data Visualization & BI:
- Proficiency in Power BI for creating dashboards, data models, and reports.
- Ability to integrate multiple data sources for in-depth analytics.
Big Data & Performance Optimization (good to have):
- Experience in data partitioning, indexing, and query optimization techniques.
- Familiarity with parallel computing, distributed data processing, and real-time analytics.
Soft Skills:
- Excellent problem-solving, analytical thinking, and communication skills.
- Ability to collaborate with cross-functional teams, stakeholders, and leadership.
Preferred Qualifications:
- Bachelor's or master's degree in Computer Science, Data Engineering, or a related field.
- Certifications such as Azure Data Engineer Associate (DP-203) or Azure Solutions Architect Expert (AZ-305) (good to have).
- Experience with Kafka, Event Hubs, or other real-time data streaming technologies (good to have).
- Knowledge of data security, GDPR, and compliance regulations (good to have).