What are the responsibilities and job description for the Data Engineer - Columbus, OH - Only Locals - Independent Candidates Only!! position at Radiantze?
Job Details
We are seeking a highly skilled Data Engineer with extensive experience building and managing data pipelines on the AWS Cloud platform, particularly using Databricks. The ideal candidate will have a strong background in designing and optimizing data pipelines and ETL processes and in working with large datasets. This role requires hands-on experience with AWS services and Databricks to ensure seamless data flow and integration for advanced analytics and reporting.
Key Responsibilities:
Design, build, and manage robust, scalable, and high-performance data pipelines on AWS and Databricks.
Implement ETL/ELT processes to extract, transform, and load large datasets from multiple sources into data lakes or data warehouses.
Collaborate with data scientists and analysts to ensure data availability and accessibility for analytics and machine learning projects.
Optimize and troubleshoot performance of existing data pipelines and workflows to ensure efficiency and reliability.
Work with AWS services such as S3, Redshift, Glue, Lambda, and RDS to manage data storage, transformation, and processing.
Develop, maintain, and enhance Databricks notebooks for data processing, transformation, and analytics.
Ensure data integrity, governance, and security across all data pipelines and workflows.
Create and maintain detailed technical documentation for all data processes and architectures.
Implement data quality checks and monitoring solutions to ensure data accuracy and consistency.
Stay up to date with emerging technologies and best practices in data engineering and cloud-based infrastructure.
Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field.
3 years of experience as a Data Engineer working specifically with AWS Cloud and Databricks.
Strong experience with AWS services such as S3, Redshift, Glue, Lambda, and RDS.
Proficiency in building and optimizing data pipelines and ETL processes.
Solid experience with Databricks and Apache Spark for large-scale data processing.
Strong programming skills in Python or Scala for data manipulation and transformation.
Proficiency with SQL for data querying and transformation.
Familiarity with distributed computing and parallel processing techniques.
Experience working with data lake and data warehouse architectures.
Knowledge of data governance, security best practices, and data quality principles.
Excellent problem-solving skills and attention to detail.
Strong communication skills and the ability to work collaboratively in a team environment.
Preferred Qualifications:
Experience with streaming data pipelines (e.g., using Kafka or Kinesis).
Familiarity with DevOps practices, CI/CD pipelines, and infrastructure as code (Terraform or CloudFormation).
Certifications in AWS (e.g., AWS Certified Data Analytics - Specialty) are a plus.
Experience with other data tools and platforms such as Snowflake or Apache Airflow.
Regards,
Radiantze Inc