What are the responsibilities and job description for the Hadoop Data Engineer W2 Only position at nTech Workforce?
Term of Employment
• Contract, 12 months
• This position is 100% remote. Candidates based in Maryland, Washington, DC, or Virginia are preferred, but not required.
Overview
Our client is seeking a Senior Data Engineer who will be responsible for orchestrating, deploying, maintaining, and scaling cloud databases (relational and NoSQL, distributed and converged) with an emphasis on reliability, automation, and performance. This role will focus on developing solutions and helping transform the company's platforms to deliver data-driven, meaningful insights and value to the company.
Responsibilities
• Develops and maintains infrastructure systems (e.g., data warehouses, data lakes) including data access APIs. Prepares and manipulates data using multiple technologies.
• Interprets data, analyzes results using statistical techniques, and provides ongoing reports. Executes quantitative analyses that translate data into actionable insights. Provides analytical and data-driven decision-making support for key projects. Designs, manages, and conducts quality control procedures for data sets using data from multiple systems.
• Develops data models by studying existing data warehouse architecture; evaluating alternative logical data models including planning and execution tables; applying metadata and modeling standards, guidelines, conventions, and procedures; planning data classes and subclasses, indexes, directories, repositories, messages, sharing, replication, back-up, retention, and recovery.
• Creates data collection frameworks for structured and unstructured data.
• Improves data delivery engineering job knowledge by attending educational workshops; reviewing professional publications; establishing personal networks; benchmarking state-of-the-art practices; participating in professional societies.
• Applies data extraction, transformation, and loading (ETL) techniques to connect large data sets from a variety of sources (a minimal sketch follows this list).
• Applies and implements best practices for data auditing, scalability, reliability and application performance.
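As purely illustrative context for the ETL responsibilities above, here is a minimal PySpark sketch. It assumes a Spark-on-Hadoop environment with Hive and S3 access; the table, column, and bucket names are hypothetical and not taken from the posting.

```python
# Minimal ETL sketch (illustrative only): read a Hive table, apply a light
# transformation, and write partitioned Parquet to S3. All names below are
# hypothetical placeholders, not client systems.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("claims_daily_etl")   # hypothetical job name
    .enableHiveSupport()           # required to read managed Hive tables
    .getOrCreate()
)

# Extract: pull one load date from a Hive staging table.
claims = spark.table("staging.claims_raw").where(F.col("load_date") == "2024-01-01")

# Transform: deduplicate and normalize a numeric column.
claims_clean = (
    claims
    .dropDuplicates(["claim_id"])
    .withColumn("claim_amount", F.col("claim_amount").cast("decimal(12,2)"))
)

# Load: write partitioned Parquet to an S3 data-lake location.
(
    claims_clean.write
    .mode("overwrite")
    .partitionBy("load_date")
    .parquet("s3a://example-data-lake/curated/claims/")   # hypothetical bucket
)
```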
Required Skills & Experience
• Bachelor's Degree in Computer Science, Information Technology, Engineering, or a related field.
• 5 years of experience with database design and data modeling tools.
• Hands-on experience developing and updating ETL/ELT scripts using Ab Initio, preferably in the Hadoop ecosystem.
• Experience with Hive, Spark, and the AWS ecosystem is preferred.
• Hands-on experience with application development, relational database layout and development, and data modeling.
• In lieu of a Bachelor's degree, an additional 4 years of relevant work experience is required beyond the work experience listed above.
• Strong knowledge of the Cloudera ecosystem, Hive, and the AWS platform, including Glue, Aurora PostgreSQL, Redshift, and S3 (see the sketch at the end of this section).
• Knowledge and understanding of at least one programming or query language (e.g., SQL, NoSQL, Python).
• Knowledge and understanding of database design and implementation concepts.
• Knowledge and understanding of data exchange formats.
• Knowledge and understanding of data movement concepts.
• Knowledge and understanding of Ab Initio and Ab Initio CDC.
• Knowledge and understanding of CI/CD (preferably Jenkins).
• Knowledge and understanding of Hive, Spark, and the Cloudera and AWS ecosystems.
• Strong technical, analytical, and problem-solving skills to troubleshoot and solve a variety of problems.
• Strong organizational and communication skills, written and verbal, with the ability to handle multiple priorities.
• Prior experience in healthcare.
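For context only, the skills bullet above referencing Glue, Redshift, and S3 might map to work like the following minimal AWS Glue job skeleton; the catalog database, table, and bucket names are hypothetical assumptions, not details from the posting.

```python
# Minimal AWS Glue job skeleton (illustrative only). Catalog database, table,
# and bucket names are hypothetical placeholders.
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table registered in the Glue Data Catalog.
members = glue_context.create_dynamic_frame.from_catalog(
    database="analytics",        # hypothetical catalog database
    table_name="members_raw",    # hypothetical table
)

# Write the frame to S3 as Parquet for downstream Redshift COPY / Spectrum use.
glue_context.write_dynamic_frame.from_options(
    frame=members,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/members/"},
    format="parquet",
)

job.commit()
```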