What are the responsibilities and job description for the Data Lake Architect [Locals to MI are highly preferred] position at Info Dinamica Inc?
Job Details
Role: Data Lake Architect
Location: Auburn Hills, MI (onsite from Day 1); locals are highly preferred
Job Type: Contract
Job Requirements:
- Minimum of 10 years of experience in advanced technologies, including at least 5 years as a data lake administrator/architect.
- Manage and maintain data lake cluster infrastructure on premises and in the cloud: installation, configuration, performance tuning, and monitoring of Hadoop clusters.
- Minimum of 5 years of work experience with Hadoop ecosystems (Hortonworks HDP or Cloudera CDP).
- Strong working knowledge of Unix/Linux, Windows, cloud platforms (AWS, Google Cloud Platform), Kubernetes, OpenShift, and Docker.
- Must have good exposure to Cloudera Manager, Cloudera Navigator, or a similar cluster management tool.
- Collaborate with and assist developers in the successful implementation of their code; monitor and fine-tune their processes for optimal resource utilization on the cluster; and automate runtime processes (see the PySpark sketch after this list).
- Must have good knowledge of HDFS, Ranger/Sentry, Hive, Impala, Spark, HBase, Kudu, Kafka, Kafka Connect, Schema Registry, NiFi, Sqoop, and other Hadoop-related services.
- Exposure to collaborative data science tools such as Cloudera Data Science Workbench, Cloudera Machine Learning (CML), Anaconda, etc.
- Strong networking concepts: topology, proxies, F5 load balancers, firewalls.
- Strong security concepts: Active Directory, Kerberos, LDAP, SAML, SSL, and data encryption at rest.
- Programming language concepts: Java, Perl, Python, PySpark, and Unix shell scripting.
- Experience in cluster management, including cluster upgrades, migration, and testing.
- Perform periodic cluster updates and keep the stack current.
- Ability to expand clusters by adding new nodes and rebalancing cluster storage.
- Manage application databases, application integration, and users, roles, and permissions within the cluster.
- Collaborate with the OpenShift, Unix, network, database, and security teams on cluster-related matters.
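For context on the resource-utilization and automation duties above, the following is a minimal PySpark sketch of the kind of job tuning the architect would help developers with; all configuration values, the table name, and the output path are illustrative assumptions, not settings prescribed by this role.

```python
# Minimal sketch: right-sizing a PySpark job's resource usage on a YARN cluster.
# All values (executor sizing, shuffle partitions, table/path names) are
# illustrative assumptions, not settings prescribed by this posting.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("example-resource-tuned-job")               # hypothetical job name
    .config("spark.executor.memory", "8g")               # fit executors to NodeManager memory
    .config("spark.executor.cores", "4")                 # avoid oversubscribing CPUs
    .config("spark.dynamicAllocation.enabled", "true")   # release executors when idle
    .config("spark.sql.shuffle.partitions", "200")       # match shuffle parallelism to data size
    .getOrCreate()
)

# Hypothetical workload: read a Hive table, aggregate, and write curated output to HDFS.
df = spark.table("sales.transactions")                   # assumed Hive table
daily = df.groupBy("transaction_date").sum("amount")
daily.write.mode("overwrite").parquet("/data/curated/daily_sales")  # assumed HDFS path

spark.stop()
```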
Technical Experience:
- Solid experience with Cloudera data lake environments, both on premises and in the cloud.
- Solid experience in data lake administration and setup, including data lake security.
- Strong experience architecting and designing solutions for new business needs.
- Thorough understanding of, and hands-on experience with, implementing robust logging and tracing for end-to-end system traceability.
- Familiarity with Cloudera's BDR (Backup and Disaster Recovery) tool to perform and monitor backups of critical data, and the ability to restore data when needed.
- Willing and ready to get hands-on with code development alongside the development team for development and troubleshooting, including quick proofs of concept to explore new solutions and products.
- Experience tuning and optimizing Hadoop environments to keep clusters healthy and available for end users and applications, with maximum cluster uptime as defined in the SLA (see the health-check sketch after this list).
- Deep knowledge of and experience with Hadoop and its ecosystem components, e.g., HDFS, YARN, Hive, MapReduce, Pig, Sqoop, Oozie, Kafka, Spark, Presto, and other Hadoop components.
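For context on the uptime, tuning, and storage-rebalancing duties above, here is a minimal Python sketch of a scripted HDFS health check and rebalance; hdfs dfsadmin and hdfs balancer are standard HDFS command-line tools, while the threshold value and scheduling approach are illustrative assumptions.

```python
# Minimal sketch of a scripted HDFS health check and storage rebalance, of the
# kind an administrator might schedule to keep DataNode utilization even.
# The threshold value is an illustrative assumption; pick it per cluster and SLA.
import subprocess

def hdfs_report() -> str:
    """Return the cluster-wide capacity/usage report from the NameNode."""
    return subprocess.run(
        ["hdfs", "dfsadmin", "-report"],
        check=True, capture_output=True, text=True,
    ).stdout

def rebalance(threshold_pct: int = 10) -> None:
    """Run the HDFS balancer until DataNode utilization is within threshold_pct."""
    subprocess.run(["hdfs", "balancer", "-threshold", str(threshold_pct)], check=True)

if __name__ == "__main__":
    print(hdfs_report())         # review capacity, live/dead DataNodes, under-replicated blocks
    rebalance(threshold_pct=10)  # assumed threshold for even DataNode utilization
```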