What are the responsibilities and job description for the Sr Google Cloud Platform Data Engineer with Pyspark & Scala position at Data Capital Inc?
Job Details
Responsibilities: As a Senior Data Engineer, you will:
- Design and develop big data applications using the latest open-source technologies.
- Develop logical and physical data models for big data platforms.
- Automate workflows using Apache Airflow.
- Create data pipelines using Apache Hive, Apache Spark, and Apache Kafka.
- Provide ongoing maintenance and enhancements to existing systems and participate in rotational on-call support.
- Experience building data pipelines in Google Cloud Platform.
- Experience with Google Cloud Platform Dataproc, GCS, and BigQuery.
- 5 years of hands-on experience developing a distributed data processing platform with Hadoop, Hive or Spark, and Airflow or another workflow orchestration solution is required.
- 5 years of hands-on experience modeling and designing schemas for data lakes or RDBMS platforms.
- Experience with programming languages: Python, Java, Scala, etc.
- Experience with scripting languages: Perl, Shell, etc.
- Experience working with, processing, and managing large data sets (multi-TB/PB scale).
- Exposure to test-driven development and automated testing frameworks.
Nice to Have:
- Gitflow
- Atlassian products: Bitbucket, Jira, Confluence, etc.
- Continuous integration tools such as Bamboo, Jenkins, or TFS