What are the responsibilities and job description for the Big Data Developer position at Donato Technologies, Inc.?
Job Description
Donato Technologies, established in 2012, excels as a comprehensive IT service provider renowned for delivering an exceptional staffing experience and prioritizing the needs of both clients and employees. We specialize in staffing, consulting, software development, and training, catering to small and medium-sized enterprises. While our core strength lies in Information Technology, we also deeply understand and address the unique business requirements of our clients, leveraging IT to effectively meet those needs. Our commitment is to provide high-quality, customized solutions using the optimal combination of technologies.
Job Summary
Responsibilities:
- Develop and maintain data platforms using Python, Spark, and PySpark.
- Handle migration to PySpark on AWS.
- Design and implement data pipelines (a minimal PySpark sketch follows this list).
- Work with AWS services and Big Data technologies.
- Produce unit tests for Spark transformations and helper methods (a sample test follows this list).
- Create Scala/Spark jobs for data transformation and aggregation.
- Write Scaladoc-style documentation for code.
- Optimize Spark queries for performance.
- Integrate with SQL databases (e.g., Microsoft SQL Server, Oracle, PostgreSQL, MySQL); a JDBC read sketch follows this list.
- Understand distributed systems concepts (CAP theorem, partitioning, replication, consistency, and consensus).
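To illustrate the pipeline and aggregation work described above, here is a minimal PySpark sketch. The application name, column names (`order_id`, `region`, `amount`), and S3 paths are hypothetical, not part of the actual role.

```python
from pyspark.sql import SparkSession, functions as F

# All names and paths below are hypothetical, for illustration only.
spark = (
    SparkSession.builder
    .appName("orders-aggregation")
    .getOrCreate()
)

def total_by_region(orders_df):
    """Sum positive order amounts per region -- a typical Spark transformation."""
    return (
        orders_df
        .filter(F.col("amount") > 0)  # drop refunds and invalid rows
        .groupBy("region")
        .agg(F.sum("amount").alias("total_amount"))
    )

# Read from and write back to hypothetical S3 locations.
orders = spark.read.parquet("s3a://example-bucket/orders/")
total_by_region(orders).write.mode("overwrite").parquet("s3a://example-bucket/totals/")
```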
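The unit-testing responsibility could look like this pytest-style sketch, which exercises `total_by_region` from the previous sketch against a local SparkSession; the module name `orders_job` is an assumption.

```python
import pytest
from pyspark.sql import SparkSession

from orders_job import total_by_region  # hypothetical module holding the sketch above

@pytest.fixture(scope="session")
def spark():
    # Local session so the test runs without a cluster.
    return SparkSession.builder.master("local[2]").appName("tests").getOrCreate()

def test_total_by_region(spark):
    rows = [("r1", "EU", 10.0), ("r2", "EU", 5.0), ("r3", "US", -1.0)]
    df = spark.createDataFrame(rows, ["order_id", "region", "amount"])
    result = {r["region"]: r["total_amount"] for r in total_by_region(df).collect()}
    # The negative US amount is filtered out, so only EU aggregates remain.
    assert result == {"EU": 15.0}
```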
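For the SQL-integration item, a JDBC read in PySpark might look like the following; the PostgreSQL URL, table, and credentials are placeholders.

```python
# Assumes the `spark` session from the first sketch; connection details are
# placeholders, and the PostgreSQL JDBC driver must be on Spark's classpath.
customers = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db.example.com:5432/sales")
    .option("dbtable", "public.customers")
    .option("user", "etl_user")
    .option("password", "...")  # placeholder; use a secrets manager in practice
    .option("driver", "org.postgresql.Driver")
    .load()
)
customers.show(5)
```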
Qualifications:
- Proficiency in Python, Scala (with a focus on functional programming), and Spark.
- Familiarity with Spark APIs, including RDD, DataFrame, MLlib, GraphX, and Streaming.
- Experience working with HDFS, S3, Cassandra, and/or DynamoDB.
- Deep understanding of distributed systems.
- Experience with building or maintaining cloud-native applications.
- Familiarity with serverless approaches using AWS Lambda is a plus (a minimal handler sketch follows).
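To give a rough sense of the serverless item, a minimal AWS Lambda handler in Python might look like this; the event shape is hypothetical.

```python
import json

def handler(event, context):
    # Hypothetical event shape: a batch of records to process.
    records = event.get("records", [])
    return {
        "statusCode": 200,
        "body": json.dumps({"processed": len(records)}),
    }
```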
PRIMARY COMPETENCY: Big Data Technologies
PRIMARY SKILL: Apache Spark
PRIMARY SKILL PERCENTAGE: 80
SECONDARY COMPETENCY: Python
SECONDARY SKILL: Python
Please share your resumes at resumes@donatotech.net