What are the responsibilities and job description for the Data Engineer_only on W2 position at Chelsoft Solutions Co.?
Data Engineer
Location: Hybrid in Palo Alto, CA
Type: Contract
Duration: 6 months
Required Skills: SQL; Epic domain knowledge; Python with Spark/PySpark APIs; asynchronous and concurrent programming (async IO is their bread and butter); JSON manipulation; documentation.
What You Will Do
- Build end-to-end data pipelines and infrastructure used by the Data Science team and others at SHC.
- Understand the requirements of data processing and analysis pipelines and make appropriate technical design and interface decisions. Eliciting these requirements will involve training, developing, and validating researcher-built or vendor-provided machine learning algorithms on hospital data, as well as working with other members of the Data Science team.
- Understand data flows among the SHC applications and use this knowledge to make recommendations and design decisions for languages, tools, and platforms used in software and data projects.
- Troubleshoot and debug environment and infrastructure problems found in production and non-production environments for Data Science Team projects.
- Work with other groups at SHC and the Technology and Digital Solutions (TDS) group to ensure server and system maintenance keeps pace with updates, system requirements, data usage, and security requirements.
What You Will Bring
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or equivalent working experience
- 5+ years of experience building data infrastructure for analytics teams, including the ability to write code for processing large datasets in distributed cloud environments
- Experience with SQL, Spark, Python, PySpark
- Strong API skills, including async IO
- Experience manipulating JSON objects
- Experience with cloud deployment strategies and CI/CD
- Experience building and working with data infrastructure in a SaaS environment
- Knowledge of multiple programming languages, commitment to choosing languages based on project-specific requirements, and willingness to learn new programming languages as necessary.
- Knowledge of resource management and automation approaches such as workflow runners.
- Collaborative mentality and excitement for iterative design working closely with the Data Science team.
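As a rough illustration of the async-IO and JSON-manipulation skills the posting calls for, here is a minimal Python sketch. All names are hypothetical (the simulated fetch stands in for an async call to a data source such as an Epic API) and are not part of SHC's actual stack:

```python
import asyncio
import json

async def fetch_record(record_id: int) -> dict:
    # Hypothetical stand-in for an async API call; we simulate I/O
    # latency and return a JSON-like payload.
    await asyncio.sleep(0.01)
    return {"id": record_id, "status": "ok"}

async def gather_records(ids: list[int]) -> str:
    # Issue all requests concurrently instead of one at a time.
    records = await asyncio.gather(*(fetch_record(i) for i in ids))
    # Manipulate the JSON documents: keep only successful records.
    successful = [r for r in records if r["status"] == "ok"]
    return json.dumps(successful)

if __name__ == "__main__":
    print(asyncio.run(gather_records([1, 2, 3])))
```

The same concurrent pattern scales to real HTTP clients (e.g., `aiohttp`) when pulling many records from an upstream service in a pipeline.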