What are the responsibilities and job description for the Data Engineer position at SSi People?
Job Title: Senior Analytics Engineer
Work Location: 50% onsite at Summit West
Duration: 12 Months
Top Skills:
Advanced SQL skills (5 years)
2 years experience working with dbt
5 years working with relational databases
MS in Computer Science, Chemical Engineering, Biostatistics or similar with 6 years industry experience or PhD in Computer Science, Chemical Engineering, Biostatistics or similar with 3 years industry experience
Intermediate Python skills
Software engineering technical skills and working in an Agile environment
Experience building and hosting Python apps
Intermediate visualization (Tableau, dashboarding) experience
Responsibilities
Independently works with stakeholders and ensures cross-functional alignment (including through management)
Performs data engineering, preprocessing, exploratory data analysis, and model development by interacting with a variety of databases
Completes milestones per commitments and is proactive in communicating any delays
Has a team mindset and is flexible with the need to adjust business processes
Able to independently find solutions and follow up with colleagues on dependencies and actions
Has a positive improvement mindset that helps the team grow and become more efficient
Responsible for ingestion, integration and delivery of data across multiple platforms
Works to maintain and uphold data integrity and clean data principles
Responsible for leading team code review and improving team programming practices
Responsible for independently coordinating and managing analytics projects across several departments and with cross-functional stakeholders
Ability to work on a global team and communicate across several time zones
Communicates with team members regularly to provide updates and collaborate on deliverables.
Accountable for leading, documenting and managing analytics URS and UAT through execution for GPO
Leads and engages colleagues who complete data-related activities
Design and deliver digital solutions that streamline access to analytics & data
Works with domain SMEs to derive insight and value that improve manufacturing-related data transformations and improvement initiatives.
Displays a high level of teamwork and collaboration both within and across functions
Utilizes supervised or unsupervised methods, learning from vast amounts of unlabeled data to drive insight
Experience working with unstructured text
Ensures life cycle management of code is maintained through version control and associated repositories.
Develops high-quality analytical and statistical models, insights, patterns, and visualizations that can be used to improve decision making in manufacturing operations.
Responsible for documentation of all technical work both within and outside of formal document management systems
Independently develops code and analytical models to automate data transformation and analysis
Experience developing applications using Python and JavaScript
Experience with REST APIs
Requirements:
MS in Computer Science, Chemical Engineering, Biostatistics or similar with 3-6 years industry experience or PhD in Computer Science, Chemical Engineering, Biostatistics or similar with 3 years industry experience
Dashboard development experience (Tableau, Spotfire, Dash)
Proficient in writing and developing analytical and machine learning models using Python modules including pandas, NumPy, scikit-learn, and TensorFlow. Experience developing and implementing MLOps pipelines.
Experience building analytical and statistical models to answer key business questions
Experience using git via the command line
Strong understanding of core statistical concepts to solve real world problems
Intermediate to advanced proficiency in SQL (3 years post-academia experience as an independent contributor designing and delivering data solutions).
Experience interacting with various data warehouses and large-scale, complex datasets using ETL and BI tools and platforms.
Self-motivated to identify and propose novel methodologies that will drive increased efficiency
Demonstrated expert knowledge in machine learning and rule-based systems as applied to computational linguistics and natural language processing, as well as in developing and executing annotation tasks with teams of experts
Proficiency in mathematics with the skill to translate complex mathematical algorithms into usable computational methods
Experience with data mining and analysis techniques across disparate data sources
Experience working in Linux/UNIX environments
Experience interacting with PostgreSQL, Oracle, Cloudera Impala, Okera or similar databases
Experience with JupyterLab, Anaconda, and RStudio
Intermediate proficiency with Python
Experience developing visualizations using a variety of methods (Plotly, Matplotlib, seaborn)
Experience working within Domino Data Lab projects
Technical knowledge of performance tuning and query optimization across large data sets.
Experience with data cataloguing and enablement through APIs
Experience with a variety of computer science languages (C, Java, HTML/CSS)
Exposure to bioprocess engineering/cell therapy data
Knowledge of GxP requirements (preferably related to data and code management)
Experience with program/project management. Scrum experience highly desired
Preferred:
Familiar with .NET/SAP
Knowledge of deep learning methods for NLP (quantitative area of study, Computer Science, preferred)
Strong background and demonstrable experience in Natural Language Processing and Computational Linguistics is required
Experience working with the pharmaceutical industry
Experience working with ERP systems